dynamo.tl.top_n_markers

dynamo.tl.top_n_markers(adata, with_moran_i=False, group_by='test_group', sort_by='specificity', sort_order='decreasing', top_n_genes=5, exp_frac_thresh=0.1, log2_fc_thresh=None, qval_thresh=0.05, specificity_thresh=0.3, only_gene_list=False, display=True)

Filter cluster DEG (differentially expressed gene) results, optionally together with Moran’s I test results, and retrieve the top markers for each cluster.

Parameters:
  • adata (AnnData) – An AnnData object.

  • with_moran_i (bool) – Whether to include Moran’s I test results for selecting top marker genes. Defaults to False.

  • group_by (str) – Column name to group by. Defaults to “test_group”.

  • sort_by (Union[str, List[str]]) – Column name or names to sort by. Defaults to “specificity”.

  • sort_order (Literal['increasing', 'decreasing']) – Whether to sort the data frame in increasing or decreasing order. Defaults to “decreasing”.

  • top_n_genes (int) – The number of top-ranked markers to keep for each group after sorting. Defaults to 5.

  • exp_frac_thresh (float) – The minimum fraction of cells that must express a gene for it to be considered in the selection of top markers. Defaults to 0.1.

  • log2_fc_thresh (Optional[float]) – The minimum log2 fold change for a gene to be considered in the selection of top markers. Applies only to DEGs that are not based on velocity, acceleration, or curvature layers. Defaults to None.

  • qval_thresh (float) – The maximum q-value for a gene to be considered a top marker. Defaults to 0.05.

  • specificity_thresh (float) – The minimum specificity for a gene to be considered a top marker. Defaults to 0.3.

  • only_gene_list (bool) – Whether to only return the gene list for each cluster. Defaults to False.

  • display (bool) – Whether to print the data frame for the top marker genes after the filtering. Defaults to True.

Raises:

ValueError – The thresholds are so extreme that no genes pass the filter.

Return type:

Union[List[List[str]], DataFrame]

Returns:

If only_gene_list is False, a data frame that stores the top markers for each group is returned. Otherwise, a list containing one list of genes per cluster is returned. In addition, the data frame is printed if display is True.
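
The filter, sort, and top-n selection described by the parameters above can be illustrated with a small dependency-free sketch. This is not dynamo's implementation: it works on a plain list of dicts rather than an AnnData object, the row keys `gene`, `exp_frac`, `qval`, `specificity`, and `log2_fc` are hypothetical names chosen for the example, and treating the thresholds as inclusive is an assumption.

```python
from collections import defaultdict


def top_n_markers_sketch(rows, group_by="test_group", sort_by="specificity",
                         sort_order="decreasing", top_n_genes=5,
                         exp_frac_thresh=0.1, log2_fc_thresh=None,
                         qval_thresh=0.05, specificity_thresh=0.3,
                         only_gene_list=False):
    """Illustrative filter -> sort -> top-n selection over a list of dict rows."""
    # Keep rows that pass every threshold (inclusive bounds are an assumption).
    # The log2 fold-change filter is skipped when its threshold is None.
    kept = [
        r for r in rows
        if r["exp_frac"] >= exp_frac_thresh
        and r["qval"] <= qval_thresh
        and r["specificity"] >= specificity_thresh
        and (log2_fc_thresh is None or r["log2_fc"] >= log2_fc_thresh)
    ]
    if not kept:
        raise ValueError("Thresholds are too extreme: no genes passed the filter.")

    # Group the surviving rows by the grouping column.
    grouped = defaultdict(list)
    for r in kept:
        grouped[r[group_by]].append(r)

    # Sort within each group and keep the first top_n_genes rows.
    reverse = sort_order == "decreasing"
    top = {
        g: sorted(rs, key=lambda r: r[sort_by], reverse=reverse)[:top_n_genes]
        for g, rs in grouped.items()
    }

    if only_gene_list:
        # One list of gene names per group.
        return [[r["gene"] for r in rs] for rs in top.values()]
    # Otherwise a flat table (list of rows) of the top markers.
    return [r for rs in top.values() for r in rs]


# Toy input: gene C fails exp_frac_thresh, gene D fails qval_thresh.
rows = [
    {"gene": "A", "test_group": "c1", "exp_frac": 0.8, "qval": 0.001, "specificity": 0.9, "log2_fc": 2.0},
    {"gene": "B", "test_group": "c1", "exp_frac": 0.5, "qval": 0.01, "specificity": 0.6, "log2_fc": 1.0},
    {"gene": "C", "test_group": "c1", "exp_frac": 0.05, "qval": 0.001, "specificity": 0.95, "log2_fc": 3.0},
    {"gene": "D", "test_group": "c2", "exp_frac": 0.4, "qval": 0.2, "specificity": 0.7, "log2_fc": 1.5},
    {"gene": "E", "test_group": "c2", "exp_frac": 0.6, "qval": 0.02, "specificity": 0.5, "log2_fc": 0.5},
]

gene_lists = top_n_markers_sketch(rows, top_n_genes=1, only_gene_list=True)
table = top_n_markers_sketch(rows)
```

With the toy rows, the gene-list form holds one inner list per group, while the table form keeps the full rows for every marker that survived filtering. Raising any threshold far enough that no row survives triggers the ValueError, mirroring the documented behavior.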