dynamo.pp.select_genes_monocle

dynamo.pp.select_genes_monocle(adata, layer='X', keep_filtered=True, n_top_genes=2000, sort_by='cv_dispersion', exprs_frac_for_gene_exclusion=1, genes_to_exclude=None, SVRs_kwargs={})

Select feature genes following the Monocle recipe.

This version exists to modularize preprocessing, so that users can combine different preprocessing procedures in Preprocessor.

Parameters:

adata (AnnData) – An AnnData object.
layer (str) – The layer (including X) whose data is used for feature selection. Defaults to “X”.
keep_filtered (bool) – Whether to keep genes that do not pass the filtering in the adata object. Defaults to True.
n_top_genes (int) – The number of top-ranked genes, according to the scoring method specified by sort_by, to select as feature genes. Defaults to 2000.
sort_by (Literal['gini', 'cv_dispersion', 'fano_dispersion']) – The scoring method used to rank genes. Should be one of 'gini' (Gini index), 'cv_dispersion' (dispersion of the coefficient of variation), or 'fano_dispersion' (Fano dispersion). Defaults to 'cv_dispersion'.
exprs_frac_for_gene_exclusion (float) – The fraction threshold used to identify high-fraction genes. Defaults to 1.
genes_to_exclude (Optional[List[str]]) – Genes to exclude from evaluation. Defaults to None.
SVRs_kwargs (dict) – Keyword arguments for SVRs. Defaults to {}.
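
As a rough illustration of how n_top_genes and genes_to_exclude interact, the sketch below ranks genes by a score and keeps the highest-scoring ones. It is not dynamo's implementation: the helper names (fano_factor, select_top_genes), the toy expression table, and the plain variance-to-mean score are assumptions made for this example and stand in for the scores named by sort_by.

```python
from statistics import mean, pvariance


def fano_factor(values):
    """Variance-to-mean ratio of one gene; 0.0 when the mean is zero."""
    m = mean(values)
    return pvariance(values) / m if m > 0 else 0.0


def select_top_genes(expr, n_top_genes, genes_to_exclude=None):
    """Hypothetical helper: rank genes by score and keep the top n.

    expr maps gene name -> list of expression values across cells.
    Genes listed in genes_to_exclude are dropped before ranking.
    """
    excluded = set(genes_to_exclude or [])
    scores = {
        gene: fano_factor(values)
        for gene, values in expr.items()
        if gene not in excluded
    }
    # Highest score first, then truncate to the requested number of genes.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_top_genes]


# Toy data (assumed for illustration): four genes measured in four cells.
expr = {
    "geneA": [0, 0, 10, 0],
    "geneB": [5, 5, 5, 5],
    "geneC": [1, 3, 1, 3],
    "geneD": [0, 8, 0, 8],
}

selected = select_top_genes(expr, n_top_genes=2, genes_to_exclude=["geneD"])
```

Excluding a gene before ranking, rather than after, means that it does not take up one of the n_top_genes slots.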

Raises:

NotImplementedError – Raised when the sort_by value is invalid or unsupported.
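
The error condition can be pictured with a small validation sketch. The helper check_sort_by and the constant SUPPORTED_SORT_BY are hypothetical names introduced here; only the three accepted values and the exception type come from the documentation above, and the actual check inside dynamo may be structured differently.

```python
# Accepted values, taken from the documented Literal type of sort_by.
SUPPORTED_SORT_BY = ("gini", "cv_dispersion", "fano_dispersion")


def check_sort_by(sort_by):
    """Hypothetical validator: return sort_by if supported, else raise."""
    if sort_by not in SUPPORTED_SORT_BY:
        raise NotImplementedError(
            f"sort_by={sort_by!r} is not supported; "
            f"choose one of {SUPPORTED_SORT_BY}"
        )
    return sort_by


# A caller can catch the error and fall back or report it.
try:
    check_sort_by("variance")
    error_message = None
except NotImplementedError as err:
    error_message = str(err)
```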