dynamo.pp.filter_cells_by_outliers
- dynamo.pp.filter_cells_by_outliers(adata, filter_bool=None, layer='all', keep_filtered=False, min_expr_genes_s=50, min_expr_genes_u=25, min_expr_genes_p=1, max_expr_genes_s=inf, max_expr_genes_u=inf, max_expr_genes_p=inf, max_pmito_s=None, shared_count=None, spliced_key='spliced', unspliced_key='unspliced', protein_key='protein', obs_store_key='pass_basic_filter')
Select valid cells using a collection of filters, including minimum and maximum numbers of expressed genes in the spliced, unspliced and protein layers.
- Parameters:
adata (AnnData) – an AnnData object.
filter_bool (Optional[ndarray]) – a boolean array from the user to select cells for downstream analysis. Defaults to None.
layer (str) – the layer (including X) used for feature selection. Defaults to “all”.
keep_filtered (bool) – whether to keep cells that don’t pass the filtering in the adata object. Defaults to False.
min_expr_genes_s (int) – minimal number of expressed genes for a cell in the spliced layer (also used for X). Defaults to 50.
min_expr_genes_u (int) – minimal number of expressed genes for a cell in the unspliced layer. Defaults to 25.
min_expr_genes_p (int) – minimal number of expressed genes for a cell in the protein layer. Defaults to 1.
max_expr_genes_s (float) – maximal number of expressed genes for a cell in the spliced layer (also used for X). Defaults to np.inf.
max_expr_genes_u (float) – maximal number of expressed genes for a cell in the unspliced layer. Defaults to np.inf.
max_expr_genes_p (float) – maximal number of expressed proteins for a cell in the protein layer. Defaults to np.inf.
max_pmito_s (Optional[float]) – maximal percentage of mitochondrial genes for a cell in the spliced layer. Defaults to None.
shared_count (Optional[int]) – the minimal shared number of counts for each cell across genes between layers. Defaults to None.
spliced_key – name of the layer storing spliced data. Defaults to “spliced”.
unspliced_key – name of the layer storing unspliced data. Defaults to “unspliced”.
protein_key – name of the layer storing protein data. Defaults to “protein”.
obs_store_key – name of the key in adata.obs under which the filtering result is stored. Defaults to “pass_basic_filter”.
- Raises:
ValueError – the layer provided is invalid.
- Return type:
AnnData
- Returns:
An updated AnnData object indicating the selection of cells for downstream analysis. If keep_filtered is False, adata is subset to only the cells that pass the filtering.
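The threshold parameters and the keep_filtered flag described above can be illustrated with a minimal plain-Python sketch. This is not dynamo's implementation: the helper names (count_expressed, basic_cell_filter, select_cells) are hypothetical, and the sketch assumes that a gene counts as "expressed" when its count is greater than zero and that the min/max bounds are inclusive.

```python
import math


def count_expressed(row):
    """Number of genes with a non-zero count in one cell (assumed meaning of 'expressed')."""
    return sum(1 for value in row if value > 0)


def basic_cell_filter(layers, min_genes, max_genes=None):
    """Return one boolean per cell: True if the cell passes every layer's thresholds.

    layers: dict mapping layer name -> list of per-cell count rows
    min_genes / max_genes: dicts mapping layer name -> threshold (assumed inclusive)
    """
    max_genes = max_genes or {}
    n_cells = len(next(iter(layers.values())))
    passed = [True] * n_cells
    for name, rows in layers.items():
        lo = min_genes.get(name, 0)
        hi = max_genes.get(name, math.inf)
        for i, row in enumerate(rows):
            n = count_expressed(row)
            passed[i] = passed[i] and (lo <= n <= hi)
    return passed


def select_cells(cell_ids, passed, keep_filtered=False):
    """Mimic keep_filtered: keep every cell with its flag, or only the passing cells."""
    flagged = list(zip(cell_ids, passed))
    if keep_filtered:
        return flagged
    return [(cid, ok) for cid, ok in flagged if ok]


# Toy data: 3 cells x 4 genes in two layers.
layers = {
    "spliced": [[3, 0, 1, 2], [0, 0, 1, 0], [5, 2, 2, 1]],
    "unspliced": [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
}
passed = basic_cell_filter(layers, min_genes={"spliced": 2, "unspliced": 1})
print(passed)                                      # [True, False, False]
print(select_cells(["c0", "c1", "c2"], passed))    # [('c0', True)]
```

In the sketch, a cell must pass the thresholds of every layer to be retained; with keep_filtered=True all cells are returned together with their pass/fail flag, which parallels storing the flag under obs_store_key while leaving adata unsubset.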