dynamo.pp.filter_cells

dynamo.pp.filter_cells(adata, filter_bool=None, layer='all', keep_filtered=False, min_expr_genes_s=50, min_expr_genes_u=25, min_expr_genes_p=1, max_expr_genes_s=inf, max_expr_genes_u=inf, max_expr_genes_p=inf, shared_count=None, spliced_key='spliced', unspliced_key='unspliced', protein_key='protein', obs_store_key='pass_basic_filter')

Select valid cells based on a collection of filters, including minimum and maximum numbers of expressed genes in the spliced, unspliced and protein layers.

Parameters
  • adata (AnnData) – an AnnData object.

  • filter_bool (Optional[ndarray]) – a boolean array from the user to select cells for downstream analysis. Defaults to None.

  • layer (str) – the layer (including X) used for feature selection. Defaults to “all”.

  • keep_filtered (bool) – whether to keep cells that don’t pass the filtering in the adata object. Defaults to False.

  • min_expr_genes_s (int) – minimal number of genes with expression for a cell in the data from the spliced layer (also used for X). Defaults to 50.

  • min_expr_genes_u (int) – minimal number of genes with expression for a cell in the data from the unspliced layer. Defaults to 25.

  • min_expr_genes_p (int) – minimal number of genes with expression for a cell in the data from the protein layer. Defaults to 1.

  • max_expr_genes_s (float) – maximal number of genes with expression for a cell in the data from the spliced layer (also used for X). Defaults to np.inf.

  • max_expr_genes_u (float) – maximal number of genes with expression for a cell in the data from the unspliced layer. Defaults to np.inf.

  • max_expr_genes_p (float) – maximal number of proteins with expression for a cell in the data from the protein layer. Defaults to np.inf.

  • shared_count (Optional[int]) – the minimal shared number of counts for each cell across genes between layers. Defaults to None.

  • spliced_key – name of the layer storing spliced data. Defaults to “spliced”.

  • unspliced_key – name of the layer storing unspliced data. Defaults to “unspliced”.

  • protein_key – name of the layer storing protein data. Defaults to “protein”.

  • obs_store_key – name of the key in adata.obs under which the cell filtering result is stored. Defaults to “pass_basic_filter”.
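
The per-layer thresholds above can be illustrated with a minimal NumPy sketch. It is not dynamo's implementation: the helper `cell_filter_mask` is hypothetical, it assumes that a gene "with expression" is one with a count greater than zero, and it leaves out `shared_count`.

```python
import numpy as np


def cell_filter_mask(layers, min_genes, max_genes, filter_bool=None):
    """Hypothetical helper: boolean mask of cells passing every per-layer bound.

    layers: dict mapping a layer name to a (cells x genes) count array.
    min_genes / max_genes: dicts mapping a layer name to its bound.
    filter_bool: optional user-supplied boolean array, combined with logical AND.
    """
    n_cells = next(iter(layers.values())).shape[0]
    mask = np.ones(n_cells, dtype=bool)
    for name, counts in layers.items():
        # Assumption: "with expression" means a count greater than zero.
        n_expressed = (np.asarray(counts) > 0).sum(axis=1)
        mask &= (n_expressed >= min_genes[name]) & (n_expressed <= max_genes[name])
    if filter_bool is not None:
        mask &= np.asarray(filter_bool, dtype=bool)
    return mask


# Toy data: 3 cells x 4 genes per layer.
spliced = np.array([[1, 0, 2, 0], [0, 0, 0, 3], [4, 1, 1, 1]])
unspliced = np.array([[0, 1, 1, 0], [0, 0, 0, 0], [2, 0, 1, 0]])

mask = cell_filter_mask(
    {"spliced": spliced, "unspliced": unspliced},
    min_genes={"spliced": 2, "unspliced": 1},
    max_genes={"spliced": np.inf, "unspliced": np.inf},
)
print(mask)
```

In this sketch a cell must satisfy the bounds of every layer that is checked, and a user-supplied `filter_bool` can only narrow the selection further.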

Raises

ValueError – the layer provided is invalid.

Return type

AnnData

Returns

An updated AnnData object indicating the selection of cells for downstream analysis. If keep_filtered is False, adata is subset to only the cells that pass filtering.
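
The effect of `keep_filtered` on the returned object can be sketched with plain NumPy stand-ins for the count matrix and the per-cell annotations. The function `apply_cell_filter` is hypothetical, and storing the mask under `obs_store_key` is an assumption inferred from the parameter name; the real function operates on an AnnData object.

```python
import numpy as np


def apply_cell_filter(counts, obs, mask, keep_filtered=False,
                      obs_store_key="pass_basic_filter"):
    """Hypothetical stand-in: record the mask, then optionally drop failing cells.

    counts: (cells x genes) array; obs: dict of per-cell annotation arrays.
    """
    mask = np.asarray(mask, dtype=bool)
    obs = {key: np.asarray(values) for key, values in obs.items()}
    # Assumption: the pass/fail flag is stored as a per-cell annotation.
    obs[obs_store_key] = mask
    if keep_filtered:
        # All cells are kept; the stored flag marks which ones passed.
        return counts, obs
    # Only cells that passed the filter remain.
    return counts[mask], {key: values[mask] for key, values in obs.items()}


counts = np.array([[1, 0], [0, 0], [3, 2]])
obs = {"cell_id": ["c0", "c1", "c2"]}
passed = [True, False, True]

all_counts, all_obs = apply_cell_filter(counts, obs, passed, keep_filtered=True)
sub_counts, sub_obs = apply_cell_filter(counts, obs, passed, keep_filtered=False)
print(all_counts.shape, sub_counts.shape)
```

With `keep_filtered=True` the shape is unchanged and the flag alone distinguishes passing cells, which is convenient when the filtered cells are still needed for inspection or plotting.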