dynamo.pp.normalize_cells

dynamo.pp.normalize_cells(adata, layers='all', total_szfactor='total_Size_Factor', splicing_total_layers=False, X_total_layers=False, norm_method=None, pseudo_expr=1, relative_expr=True, keep_filtered=True, recalc_sz=False, sz_method='median', scale_to=None, skip_log=False)

Normalize the gene expression values of the AnnData object.

This function is partly based on the Monocle R package (https://github.com/cole-trapnell-lab/monocle3).

Parameters:
  • adata (AnnData) – an AnnData object.

  • layers (str) – the layer(s) to be normalized. Defaults to “all”, which covers RNA (X, raw) as well as spliced, unspliced, protein, etc.

  • total_szfactor (str) – the column name in the .obs attribute that corresponds to the size factor for the total mRNA. Defaults to “total_Size_Factor”.

  • splicing_total_layers (bool) – whether to also normalize spliced / unspliced layers by size factor from total RNA. Defaults to False.

  • X_total_layers (bool) – whether to also normalize adata.X by size factor from total RNA. Defaults to False.

  • norm_method (Optional[Callable]) – the method used to normalize data. Can be a function such as np.log1p or np.log2, any other function, or the string clr. By default, only .X is size normalized and log1p transformed, while data in other layers is only size normalized. Defaults to None.

  • pseudo_expr (int) – a pseudocount added to the gene expression value before log/log2 normalization. Defaults to 1.

  • relative_expr (bool) – whether to divide gene expression values by the size factor before normalization. Defaults to True.

  • keep_filtered (bool) – whether to keep genes that did not pass filtering in the adata object rather than storing only feature genes. If False, the size factor will be recalculated only for the selected feature genes. Defaults to True.

  • recalc_sz (bool) – whether to recalculate the size factor based on the selected genes before normalization. Defaults to False.

  • sz_method (Literal['mean-geometric-mean-total', 'geometric', 'median']) – the method used to calculate the expected total reads / UMI for the size factor calculation. Only mean-geometric-mean-total, geometric, and median are supported. When median is used, locfunc will be replaced with np.nanmedian. Defaults to “median”.

  • scale_to (Optional[float]) – the final total expression that each cell will be scaled to. Defaults to None.

  • skip_log (bool) – whether to skip the log transformation. Defaults to False.
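
To make the interplay of relative_expr, pseudo_expr and skip_log concrete, here is a minimal sketch of size-factor normalization written with the standard library only. It is not dynamo's implementation: the helper name sketch_normalize, the list-of-lists input, and the definition of the size factor as each cell's total divided by the median total are assumptions chosen for illustration.

```python
import math
from statistics import median


def sketch_normalize(counts, pseudo_expr=1, relative_expr=True, skip_log=False):
    """Illustrative only (hypothetical helper, not part of dynamo).

    counts: list of per-cell lists of raw counts (cells x genes).
    Assumes every cell has a non-zero total.
    """
    # Total counts per cell.
    totals = [sum(cell) for cell in counts]
    # Assumed "expected total": the median of the per-cell totals.
    expected_total = median(totals)
    # Assumed size factor: cell total relative to the expected total.
    size_factors = [t / expected_total for t in totals]

    normalized = []
    for cell, sf in zip(counts, size_factors):
        # relative_expr: divide by the size factor before any transformation.
        row = [x / sf for x in cell] if relative_expr else list(cell)
        # Unless skip_log is set, add the pseudocount and take the natural log
        # (equivalent to log1p when pseudo_expr is 1).
        if not skip_log:
            row = [math.log(x + pseudo_expr) for x in row]
        normalized.append(row)
    return size_factors, normalized


counts = [[10, 0, 30], [5, 5, 10], [40, 20, 20]]
size_factors, normalized = sketch_normalize(counts)
```

In this toy input the cell totals are 40, 20 and 80, so the assumed size factors are 1.0, 0.5 and 2.0. With skip_log=True every cell ends up with the same total (40) after division by its size factor.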

Return type:

AnnData

Returns:

An updated AnnData object with normalized expression values for the different layers.
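
A call might look like the sketch below, which spells out several keyword arguments from the signature above with their documented defaults. The wrapper name normalize_with_defaults is hypothetical, and the sketch assumes that dynamo is installed and that adata already carries the size factor column named by total_szfactor in .obs (or that recalc_sz is set to True instead).

```python
def normalize_with_defaults(adata):
    """Hypothetical wrapper: call dynamo.pp.normalize_cells with explicit defaults."""
    # Deferred import so that defining this function does not require dynamo.
    import dynamo as dyn

    return dyn.pp.normalize_cells(
        adata,
        layers="all",                        # normalize every available layer
        total_szfactor="total_Size_Factor",  # .obs column holding the total size factor
        norm_method=None,                    # documented default behavior
        pseudo_expr=1,                       # pseudocount added before the log
        relative_expr=True,                  # divide by the size factor first
        recalc_sz=False,                     # reuse existing size factors
        sz_method="median",
        skip_log=False,
    )
```

The returned object is the updated AnnData described above.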