dynamo.pp.normalize

dynamo.pp.normalize(adata, layers='all', total_szfactor=None, splicing_total_layers=False, X_total_layers=False, keep_filtered=True, chunk_size=None, recalc_sz=False, sz_method='median', scale_to=None, transform_int_to_float=True)

Normalize the gene expression values of the AnnData object.

This function is partly based on the Monocle R package (https://github.com/cole-trapnell-lab/monocle3).

Parameters:
  • adata (AnnData) – an AnnData object.

  • layers (str) – the layer(s) to be normalized. Defaults to 'all', which covers RNA (X, raw) as well as spliced, unspliced, protein, etc.

  • total_szfactor (Optional[str]) – the name of the column in the .obs attribute that holds the size factor for the total mRNA. The signature default is None; “total_Size_Factor” is the column name given for this size factor.

  • splicing_total_layers (bool) – whether to also normalize the spliced / unspliced layers by the size factor from total RNA. Defaults to False.

  • X_total_layers (bool) – whether to also normalize adata.X by the size factor from total RNA. Defaults to False.

  • keep_filtered (bool) – whether to keep genes that did not pass filtering in the adata object. If False, only the selected feature genes are stored and the size factor is recalculated for those genes only. Defaults to True.

  • chunk_size (Optional[int]) – the number of cells to be processed at a time. Defaults to None.

  • recalc_sz (bool) – whether to recalculate the size factor from the selected genes before normalization. Defaults to False.

  • sz_method (Literal['mean-geometric-mean-total', 'geometric', 'median']) – the method used to calculate the expected total reads / UMI for the size factor calculation. Supported values are mean-geometric-mean-total / geometric and median. With mean-geometric-mean-total, size factors are calculated from the geometric mean using the given mean function. With median, locfunc is replaced with np.nanmedian. With mean, locfunc is replaced with np.nanmean; note that mean is not among the values listed in the type annotation. Defaults to “median”.

  • scale_to (Optional[float]) – the total expression that each cell is scaled to after normalization. Defaults to None.

  • transform_int_to_float (bool) – whether to convert adata.X from int to float32 before normalization. Defaults to True.
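The size-factor idea behind sz_method, scale_to and transform_int_to_float can be illustrated with a minimal NumPy sketch. This is not dynamo's implementation: the helper names size_factors and normalize_counts are hypothetical, the input is a plain dense array rather than an AnnData layer, and the treatment of scale_to (replacing the location estimate of the totals) is an assumption made for illustration.

```python
import numpy as np


def size_factors(counts, locfunc=np.nanmedian):
    """Per-cell size factor: the cell's total count divided by a location
    estimate (median by default) of the totals across all cells."""
    totals = counts.sum(axis=1).astype(np.float64)
    return totals / locfunc(totals)


def normalize_counts(counts, locfunc=np.nanmedian, scale_to=None):
    """Divide each cell (row) by its size factor.

    Assumption for this sketch: when scale_to is given, it replaces the
    location estimate, so every cell ends up with a total of scale_to.
    Cells with a zero total are not handled here.
    """
    counts = counts.astype(np.float32)  # int -> float32 before dividing
    totals = counts.sum(axis=1).astype(np.float64)
    reference = locfunc(totals) if scale_to is None else scale_to
    sf = totals / reference
    return (counts / sf[:, None]).astype(np.float32)


# Toy matrix: 3 cells x 2 genes with totals 4, 8 and 24.
counts = np.array([[1, 3], [2, 6], [6, 18]])
sf = size_factors(counts)                        # median total is 8
normalized = normalize_counts(counts)            # every row sums to 8
scaled = normalize_counts(counts, scale_to=100)  # every row sums to 100
```

Passing locfunc=np.nanmean instead of the default swaps the median for the mean of the totals, which mirrors the locfunc substitution described for sz_method.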

Return type:

AnnData

Returns:

The AnnData object, updated with normalized expression values for the requested layers.
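The chunk_size parameter limits how many cells are processed at a time. A rough sketch of that pattern, under the assumption that normalization is a row-wise division by a per-cell size factor, is shown here. The helpers iter_chunks and divide_by_size_factor are hypothetical names for illustration and are not part of dynamo.

```python
import numpy as np


def iter_chunks(n_cells, chunk_size=None):
    """Yield (start, end) row ranges; one range covering all cells
    when chunk_size is None."""
    if n_cells == 0:
        return
    step = n_cells if chunk_size is None else chunk_size
    for start in range(0, n_cells, step):
        yield start, min(start + step, n_cells)


def divide_by_size_factor(counts, size_factor, chunk_size=None):
    """Divide each row by its size factor, one block of rows at a time,
    writing the result into a float32 output array."""
    out = np.empty(counts.shape, dtype=np.float32)
    for start, end in iter_chunks(counts.shape[0], chunk_size):
        out[start:end] = counts[start:end] / size_factor[start:end, None]
    return out


# Toy data: 10 cells x 4 genes with strictly positive counts.
rng = np.random.default_rng(0)
counts = rng.integers(1, 50, size=(10, 4))
totals = counts.sum(axis=1)
size_factor = totals / np.median(totals)

full = divide_by_size_factor(counts, size_factor)
chunked = divide_by_size_factor(counts, size_factor, chunk_size=3)
```

Because each row depends only on its own size factor, the chunked result matches the single-pass result; chunking only bounds the size of the intermediate arrays.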