dynamo.tl.cell_velocities

dynamo.tl.cell_velocities(adata, ekey=None, vkey=None, X=None, V=None, X_embedding=None, use_mnn=False, n_pca_components=None, transition_genes=None, min_r2=0.01, min_alpha=0.01, min_gamma=0.01, min_delta=0.01, basis='umap', neigh_key='neighbors', adj_key='distances', n_neighbors=30, method='pearson', neg_cells_trick=True, calc_rnd_vel=False, xy_grid_nums=(50, 50), correct_density=True, scale=True, sample_fraction=None, random_seed=19491001, other_kernels_dict={}, enforce=False, key=None, preserve_len=False, **kmc_kwargs)

Project high dimensional velocity vectors onto given low dimensional embeddings, and/or compute cell transition probabilities.

When method=“kmc”, the Itô kernel is used, which considers not only the correlation between a cell's velocity vector and the vectors pointing from that cell to its nearest neighbors, but also the corresponding distances. We expect this new kernel to enable visualization of more intricate vector flows or steady states in low dimension, and to improve the calculation of the stationary distribution or source states of sampled cells. The original “correlation/cosine” velocity projection method is also supported. Kernels based on the reconstructed velocity field are also possible.
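The correlation/cosine projection mentioned above can be illustrated with a minimal, pure-Python sketch for a single cell. It is written in the spirit of the method from La Manno et al. 2018 and is not dynamo's implementation: the helper names (cosine, project_velocity) and the kernel width sigma are hypothetical and are not dynamo parameters.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def project_velocity(x_i, v_i, emb_i, neighbors, sigma=0.1):
    """Project the velocity of one cell onto a low dimensional embedding.

    neighbors: list of (x_j, emb_j) pairs for the nearest neighbors of cell i.
    sigma: hypothetical kernel width (an assumption, not a dynamo argument).
    """
    # Similarity between the velocity and the displacement to each neighbor.
    sims = [cosine([a - b for a, b in zip(x_j, x_i)], v_i) for x_j, _ in neighbors]

    # A softmax over the neighbors turns similarities into transition probabilities.
    weights = [math.exp(s / sigma) for s in sims]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Unit displacement vectors towards each neighbor in the embedding.
    units = []
    for _, emb_j in neighbors:
        d = [a - b for a, b in zip(emb_j, emb_i)]
        norm = math.sqrt(sum(c * c for c in d))
        units.append([c / norm if norm > 0 else 0.0 for c in d])

    # Expected displacement minus the uniform-probability baseline.
    k = len(neighbors)
    vel = [
        sum((p - 1.0 / k) * u[c] for p, u in zip(probs, units))
        for c in range(len(emb_i))
    ]
    return probs, vel


# Toy cell with one neighbor along its velocity and one against it.
x_i, v_i, emb_i = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0]
neighbors = [
    ([1.0, 0.0, 0.0], [1.0, 0.0]),    # displacement parallel to the velocity
    ([-1.0, 0.0, 0.0], [-1.0, 0.0]),  # displacement opposite to the velocity
]
probs, vel = project_velocity(x_i, v_i, emb_i, neighbors)
```

In this toy case nearly all transition probability goes to the neighbor lying along the velocity, so the projected vector points towards that neighbor in the embedding. A correlation kernel of this kind uses only directions; the distance information that the Itô kernel additionally takes into account is not modeled here.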

With the key argument, cell_velocities can be called by cell_accelerations or cell_curvature to calculate the RNA acceleration/curvature vectors for each cell.

Parameters
  • adata (AnnData) – An AnnData object.

  • ekey (str or None (optional, default None)) – The dictionary key that corresponds to the gene expression in the layers attribute. By default, ekey and vkey will be automatically detected from the adata object.

  • vkey (str or None (optional, default None)) – The dictionary key that corresponds to the estimated velocity values in the layers attribute.

  • X (ndarray or csr_matrix or None (optional, default None)) – The expression states of single cells (or expression states in reduced dimension, like pca, of single cells)

  • V (ndarray or csr_matrix or None (optional, default None)) – The RNA velocity of single cells (or velocity estimates projected to reduced dimension, like pca, of single cells). Note that X and V need to have exactly the same dimensionality.

  • X_embedding (str or None (optional, default None)) – The low dimensional embedding (pca, umap, tsne, etc.) of single cells that RNA velocity will be projected onto. Note that X_embedding, X and V must have the same cell/sample dimension, and X_embedding should have fewer feature dimensions than X or V.

  • use_mnn (bool (optional, default False)) – Whether to use mutual nearest neighbors for projecting the high dimensional velocity vectors. By default, mutual nearest neighbors are not used. Mutual nearest neighbors are calculated from nearest neighbors across different layers, which accounts for cases where, for example, cells that are nearest neighbors on spliced expression are far from being nearest neighbors on unspliced data. Using mnn assumes that your data from different layers are reliable (otherwise it will destroy real signals).

  • n_pca_components (int (optional, default None)) – The number of pca components onto which the high dimensional X and V are projected before calculating the transition matrix for velocity visualization. By default it is None; if method is kmc, n_pca_components will be reset to 30, otherwise all high dimensional data are used for velocity projection.

  • transition_genes (str, list, or None (optional, default None)) – The set of genes used for projection of high dimensional velocity vectors. If None, transition genes are determined based on the R2 of linear regression on phase planes. The argument can be either a dictionary key of .var, a list of gene names, or a list of booleans of length .n_vars.

  • min_r2 (float (optional, default 0.01)) – The minimal value of r-squared of the parameter fits for selecting transition genes.

  • min_alpha (float (optional, default 0.01)) – The minimal value of alpha kinetic parameter for selecting transition genes.

  • min_gamma (float (optional, default 0.01)) – The minimal value of gamma kinetic parameter for selecting transition genes.

  • min_delta (float (optional, default 0.01)) – The minimal value of delta kinetic parameter for selecting transition genes.

  • basis (str (optional, default umap)) – The dictionary key that corresponds to the reduced dimension in .obsm attribute.

  • neigh_key (str (optional, default neighbors)) – The dictionary key for the neighbor information (stores nearest neighbor indices) in .uns.

  • adj_key (str (optional, default distances)) – The dictionary key for the adjacency matrix of the nearest neighbor graph in .obsp.

  • method (str (optional, default pearson)) – The method used to calculate the transition matrix and project high dimensional vectors to low dimension: either kmc, cosine, pearson, or transform. “kmc” is our new approach, which learns the transition matrix via diffusion approximation or an Itô kernel. “cosine” and “pearson” are the methods used in the original RNA velocity paper and the scVelo paper (note that the scVelo implementation actually centers both dX and V, so its cosine kernel is equivalent to a pearson correlation kernel, but we also provide the raw cosine kernel). The “kmc” option is arguably better than “pearson” or “cosine” because it considers not only the correlation but also the distance of the nearest neighbors to the high dimensional velocity vector. Finally, the “transform” method uses umap’s transform method to transform new data points to the UMAP space. The “transform” method is NOT recommended. Kernels that are based on the reconstructed vector field in high dimension are also possible.

  • neg_cells_trick (bool (optional, default True)) – Whether to separately handle cells whose gene expression differences are negatively correlated with the high dimensional velocity vector. This option was borrowed from the scVelo package (https://github.com/theislab/scvelo) and is used in conjunction with the “pearson” and “cosine” kernels. Not required if method is set to “kmc”.

  • calc_rnd_vel (bool (default: False)) – A flag that determines whether to calculate random velocity vectors, which can be plotted downstream as a negative control and used to adjust the quiver scale of the velocity field.

  • xy_grid_nums (tuple (default: (50, 50))) – A tuple of number of grids on each dimension.

  • correct_density (bool (default: True)) – Whether to correct density when calculating the markov transition matrix, applicable to the kmc kernel.

  • scale (bool (default: True)) – Whether to scale velocity when calculating the markov transition matrix, applicable to the kmc kernel.

  • sample_fraction (None or float (default: None)) – The downsampled fraction of kNN for the purpose of acceleration, applicable to the kmc kernel.

  • random_seed (int (default: 19491001)) – The random seed for numba to ensure consistency of the random velocity vectors. Default value 19491001 is a special day for those who care.

  • key (str or None (default: None)) – The prefix that will be prepended to the keys used for storing the calculated transition matrix, projection vectors, etc.

  • preserve_len (bool (default: False)) – Whether to preserve the length of the high dimensional vector. When set to True, the length of the low dimensional projected vector will be scaled proportionally to that of the high dimensional vector.

  • other_kernels_dict (dict (default: {})) – A dictionary of parameters that will be passed to the cosine/correlation kernel.

  • enforce (bool (default: False)) – Whether to enforce 1) redefining the use_for_transition column in the obs attribute (this is NOT executed if the argument transition_genes is not None), and 2) recalculation of the transition matrix.
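
The note under method that a cosine kernel applied to centered dX and V is equivalent to a pearson correlation kernel can be checked numerically. The following is a small, self-contained sketch with made-up vectors; the helper names (cosine, center, pearson) are illustrative and are not part of dynamo.

```python
import math
import statistics


def cosine(a, b):
    """Raw cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def center(a):
    """Subtract the mean from every entry."""
    mean = sum(a) / len(a)
    return [x - mean for x in a]


def pearson(a, b):
    """Pearson correlation computed from covariance and standard deviations."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return cov / (statistics.pstdev(a) * statistics.pstdev(b))


# Made-up expression difference (dX) and velocity (V) over four genes.
dX = [1.0, 2.0, 4.0, 3.0]
V = [2.0, 1.0, 5.0, 4.0]

raw_cosine = cosine(dX, V)
centered_cosine = cosine(center(dX), center(V))
pearson_corr = pearson(dX, V)
```

Here centered_cosine and pearson_corr agree up to floating point error, while raw_cosine differs because the vectors have nonzero means. This is why the raw cosine kernel and the pearson kernel can give different transition probabilities.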

Returns

adata – Returns an updated AnnData with projected velocity vectors, and a cell transition matrix calculated using either the Itô kernel method or similar methods from (La Manno et al. 2018).

Return type

AnnData
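
A minimal usage sketch follows. It assumes that adata already contains estimated velocities and a umap embedding from earlier steps of the analysis; the wrapper name project_velocities is hypothetical, and only arguments listed in the signature above are passed.

```python
def project_velocities(adata, basis="umap", method="pearson"):
    """Project velocities stored in `adata` onto the embedding named by `basis`.

    Assumes `adata` already holds velocity estimates and the requested
    embedding. This wrapper is illustrative and not part of dynamo.
    """
    # Imported lazily so that this sketch can be loaded without dynamo installed.
    import dynamo as dyn

    # Per the documentation above, this returns the updated AnnData object.
    return dyn.tl.cell_velocities(adata, basis=basis, method=method)
```

Calling project_velocities(adata, method="kmc") would select the Itô kernel instead of the default pearson kernel.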