dynamo.pl.topography

dynamo.pl.topography(adata, basis='umap', x=0, y=1, color='ntr', layer='X', highlights=None, labels=None, values=None, theme=None, cmap=None, color_key=None, color_key_cmap=None, background='white', ncols=4, pointsize=None, figsize=(6, 4), show_legend='on data', use_smoothed=True, xlim=None, ylim=None, t=None, terms=('streamline', 'fixed_points'), init_cells=None, init_states=None, quiver_source='raw', fate='both', approx=False, quiver_size=None, quiver_length=None, density=1, linewidth=1, streamline_color=None, streamline_alpha=0.4, color_start_points=None, markersize=200, marker_cmap=None, save_show_or_return='show', save_kwargs={}, aggregate=None, show_arrowed_spines=False, ax=None, sort='raw', frontier=False, s_kwargs_dict={}, q_kwargs_dict={}, **streamline_kwargs_dict)

Plot the streamlines, fixed points (attractors / saddles), nullclines and separatrices of a recovered dynamic system for single cells. The plot is created in two-dimensional space.

The topography function plots the full vector field topology, including streamlines, fixed points and characteristic lines. A key difference between dynamo and Velocyto or scVelo is that we learn a functional form of the vector field, which can be used to predict cell fate over arbitrary time and space. For states near observed cells, it retrieves the key kinetics from the observed data and smooths them. When the time and state are far from your observed single-cell RNA-seq data, the accuracy of prediction will decay. The vector field can be efficiently reconstructed in high-dimensional space or in a lower-dimensional pca/umap space. Since we learn a vector field function, we can plot the full vector field via streamlines on the entire domain, as well as predict cell fates by providing a set of initial cell states (via init_cells, init_states).

The nullclines and separatrices provide topological information about the reconstructed vector field. By definition, the x-nullcline (y-nullcline) is the set of points in the phase plane where dx/dt = 0 (dy/dt = 0). Geometrically, on the x-nullcline the vectors point straight up or straight down, while on the y-nullcline they point straight left or straight right. Algebraically, writing dx/dt = f(x, y), we find the x-nullcline by solving f(x, y) = 0. The boundary between different attractor basins is the separatrix, because it separates the phase plane into subregions with distinct behavior. Finding separatrices is a very difficult problem, and the separatrix calculated by dynamo requires manual inspection.
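
As a toy illustration of these definitions (not dynamo's implementation), consider the hypothetical bistable system dx/dt = x - x^3, dy/dt = -y. Its x-nullclines are the vertical lines x = -1, 0, 1, its y-nullcline is the line y = 0, and the line x = 0 acts as the separatrix between the basins of the attractors at (-1, 0) and (1, 0). The sketch below checks the nullcline condition and predicts fates from two initial states with plain forward Euler integration; the function names, step size and integration time are assumptions made for the example.

```python
def vector_field(x, y):
    """Toy 2D vector field: dx/dt = x - x**3, dy/dt = -y."""
    return x - x**3, -y


def predict_fate(x, y, t_end=20.0, dt=0.01):
    """Integrate an initial state forward in time with explicit Euler steps."""
    for _ in range(int(t_end / dt)):
        dx, dy = vector_field(x, y)
        x, y = x + dt * dx, y + dt * dy
    return x, y


# On the x-nullcline (here x = 1) the x-component vanishes, so the
# vector points straight up or straight down.
u, v = vector_field(1.0, 0.5)

# Initial states on either side of the separatrix x = 0 end up in
# different attractors.
left = predict_fate(-0.1, 1.0)
right = predict_fate(0.1, 1.0)
```

Here `left` converges toward the attractor at (-1, 0) and `right` toward the one at (1, 0), even though the two initial states are close to each other.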

Here are more details on the fixed points drawn on the vector field. Fixed points are a concept from dynamical systems theory. There are three types of fixed points: 1) repellers: repelling states that only have outflows, which may correspond to a pluripotent cell state (ESC) that tends to differentiate into other cell states spontaneously or under small perturbation; 2) unstable fixed points or saddle points: states that attract along some dimensions (genes or reduced dimensions) but diverge along at least one other dimension. Saddles may correspond to progenitors, which are differentiated from ESC/pluripotent cells and relatively stable, but can further differentiate into multiple terminal cell types / states; 3) stable fixed points or attractors, which only have inflows and attract all nearby cell states. These may correspond to stable cell types, which can only be kicked out of their state under extreme perturbation or in very rare situations. Fixed points are numbered, and each number is color coded. The mapping from the color of the number to the type of fixed point is: red: repellers; blue: saddle points; black: attractors. The scatter point itself also has a fill color, which corresponds to the confidence of the estimated fixed point: the lighter the color, the more confident the estimate, i.e. the closer the fixed point is to the sequenced single cells. The confidence of each fixed point can be used in conjunction with Jacobian analysis for investigating regulatory networks with spatiotemporal resolution.
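
The three types can be told apart from the eigenvalues of the Jacobian evaluated at the fixed point. The following is a minimal sketch for the 2D case, assuming the Jacobian is given as a nested tuple; `classify_fixed_point` and `NUMBER_COLOR` are hypothetical names for this illustration and do not describe how dynamo classifies fixed points internally.

```python
import cmath


def classify_fixed_point(jacobian):
    """Classify a 2D fixed point from the eigenvalues of its Jacobian.

    jacobian is ((a, b), (c, d)), evaluated at the fixed point.
    """
    (a, b), (c, d) = jacobian
    trace, det = a + d, a * d - b * c
    # Eigenvalues of a 2x2 matrix: (trace +/- sqrt(trace^2 - 4 det)) / 2.
    root = cmath.sqrt(trace * trace - 4 * det)
    real_parts = [((trace + root) / 2).real, ((trace - root) / 2).real]
    if all(r > 0 for r in real_parts):
        return "repeller"    # only outflows
    if all(r < 0 for r in real_parts):
        return "attractor"   # only inflows
    if min(real_parts) < 0 < max(real_parts):
        return "saddle"      # attracts along one direction, diverges along another
    return "undetermined"    # a zero real part: the linearization is inconclusive


# Colors of the fixed point numbers, as described in the text above.
NUMBER_COLOR = {"repeller": "red", "saddle": "blue", "attractor": "black"}

kind = classify_fixed_point(((1.0, 0.0), (0.0, -1.0)))
color = NUMBER_COLOR[kind]
```

For the diagonal Jacobian in the usage lines, one eigenvalue is positive and one is negative, so the point is a saddle and its number would be drawn in blue.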

By default, we plot a figure with three subplots, each coloring cells by potential, curl or divergence. Potential is related to the intrinsic time, where a small potential corresponds to a small intrinsic time and vice versa. Divergence can be used to indicate the state each cell is in: negative values correspond to potential sinks, while positive values correspond to potential sources (https://en.wikipedia.org/wiki/Divergence). Curl may be related to the cell cycle or other cycling cell dynamics. In 2D, negative values correspond to clockwise rotation, while positive values correspond to anticlockwise rotation (https://www.khanacademy.org/math/multivariable-calculus/greens-theorem-and-stokes-theorem/formal-definitions-of-divergence-and-curl/a/defining-curl). In conjunction with the cell cycle score (dyn.pp.cell_cycle_scores), curl can be used to identify cells under active cell cycle progression.
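
To make the sign conventions concrete, the sketch below estimates divergence and the scalar 2D curl of two toy vector fields with central finite differences. The fields, the evaluation point and the step `h` are assumptions chosen for the illustration; this is not how dynamo computes these quantities.

```python
def divergence_and_curl(field, x, y, h=1e-5):
    """Central-difference divergence and scalar curl of a 2D field.

    field(x, y) returns the velocity components (u, v).
    """
    du_dx = (field(x + h, y)[0] - field(x - h, y)[0]) / (2 * h)
    du_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    dv_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dv_dy = (field(x, y + h)[1] - field(x, y - h)[1]) / (2 * h)
    # divergence = du/dx + dv/dy, curl = dv/dx - du/dy
    return du_dx + dv_dy, dv_dx - du_dy


def sink(x, y):
    return -x, -y   # everything flows toward the origin


def rotation(x, y):
    return -y, x    # anticlockwise rotation around the origin


div_sink, curl_sink = divergence_and_curl(sink, 0.3, -0.2)
div_rot, curl_rot = divergence_and_curl(rotation, 0.3, -0.2)
```

The sink has negative divergence and no curl, while the anticlockwise rotation has positive curl and no divergence, matching the conventions stated above.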

Parameters
  • adata (AnnData) – An AnnData object.

  • basis (str) – The reduced dimension.

  • x (int (default: 0)) – The column index of the low dimensional embedding for the x-axis.

  • y (int (default: 1)) – The column index of the low dimensional embedding for the y-axis.

  • color (string (default: ntr)) – Any column names or gene expression, etc. that will be used for coloring cells.

  • layer (str (default: X)) – The layer of data to use for the scatter plot.

  • highlights (list (default: None)) – Which color groups will be highlighted. If highlights is a list of lists, each inner list corresponds to one color element.

  • labels (array, shape (n_samples,) (optional, default None)) – An array of labels (assumed integer or categorical), one for each data sample. This will be used for coloring the points in the plot according to their label. Note that this option is mutually exclusive to the values option.

  • values (array, shape (n_samples,) (optional, default None)) – An array of values (assumed float or continuous), one for each sample. This will be used for coloring the points in the plot according to a colorscale associated to the total range of values. Note that this option is mutually exclusive to the labels option.

  • theme (string (optional, default None)) –

    A color theme to use for plotting. A small set of predefined themes are provided which have relatively good aesthetics. Available themes are:

    • ’blue’

    • ’red’

    • ’green’

    • ’inferno’

    • ’fire’

    • ’viridis’

    • ’darkblue’

    • ’darkred’

    • ’darkgreen’

  • cmap (string (optional, default 'Blues')) – The name of a matplotlib colormap to use for coloring or shading points. If no labels or values are passed, this will be used for shading points according to density (largely only of relevance for very large datasets). If values are passed, this will be used for shading according to the value. Note that if theme is passed then this value will be overridden by the corresponding option of the theme.

  • color_key (dict or array, shape (n_categories) (optional, default None)) – A way to assign colors to categoricals. This can either be an explicit dict mapping labels to colors (as strings of form ‘#RRGGBB’), or an array like object providing one color for each distinct category being provided in labels. Either way this mapping will be used to color points according to the label. Note that if theme is passed then this value will be overridden by the corresponding option of the theme.

  • color_key_cmap (string (optional, default 'Spectral')) – The name of a matplotlib colormap to use for categorical coloring. If an explicit color_key is not given a color mapping for categories can be generated from the label list and selecting a matching list of colors from the given colormap. Note that if theme is passed then this value will be overridden by the corresponding option of the theme.

  • background (string or None (optional, default 'white')) – The color of the background. Usually this will be either ‘white’ or ‘black’, but any color name will work. Ideally one wants to match this appropriately to the colors being used for points etc. This is one of the things that themes handle for you. Note that if theme is passed then this value will be overridden by the corresponding option of the theme.

  • ncols (int (optional, default 4)) – Number of columns for the figure.

  • pointsize (None or float (default: None)) – The scale of the point size. The actual point size is calculated as 500.0 / np.sqrt(adata.shape[0]) * pointsize.

  • figsize (None or [float, float] (default: (6, 4))) – The width and height of a figure.

  • show_legend (bool (optional, default True)) – Whether to display a legend of the labels

  • use_smoothed (bool (optional, default True)) – Whether to use smoothed values (i.e. M_s / M_u instead of spliced / unspliced, etc.).

  • aggregate (str or None (default: None)) – The column in adata.obs that will be used to aggregate data points.

  • show_arrowed_spines (bool (optional, default False)) – Whether to show a pair of arrowed spines representing the basis that the scatter plot is currently using.

  • ax (Matplotlib Axis instance) – The matplotlib axes object where new plots will be added to. Only applicable to drawing a single component.

  • sort (str (optional, default raw)) – The method used to reorder data so that points with high values are drawn on top of background points. Can be one of {‘raw’, ‘abs’, ‘neg’}, i.e. sort by raw values, by absolute values or by negative values.

  • save_show_or_return ({‘show’, ‘save’, ‘return’} (default: show)) – Whether to save, show or return the figure.

  • save_kwargs (dict (default: {})) – A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: ‘scatter’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that modifies those keys according to your needs.

  • return_all (bool (default: False)) – Whether to return all the scatter related variables. Default is False.

  • add_gamma_fit (bool (default: False)) – Whether to add the line of the gamma fitting. This will be turned on automatically if basis points to gene names and those genes have gone through gamma fitting.

  • frontier (bool (default: False)) – Whether to add the frontier. Scatter plots can be enhanced by using transparency (alpha) in order to show areas of high density, and multiple scatter plots can be used to delineate a frontier. See the matplotlib tips & tricks cheatsheet (https://github.com/matplotlib/cheatsheets). Originally inspired by figures from the scEU-seq paper: https://science.sciencemag.org/content/367/6482/1151. If contour is set to True, frontier will be ignored, as contour also adds an outline around the data points.

  • contour (bool (default: False)) – Whether to add a contour on top of scatter plots. We use tricontourf to plot contours for non-gridded data. The shapely package is used to create a polygon of the concave hull of the scatter points. With the polygon, we then check whether the mean of the triangulated points is within the polygon and use this as the condition to form the mask for the contour. We also add the polygon shape as a frontier of the data points (similar to setting frontier = True). When the color of the data points is continuous, we use the same cmap as for the scatter points by default; when the color is categorical, no contour is drawn, only the polygon. The cmap can be set with the ccmap argument. See below.

  • ccmap (str or None (default: None)) – The name of a matplotlib colormap to use for coloring or shading the contour. See above.

  • calpha (float (default: 2.3)) – alpha value for identifying the alpha hull to influence the gooeyness of the border. Smaller numbers don’t fall inward as much as larger numbers. Too large, and you lose everything!

  • sym_c (bool (default: False)) – Whether to make the limits of the continuous color scale symmetric. Normally this should be used for plotting velocity, jacobian, curl, divergence or other types of data with both positive and negative values.

  • smooth (bool or int (default: False)) – Whether to further smooth the data, and by how much. If False, no smoothing will be applied. If True, smoothing based on one step of diffusion of the connectivity matrix (.uns[‘moment_cnn’]) will be applied. If a number larger than 1, smoothing will be based on smooth steps of diffusion.

  • dpi (float, (default: 100.0)) – The resolution of the figure in dots-per-inch. Dots per inch (dpi) determines how many pixels the figure comprises. dpi is different from ppi, or points per inch. Note that most elements like lines, markers and texts have a size given in points, so you can convert points to inches. Matplotlib figures use a ppi of 72. A line with a thickness of 1 point will be 1./72. inch wide. A text with a fontsize of 12 points will be 12./72. inch high. Of course, if you change the figure size in inches, points will not change, so a larger figure in inches still has elements of the same size. Changing the figure size is thus like taking a piece of paper of a different size; doing so would of course not change the width of a line drawn with the same pen. On the other hand, changing the dpi scales those elements. At 72 dpi, a line of 1 point size is one pixel wide. At 144 dpi, this line is 2 pixels wide. A larger dpi will therefore act like a magnifying glass: all elements are scaled by the magnifying power of the lens. See more details in answer 2 by @ImportanceOfBeingErnest: https://stackoverflow.com/questions/47633546/relationship-between-dpi-and-figure-size

  • inset_dict (dict (default: {})) – A dictionary of parameters in inset_ax. Example, something like {“width”: “5%”, “height”: “50%”, “loc”: ‘lower left’, “bbox_to_anchor”: (0.85, 0.90, 0.145, 0.145), “bbox_transform”: ax.transAxes, “borderpad”: 0} See more details at https://matplotlib.org/api/_as_gen/mpl_toolkits.axes_grid1.inset_locator.inset_axes.html or https://stackoverflow.com/questions/39803385/what-does-a-4-element-tuple-argument-for-bbox-to-anchor-mean-in-matplotlib

  • kwargs – Additional arguments passed to plt.scatter.

  • xlim (numpy.ndarray) – The range of x-coordinate

  • ylim (numpy.ndarray) – The range of y-coordinate

  • t (t_end: float (default 1)) – The length of the time period from which to predict cell state forward or backward over time. This is used by the odeint function.

  • terms (tuple (default: (‘streamline’, ‘fixed_points’))) –

    A tuple of plotting items to include in the final topography figure. (‘streamline’, ‘nullcline’, ‘fixed_points’, ‘separatrix’, ‘trajectory’, ‘quiver’) are all the items that we can support.

  • init_cells (list (default: None)) – Cell names or indices of the initial cell states for the historical or future cell state prediction with numerical integration. If the names in init_cells are not found in adata.obs_names, they will be treated as cell indices and must be integers.

  • init_states (numpy.ndarray (default: None)) – Initial cell states for the historical or future cell state prediction with numerical integration. It can be either a one-dimensional array or an N x 2 array. init_states will be replaced by the states defined by init_cells if init_cells is not None.

  • quiver_source (numpy.ndarray {‘raw’, ‘reconstructed’} (default: None)) – The data source that will be used to draw the quiver plot. If init_cells is provided, this will automatically be set to the projected RNA velocity before vector field reconstruction. If init_cells is not provided, this will automatically be set to the velocity vectors calculated from the reconstructed vector field function. If quiver_source is reconstructed, the velocity vectors calculated from the reconstructed vector field function will be used.

  • fate (str {“history”, ‘future’, ‘both’} (default: both)) – Predict the historical, future or both cell fates. This corresponds to integrating the trajectory in the backward, forward or both directions defined by the reconstructed vector field function. Default is ‘both’.

  • approx (bool (default: False)) – Whether to use streamplot to draw the integration line from the init_state.

  • quiver_size (float or None (default: None)) – The size of the quiver. If None, quiver_size is set to 1. Note that quiver_size is used to calculate the head_width (10 x quiver_size), head_length (12 x quiver_size) and headaxislength (8 x quiver_size) of the quiver. This is done via the default_quiver_args function, which also calculates the scale of the quiver (1 / quiver_length).

  • quiver_length (float or None (default: None)) – The length of the quiver, which is used to calculate the scale of the quiver. Note that before applying default_quiver_args, velocity values are first rescaled via the quiver_autoscaler function. The scale of the quiver indicates the number of data units per arrow length unit, e.g., m/s per plot width; a smaller scale parameter makes the arrow longer.

  • density (float or None (default: 1)) – density of the plt.streamplot function.

  • linewidth (float or None (default: 1)) – multiplier of automatically calculated linewidth passed to the plt.streamplot function.

  • streamline_color (str or None (default: None)) – The color of the vector field stream lines.

  • streamline_alpha (float or None (default: 0.4)) – The alpha value applied to the vector field stream lines.

  • color_start_points (float or None (default: None)) – The color of the starting point that will be used to predict cell fates.

  • markersize (float (default: 200)) – The size of the marker.

  • marker_cmap (string (optional, default 'Blues')) – The name of a matplotlib colormap to use for coloring or shading the confidence of fixed points. If None, the default color map will be set to viridis (inferno) when the background is white (black).

  • save_show_or_return – Whether to save, show or return the figure.

  • save_kwargs – A dictionary that will be passed to the save_fig function. By default it is an empty dictionary and the save_fig function will use {“path”: None, “prefix”: ‘topography’, “dpi”: None, “ext”: ‘pdf’, “transparent”: True, “close”: True, “verbose”: True} as its parameters. Otherwise you can provide a dictionary that modifies those keys according to your needs.

  • aggregate – The column in adata.obs that will be used to aggregate data points.

  • show_arrowed_spines – Whether to show a pair of arrowed spines representing the basis that the scatter plot is currently using.

  • ax – Axis on which to make the plot

  • frontier – Whether to add the frontier. Scatter plots can be enhanced by using transparency (alpha) in order to show area of high density and multiple scatter plots can be used to delineate a frontier. See matplotlib tips & tricks cheatsheet (https://github.com/matplotlib/cheatsheets). Originally inspired by figures from scEU-seq paper: https://science.sciencemag.org/content/367/6482/1151.

  • s_kwargs_dict (dict (default: {})) – The dictionary of the scatter arguments.

  • q_kwargs_dict – Additional parameters that will be passed to plt.quiver function

  • streamline_kwargs_dict – Additional parameters that will be passed to the plt.streamplot function.
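
The sizing rules quoted in the parameter list (the point size formula, the quiver head multipliers, and the points-to-pixels relation described for dpi) can be reproduced with a few lines of arithmetic. The helpers below are hypothetical and only mirror the formulas stated above; they are not dynamo functions, the dictionary keys simply reuse the names from the description, and defaulting pointsize to 1.0 is an assumption.

```python
import math


def scatter_point_size(n_cells, pointsize=1.0):
    """Point size from the formula 500.0 / sqrt(n_cells) * pointsize."""
    return 500.0 / math.sqrt(n_cells) * pointsize


def quiver_head_args(quiver_size=1.0, quiver_length=1.0):
    """Quiver head geometry and scale, using the multipliers quoted above."""
    return {
        "head_width": 10 * quiver_size,
        "head_length": 12 * quiver_size,
        "headaxislength": 8 * quiver_size,
        "scale": 1 / quiver_length,
    }


def points_to_pixels(points, dpi):
    """Convert a size in points (1/72 inch) to pixels at a given dpi."""
    return points * dpi / 72.0


size = scatter_point_size(10000)        # 500 / 100 * 1.0
heads = quiver_head_args(2.0, 4.0)
thin = points_to_pixels(1, 72)          # a 1 point line at 72 dpi
thick = points_to_pixels(1, 144)        # the same line at 144 dpi
```

With 10,000 cells and a pointsize of 1, the point size is 5.0; doubling the dpi from 72 to 144 doubles the pixel width of a 1 point line from 1 to 2 pixels.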

Returns

  • Plots the streamlines, fixed points (attractors / saddles), nullclines and separatrices of a recovered dynamic system for single cells, or returns the corresponding axis, depending on the save_show_or_return argument.

See also: pp.cell_cycle_scores()
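
A possible way to call the function, using only keyword arguments documented above. This is a sketch rather than a tested recipe: it assumes dynamo is installed and that a vector field has already been reconstructed for the chosen basis of adata. The wrapper name `plot_topography`, the argument values and the cell indices are illustrative choices, not recommended defaults.

```python
# Keyword arguments taken from the parameter list above; the particular
# values are illustrative choices.
TOPOGRAPHY_KWARGS = {
    "basis": "umap",
    "terms": ("streamline", "fixed_points", "trajectory"),
    "init_cells": [0, 1],              # integer cell indices (hypothetical)
    "fate": "future",                  # predict future states only
    "save_show_or_return": "return",   # return the axis instead of showing it
}


def plot_topography(adata, **overrides):
    """Call dynamo.pl.topography with the settings above.

    Assumes dynamo is installed and that a vector field has already been
    reconstructed for the chosen basis of `adata`.
    """
    import dynamo as dyn  # imported lazily so this module loads without dynamo

    kwargs = {**TOPOGRAPHY_KWARGS, **overrides}
    return dyn.pl.topography(adata, **kwargs)
```

Keeping the settings in a dictionary makes it easy to override single options per call, for example `plot_topography(adata, fate="both")`.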