Class 

standardize_adata(adata, tkey, experiment_type)[source]

Process the AnnData object to make it meet the standards of dynamo.

The index of the observations is made unique. Layers stored as sparse matrices are converted to compressed csr_matrix. DKM.allowed_layer_raw_names() is used to define the only_splicing, only_labeling and splicing_labeling keys. The genes are renamed to their official names.

Parameters:

adata (AnnData) – an AnnData object.
tkey (str) – the key for time information (labeling time period for the cells) in .obs.
experiment_type (str) – the experiment type.

Return type:

preprocess_adata_seurat_wo_pca(adata, tkey=None, experiment_type=None)[source]

Preprocess the AnnData object following the standard Seurat recipe, but without PCA. This can be used to test different dimension reduction methods.

Return type:: None

config_monocle_recipe(adata, n_top_genes=2000)[source]

Automatically configure the preprocessor for monocle recipe.

Parameters:

adata (AnnData) – an AnnData object.
n_top_genes (int) – Number of top feature genes to select in the preprocessing step. Defaults to 2000.

Return type:

preprocess_adata_monocle(adata, tkey=None, experiment_type=None)[source]

Preprocess the AnnData object based on Monocle style preprocessing recipe.

Parameters:

adata (AnnData) – an AnnData object.
tkey (Optional[str]) – the key for time information (labeling time period for the cells) in .obs. Defaults to None.
experiment_type (Optional[str]) – the experiment type of the data. If not provided, would be inferred from the data.

Return type:

config_seurat_recipe(adata)[source]

Automatically configure the preprocessor for using the seurat style recipe.

Parameters:: adata (AnnData) – an AnnData object.
Return type:: None

preprocess_adata_seurat(adata, tkey=None, experiment_type=None)[source]

The preprocess pipeline in Seurat based on dispersion, implemented by dynamo authors.

Stuart and Butler et al. Comprehensive Integration of Single-Cell Data. Cell (2019)
Butler et al. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol

Parameters:

adata (AnnData) – an AnnData object
tkey (Optional[str]) – the key for time information (labeling time period for the cells) in .obs. Defaults to None.
experiment_type (Optional[str]) – the experiment type of the data. If not provided, would be inferred from the data.

Return type:

config_sctransform_recipe(adata)[source]

Automatically configure the preprocessor for using the sctransform style recipe.

Parameters:: adata (AnnData) – an AnnData object.
Return type:: None

preprocess_adata_sctransform(adata, tkey=None, experiment_type=None)[source]

Python implementation of https://github.com/satijalab/sctransform.

Hao and Hao et al. Integrated analysis of multimodal single-cell data. Cell (2021)

Parameters:

adata (AnnData) – an AnnData object
tkey (Optional[str]) – the key for time information (labeling time period for the cells) in .obs. Defaults to None.
experiment_type (Optional[str]) – the experiment type of the data. If not provided, would be inferred from the data. Defaults to None.

Return type:

config_pearson_residuals_recipe(adata)[source]

Automatically configure the preprocessor for using the Pearson residuals style recipe.

Parameters:: adata (AnnData) – an AnnData object.
Return type:: None

preprocess_adata_pearson_residuals(adata, tkey=None, experiment_type=None)[source]

A pipeline proposed in Pearson residuals (Lause, Berens & Kobak, 2021).

Lause, J., Berens, P. & Kobak, D. Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data. Genome Biol 22, 258 (2021). https://doi.org/10.1186/s13059-021-02451-7

Parameters:

adata (AnnData) – an AnnData object
tkey (Optional[str]) – the key for time information (labeling time period for the cells) in .obs. Defaults to None.
experiment_type (Optional[str]) – the experiment type of the data. If not provided, would be inferred from the data. Defaults to None.

Return type:

config_monocle_pearson_residuals_recipe(adata)[source]

Automatically configure the preprocessor for using the Monocle-Pearson-residuals style recipe.

Useful when you want to use Pearson residuals to obtain feature genes and perform PCA, while still using the standard size-factor normalization and log1p transformation to normalize data for RNA velocity and vector field analyses.

Parameters:: adata (AnnData) – an AnnData object.
Return type:: None

preprocess_adata_monocle_pearson_residuals(adata, tkey=None, experiment_type=None)[source]

A combined pipeline of monocle and pearson_residuals.

Results after running pearson_residuals can contain negative values, which is undesirable for later RNA velocity analysis. This function combines the pearson_residuals and monocle recipes: it uses Pearson residuals to obtain feature genes and perform PCA, and uses the monocle recipe to generate X_spliced, X_unspliced, X_new, X_total or other data values for RNA velocity and downstream vector field analyses.

Parameters:

adata (AnnData) – an AnnData object
tkey (Optional[str]) – the key for time information (labeling time period for the cells) in .obs. Defaults to None.
experiment_type (Optional[str]) – the experiment type of the data. If not provided, would be inferred from the data.

Return type:

preprocess_adata(adata, recipe='monocle', tkey=None, experiment_type=None)[source]

Preprocess the AnnData object with the recipe specified.

Parameters:

adata (AnnData) – An AnnData object.
recipe (Literal['monocle', 'seurat', 'sctransform', 'pearson_residuals', 'monocle_pearson_residuals']) – The recipe used to preprocess the data. Defaults to “monocle”.
tkey (Optional[str]) – the key for time information (labeling time period for the cells) in .obs. Defaults to None.
experiment_type (Optional[str]) – the experiment type of the data. If not provided, would be inferred from the data.

Raises:

NotImplementedError – the recipe is invalid.

Return type:
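
The recipe selection described above can be pictured as a simple lookup followed by a call. The sketch below is only an illustration of that dispatch pattern, not dynamo's implementation: the name preprocess_by_recipe, the RECIPES table and the stand-in recipe functions are hypothetical, and a plain dict stands in for an AnnData object.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the per-recipe functions; each one only records
# which recipe ran so that the dispatch logic can be exercised.
def _make_recipe(name: str) -> Callable[[dict], None]:
    def run(adata: dict) -> None:
        adata["recipe_used"] = name
    return run

RECIPES: Dict[str, Callable[[dict], None]] = {
    name: _make_recipe(name)
    for name in ("monocle", "seurat", "sctransform",
                 "pearson_residuals", "monocle_pearson_residuals")
}

def preprocess_by_recipe(adata: dict, recipe: str = "monocle") -> None:
    """Dispatch to the chosen recipe and reject unknown recipe names."""
    if recipe not in RECIPES:
        raise NotImplementedError(f"recipe '{recipe}' is not supported")
    RECIPES[recipe](adata)
```

Keeping the recipes in a table makes the set of valid names explicit, so an invalid name fails early with NotImplementedError instead of silently falling through.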

Estimation

Conventional scRNA-seq (est.csc)

class dynamo.est.csc.ss_estimation(U=None, Ul=None, S=None, Sl=None, P=None, US=None, S2=None, conn=None, t=None, ind_for_proteins=None, model='stochastic', est_method='gmm', experiment_type='deg', assumption_mRNA=None, assumption_protein='ss', concat_data=True, cores=1, **kwargs)

The class that estimates parameters with input data.

Parameters:

U (ndarray or sparse csr_matrix) – A matrix of unspliced mRNA count.
Ul (ndarray or sparse csr_matrix) – A matrix of unspliced, labeled mRNA count.
S (ndarray or sparse csr_matrix) – A matrix of spliced mRNA count.
Sl (ndarray or sparse csr_matrix) – A matrix of spliced, labeled mRNA count.
P (ndarray or sparse csr_matrix) – A matrix of protein count.
US (ndarray or sparse csr_matrix) – A matrix of second moment of unspliced/spliced gene expression count for conventional or NTR velocity.
S2 (ndarray or sparse csr_matrix) – A matrix of second moment of spliced gene expression count for conventional or NTR velocity.
conn (ndarray or sparse csr_matrix) – The connectivity matrix that can be used to calculate the first / second moments of the data.
t (ss_estimation) – A vector of time points.
ind_for_proteins (ndarray) – A 1-D vector of the indices in the U, Ul, S, Sl layers that corresponds to the row name in the protein or X_protein key of .obsm attribute.
experiment_type (str) – labelling experiment type. Available options are: (1) ‘deg’: degradation experiment; (2) ‘kin’: synthesis experiment; (3) ‘one-shot’: one-shot kinetic experiment; (4) ‘mix_std_stm’: a mixed steady state and stimulation labeling experiment.
assumption_mRNA (str) – Parameter estimation assumption for mRNA. Available options are: (1) ‘ss’: pseudo steady state; (2) None: kinetic data with no assumption.
assumption_protein (str) – Parameter estimation assumption for protein. Available options are: (1) ‘ss’: pseudo steady state.
concat_data (bool (default: True)) – Whether to concatenate data.
cores (int (default: 1)) – Number of cores to run the estimation. If cores is set to be > 1, multiprocessing will be used to parallelize the parameter estimation.

Returns:

t (ss_estimation) – A vector of time points.
data (dict) – A dictionary with uu, ul, su, sl, p as its keys.
extyp (str) – labelling experiment type.
asspt_mRNA (str) – Parameter estimation assumption for mRNA.
asspt_prot (str) – Parameter estimation assumption for protein.
parameters (dict) –

A dictionary with alpha, beta, gamma, eta, delta as its keys.
alpha: transcription rate
beta: RNA splicing rate
gamma: spliced mRNA degradation rate
eta: translation rate
delta: protein degradation rate

fit(intercept=False, perc_left=None, perc_right=5, clusters=None, one_shot_method='combined')

Fit the input data to estimate all or a subset of the parameters

Parameters:

intercept (bool) – If using steady state assumption for fitting, then: True – the linear regression is performed with an unfixed intercept; False – the linear regression is performed with a fixed zero intercept.
perc_left (float (default: None)) – The percentage of samples included in the linear regression in the left tail. If set to None, then all the samples are included.
perc_right (float (default: 5)) – The percentage of samples included in the linear regression in the right tail. If set to None, then all the samples are included.
clusters (list) – A list of n clusters, each element is a list of indices of the samples which belong to this cluster.

fit_gamma_steady_state(u, s, intercept=True, perc_left=None, perc_right=5, normalize=True)

Estimate gamma using linear regression based on the steady state assumption.

Parameters:

u (ndarray or sparse csr_matrix) – A matrix of unspliced mRNA counts. Dimension: genes x cells.
s (ndarray or sparse csr_matrix) – A matrix of spliced mRNA counts. Dimension: genes x cells.
intercept (bool) – If using steady state assumption for fitting, then: True – the linear regression is performed with an unfixed intercept; False – the linear regression is performed with a fixed zero intercept.
perc_left (float) – The percentage of samples included in the linear regression in the left tail. If set to None, then all the left samples are excluded.
perc_right (float) – The percentage of samples included in the linear regression in the right tail. If set to None, then all the samples are included.
normalize (bool) – Whether to first normalize the data.

Returns:

k: float: The slope of the linear regression model, which is gamma under the steady state assumption.
b: float: The intercept of the linear regression model.
r2: float: Coefficient of determination or r square for the extreme data points.
all_r2: float: Coefficient of determination or r square for all data points.
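
As an illustration of the steady state regression described above, the following sketch fits a zero-intercept line of unspliced against spliced counts for one gene, using only the cells in the right tail. It is a simplified stand-in, not dynamo's code: the function name fit_gamma_zero_intercept is hypothetical, and ranking cells by spliced count alone is an assumption about how the extreme cells are chosen.

```python
def fit_gamma_zero_intercept(u, s, perc_right=5.0):
    """Slope of u against s through the origin, using only the top
    `perc_right` percent of cells ranked by spliced count (an assumption)."""
    n = len(s)
    n_keep = max(1, round(n * perc_right / 100.0))
    # Indices of the cells with the largest spliced counts (right tail).
    top = sorted(range(n), key=lambda i: s[i])[-n_keep:]
    # Closed-form least squares slope for a line forced through zero.
    num = sum(u[i] * s[i] for i in top)
    den = sum(s[i] * s[i] for i in top)
    return num / den
```

With noiseless data such as u = 0.3 * s, the returned slope is 0.3 regardless of how many cells fall in the tail.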

fit_gamma_stochastic(est_method, u, s, us, ss, perc_left=None, perc_right=5, normalize=True)

Estimate gamma using GMM (generalized method of moments) or a negative binomial (negbin) distribution based on the steady state assumption.

Parameters:

est_method (str, one of {gmm, negbin}) – The estimation method to be used when using the stochastic model. Available options when the model is ‘ss’ include:
(1) ‘gmm’: the generalized method of moments from the dynamo authors, based on master equations and similar to the “moment” model in the scVelo package;
(2) ‘negbin’: the method from the dynamo authors that models steady state RNA expression as a negative binomial distribution, also built upon master equations.
All of these methods use extreme data points for estimation, except negbin, which uses all data points. Extreme data points are the data from cells whose expression of unspliced / spliced or new / total RNA, etc. is in the top or bottom 5%, for example. linear_regression only considers the mean of RNA species (based on the deterministic ordinary differential equations), while the moment-based methods (gmm, negbin) consider both the first moment (mean) and the second moment (uncentered variance) of RNA species (based on the stochastic master equations). All of the above are (generalized) linear regression based methods. In addition to the estimated parameters (including RNA half-life), the fit returns R-squared (either just for extreme data points or for all data points) as well as the log-likelihood of the fitting, which will be used for transition matrix and velocity embedding. All est_method options use least squares to estimate optimal parameters, with a Latin hypercube sampler for initial sampling.
u (ndarray or sparse csr_matrix) – A matrix of unspliced mRNA counts. Dimension: genes x cells.
s (ndarray or sparse csr_matrix) – A matrix of spliced mRNA counts. Dimension: genes x cells.
us (ndarray or sparse csr_matrix) – A matrix of the second moment of unspliced/spliced gene expression counts. Dimension: genes x cells.
ss (ndarray or sparse csr_matrix) – A matrix of the second moment of spliced gene expression counts. Dimension: genes x cells.
perc_left (float) – The percentage of samples included in the linear regression in the left tail. If set to None, then all the left samples are excluded.
perc_right (float) – The percentage of samples included in the linear regression in the right tail. If set to None, then all the samples are included.
normalize (bool) – Whether to first normalize the data.

Returns:

k: float: The slope of the linear regression model, which is gamma under the steady state assumption.
b: float: The intercept of the linear regression model.
r2: float: Coefficient of determination or r square for the extreme data points.
all_r2: float: Coefficient of determination or r square for all data points.
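
To make the first and second moments mentioned in the description concrete, the sketch below computes them for one gene from per-cell counts. It is only an illustration: the function name is hypothetical, and averaging over all cells is a simplification (moments may instead be averaged over each cell's neighbors through a connectivity matrix).

```python
def first_and_second_moments(u, s):
    """Return (mean_u, mean_s, us, ss) for one gene across cells.

    us is the uncentered mixed moment E[u * s]; ss is the uncentered
    second moment E[s * s]. Averaging over all cells is an assumption.
    """
    n = len(u)
    mean_u = sum(u) / n
    mean_s = sum(s) / n
    us = sum(a * b for a, b in zip(u, s)) / n
    ss = sum(b * b for b in s) / n
    return mean_u, mean_s, us, ss
```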

fit_beta_gamma_lsq(t, U, S)

Estimate beta and gamma with the degradation data using the least squares method.

Parameters:

t (ndarray) – A vector of time points.
U (ndarray) – A matrix of unspliced mRNA counts. Dimension: genes x cells.
S (ndarray) – A matrix of spliced mRNA counts. Dimension: genes x cells.

Returns:

beta: ndarray: A vector of betas for all the genes.
gamma: ndarray: A vector of gammas for all the genes.
u0: float: Initial value of u.
s0: float: Initial value of s.

fit_gamma_nosplicing_lsq(t, L)

Estimate gamma with the degradation data using the least squares method when there is no splicing data.

Parameters:

t (ndarray) – A vector of time points.
L (ndarray) – A matrix of labeled mRNA counts. Dimension: genes x cells.

Returns:

gamma: ndarray: A vector of gammas for all the genes.
l0: float: The estimated value for the initial spliced, labeled mRNA count.
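
One common way to fit a degradation curve without splicing is to assume first-order decay, l(t) = l0 * exp(-gamma * t), and run ordinary least squares on log(l). The sketch below shows that approach for a single gene. It is an assumption-laden illustration rather than the method used by this class: the function name is hypothetical, and it requires strictly positive counts.

```python
import math

def fit_decay_loglinear(t, l):
    """Fit l(t) = l0 * exp(-gamma * t) by least squares on log(l).

    Assumes every value in `l` is strictly positive.
    """
    y = [math.log(v) for v in l]
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx                      # slope of log(l) against t
    gamma = -slope                         # decay rate
    l0 = math.exp(y_mean - slope * t_mean) # intercept mapped back from log space
    return gamma, l0
```

Fitting in log space weights small counts more heavily than a direct least squares fit on l would, which is a trade-off to keep in mind with noisy data.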

solve_alpha_mix_std_stm(t, ul, beta, clusters=None, alpha_time_dependent=True)

Estimate the steady state transcription rate and analytically calculate the stimulation transcription rate given beta and steady state alpha for a mixed steady state and stimulation labeling experiment.

This approach assumes the same constant beta or gamma for both the steady state and the stimulation period.

Parameters:

t (list or numpy.ndarray) – Time period for stimulation state labeling for each cell.
ul – A vector of labeled RNA amount in each cell.
beta (numpy.ndarray) – A list of splicing rate for genes.
clusters (list) – A list of n clusters, each element is a list of indices of the samples which belong to this cluster.
alpha_time_dependent (bool) – Whether or not to model the stimulation alpha rate as a time dependent variable.

Returns:

alpha_std: numpy.ndarray: The constant steady state transcription rate.
alpha_stm: numpy.ndarray: The stimulation transcription rate, either time-dependent or time-independent (determined by alpha_time_dependent).

fit_alpha_oneshot(t, U, beta, clusters=None)

Estimate alpha with the one-shot data.

Parameters:

t (float) – labelling duration.
U (ndarray) – A matrix of unspliced mRNA counts. Dimension: genes x cells.
beta (ndarray) – A vector of betas for all the genes.
clusters (list) – A list of n clusters, each element is a list of indices of the samples which belong to this cluster.

Returns:

alpha: ndarray: A numpy array with the dimension of n_genes x clusters.
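
For intuition about one-shot estimation, the sketch below inverts the textbook labeling model du/dt = alpha - beta * u with u(0) = 0, whose solution is u(t) = alpha / beta * (1 - exp(-beta * t)). This is an assumed model chosen for illustration, and alpha_oneshot is a hypothetical name; the class method may handle clusters and other details differently.

```python
import math

def alpha_oneshot(u, beta, t):
    """Solve u = alpha / beta * (1 - exp(-beta * t)) for alpha.

    u: labeled unspliced amount after labeling for duration t (t > 0).
    beta: splicing rate (beta > 0).
    """
    return beta * u / (1.0 - math.exp(-beta * t))
```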

concatenate_data()

Concatenate available data into a single matrix.

See “concat_time_series_matrices” for details.

get_n_genes(key=None, data=None): Get the number of genes.

set_parameter(name, value)

Set the value for the specified parameter.

Parameters:

name (string) – The name of the parameter. E.g. ‘beta’.
value (ndarray) – A vector of values for the parameter to be set to.

get_exist_data_names(): Get the names of all the data that are not ‘None’.

dynamo.est.csc.velocity: alias of <module ‘dynamo.estimation.csc.velocity’ from ‘/home/docs/checkouts/readthedocs.org/user_builds/dynamo-release/checkouts/latest/dynamo/estimation/csc/velocity.py’>

Time-resolved metabolic labeling based scRNA-seq (est.tsc)

Base class: a general estimation framework

class dynamo.est.tsc.kinetic_estimation(param_ranges, x0_ranges, simulator)

A general parameter estimation framework for all types of time-series data.

Initialize the kinetic_estimation class.

Parameters:

param_ranges (ndarray) – A n-by-2 numpy array containing the lower and upper ranges of n parameters (and initial conditions if not fixed).
x0_ranges (ndarray) – Lower and upper bounds for initial conditions for the integrators. To fix a parameter, set its lower and upper bounds to the same value.
simulator (LinearODE) – An instance of python class which solves ODEs. It should have properties ‘t’ (k time points, 1d numpy array), ‘x0’ (initial conditions for m species, 1d numpy array), and ‘x’ (solution, k-by-m array), as well as two functions: integrate (numerical integration), solve (analytical method).

Returns:

An instance of the kinetic_estimation class.

sample_p0(samples=1, method='lhs')

Sample the initial parameters with either Latin Hypercube Sampling or random method.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.
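
Latin Hypercube Sampling splits each parameter range into as many equal strata as there are samples, draws one point per stratum, and shuffles the strata independently per dimension. The sketch below is a generic stdlib implementation written for illustration; the function name and the (lower, upper) bounds layout are assumptions and not this class's internals.

```python
import random

def latin_hypercube(bounds, samples, rng=None):
    """Draw `samples` points; `bounds` is a list of (lower, upper) pairs."""
    rng = rng or random.Random()
    columns = []
    for lo, hi in bounds:
        # One uniformly placed point inside each of `samples` equal strata.
        pts = [lo + (hi - lo) * (k + rng.random()) / samples
               for k in range(samples)]
        # Shuffle so strata are paired randomly across dimensions.
        rng.shuffle(pts)
        columns.append(pts)
    # Transpose: one row per sample, one column per parameter.
    return [list(row) for row in zip(*columns)]
```

Compared with plain uniform sampling, this guarantees that every stratum of every parameter is covered exactly once, which spreads initial guesses more evenly.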

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.
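
As a small illustration of log1p normalization, the sketch below applies log(1 + x) elementwise to a nested list. The function name is hypothetical and plain lists stand in for ndarrays.

```python
import math

def log1p_normalize(X):
    """Apply log(1 + x) to every entry of a 2-D nested list."""
    return [[math.log1p(v) for v in row] for row in X]
```

The transform maps zero counts to zero and compresses large values, so species with large counts dominate a least squares fit less.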

extract_data_from_simulator(t=None, **kwargs)

Extract data from the ODE simulator.

Parameters:

t (Optional[ndarray]) – The time information. If provided, the data will be integrated with time information.
kwargs – Additional keyword arguments.

Return type:

Returns:

The variable from ODE simulator.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.
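
The idea of assembling a full parameter vector from fixed and unfixed entries can be sketched as follows, using the convention that a parameter whose lower and upper bounds coincide is fixed. This is an illustrative reading, not the class's code: the function name and the in-order consumption of unfixed values are assumptions.

```python
def assemble_params(ranges, unfixed_params):
    """Merge fixed values (lower == upper) with optimizer-supplied ones.

    ranges: list of (lower, upper) pairs, one per parameter.
    unfixed_params: values for the non-fixed parameters, in order.
    """
    unfixed = iter(unfixed_params)
    full = []
    for lo, hi in ranges:
        # A degenerate range pins the parameter; otherwise take the next
        # value proposed by the optimizer.
        full.append(lo if lo == hi else next(unfixed))
    return full
```

For example, with ranges [(0, 10), (0.5, 0.5), (0, 1)] and unfixed values [3.0, 0.2], the middle parameter stays at 0.5 while the optimizer controls the other two.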

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method iteratively optimizes the parameters for different initial conditions (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.
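
The multi-start strategy described above (optimize from several initial guesses, keep the result with the lowest cost) can be sketched with a toy one-parameter problem. Everything here is an assumption made for illustration: the helper names are hypothetical, the model is a single exponential decay with known initial value, and a simple step-halving search stands in for a real least squares solver.

```python
import math

def sse(gamma, t, x, l0):
    """Sum of squared errors between l0 * exp(-gamma * t) and the data."""
    return sum((l0 * math.exp(-gamma * ti) - xi) ** 2 for ti, xi in zip(t, x))

def local_search(cost, p, lo, hi, step=0.5, tol=1e-9):
    """Bounded 1-D search: move while the cost improves, else halve the step."""
    best = cost(p)
    while step > tol:
        moved = False
        for cand in (max(lo, p - step), min(hi, p + step)):
            c = cost(cand)
            if c < best:
                p, best, moved = cand, c, True
                break
        if not moved:
            step /= 2.0
    return p, best

def multi_start_fit(cost, starts, lo, hi):
    """Run the local search from every start and keep the lowest cost."""
    results = [local_search(cost, p0, lo, hi) for p0 in starts]
    return min(results, key=lambda r: r[1])
```

Restarting from several guesses reduces the chance of reporting a poor local optimum, at the price of one local fit per guess.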

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

export_model(reinstantiate=True)

Export the simulator model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model class (default: True).
Return type:: LinearODE
Returns:: Exported simulator model.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic and the degrees of freedom.
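
The statistic itself is straightforward to compute by hand. The sketch below evaluates sum_i (O_i - E_i)^2 / E_i for flat lists of observed and predicted values; the function name is hypothetical, and the p-value step is left out on purpose.

```python
def chi2_statistic(observed, expected):
    """Pearson chi-square statistic; every expected value must be non-zero."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

For observed values [10, 20, 30] and predictions [12, 18, 30], the statistic is 4/12 + 4/18 + 0. A one-tailed p-value can then be obtained from the chi-square survival function, for instance scipy.stats.chi2.sf(statistic, df), with the degrees of freedom chosen according to the model.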

Deterministic models via analytical solution of ODEs

class dynamo.est.tsc.Estimation_DeterministicDeg(beta=None, gamma=None, x0=None)

An estimation class for degradation (with splicing) experiments. Order of species: <unspliced>, <spliced>

Initialize the Estimation_DeterministicDeg object.

Parameters:

beta (Optional[ndarray]) – The splicing rate.
gamma (Optional[ndarray]) – The degradation rate.
x0 (Optional[ndarray]) – The initial conditions.

Returns:

An instance of the Estimation_DeterministicDeg class.
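
For reference, a degradation experiment with splicing is commonly modeled by du/dt = -beta * u and ds/dt = beta * u - gamma * s, which has a closed-form solution. The sketch below evaluates that solution in the species order given above (unspliced, then spliced). The ODE system is an assumption about the model behind this class, the function name is hypothetical, and the formula requires beta != gamma.

```python
import math

def deg_solution(t, beta, gamma, u0, s0):
    """Analytical (u, s) at time t for du/dt = -beta*u, ds/dt = beta*u - gamma*s.

    Valid only when beta != gamma.
    """
    u = u0 * math.exp(-beta * t)
    s = s0 * math.exp(-gamma * t) + beta * u0 / (gamma - beta) * (
        math.exp(-beta * t) - math.exp(-gamma * t)
    )
    return u, s
```

An analytical solution avoids numerical integration error and is cheap to evaluate repeatedly inside a least squares loop.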

auto_fit(time, x_data, **kwargs)

Estimate the parameters.

Parameters:

time (ndarray) – The time information.
x_data (ndarray) – A matrix representing RNA data.
kwargs – The additional keyword arguments.

Return type:

Returns:

The optimized parameters and the cost.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

calc_half_life(key)

Calculate half-life of a parameter.

Parameters:: key (str) – The key of parameter.
Return type:: ndarray
Returns:: The half-life value.
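
Under first-order decay, a rate constant k corresponds to a half-life of ln(2) / k. The sketch below applies that standard relation; whether calc_half_life uses exactly this expression is an assumption, and the function name is hypothetical.

```python
import math

def half_life(rate):
    """Half-life of a first-order process with the given rate constant (> 0)."""
    return math.log(2.0) / rate
```

For example, a degradation rate of 0.1 per hour gives a half-life of roughly 6.9 hours.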

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model name, kinetic parameters, and initial conditions.

export_model(reinstantiate=True)

Export the simulator model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model class (default: True).
Return type:: LinearODE
Returns:: Exported simulator model.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

extract_data_from_simulator(t=None, **kwargs)

Extract data from the ODE simulator.

Parameters:

t (Optional[ndarray]) – The time information. If provided, the data will be integrated with time information.
kwargs – Additional keyword arguments.

Return type:

Returns:

The variable from ODE simulator.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method iteratively optimizes the parameters for different initial conditions (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

get_param(key)

Get the estimated parameter value by key.

Parameters:: key (str) – The key of parameter.
Return type:: ndarray
Returns:: The estimated parameter value.

guestimate_gamma(x_data, time)

Roughly estimate gamma, the degradation rate, as a starting point for parameter estimation.

Parameters:

x_data (ndarray) – A matrix representing RNA data.
time (ndarray) – A matrix of time information.

Return type:

Returns:

Estimated gamma.

guestimate_init_cond(x_data)

Roughly estimate initial conditions for parameter estimation.

Parameters:: x_data (ndarray) – A matrix representing RNA data.
Return type:: ndarray
Returns:: Estimated initial conditions.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.

sample_p0(samples=1, method='lhs')

Sample the initial parameters with either Latin Hypercube Sampling or random method.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic and the degrees of freedom.

class dynamo.est.tsc.Estimation_DeterministicDegNosp(gamma=None, x0=None)

An estimation class for degradation (without splicing) experiments.

Initialize the Estimation_DeterministicDegNosp object.

Parameters:

gamma (Optional[ndarray]) – The degradation rate.
x0 (Optional[ndarray]) – The initial conditions.

Returns:

An instance of the Estimation_DeterministicDegNosp class.

auto_fit(time, x_data, sample_method='lhs', method=None, normalize=False)

Estimate the parameters.

Parameters:

time (ndarray) – The time information.
x_data (ndarray) – A matrix representing RNA data.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize the data.

Return type:

Returns:

The optimized parameters and the cost.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

calc_half_life(key)

Calculate half-life of a parameter.

Parameters:: key (str) – The key of parameter.
Return type:: ndarray
Returns:: The half-life value.

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model name, kinetic parameters, and initial conditions.

export_model(reinstantiate=True)

Export the simulator model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model class (default: True).
Return type:: LinearODE
Returns:: Exported simulator model.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

extract_data_from_simulator(t=None, **kwargs)

Extract data from the ODE simulator.

Parameters:

t (Optional[ndarray]) – The time information. If provided, the data will be integrated with time information.
kwargs – Additional keyword arguments.

Return type:

Returns:

The variable from ODE simulator.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method iteratively optimizes the parameters for different initial conditions (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.
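
The multi-start idea described above (optimize from several initial guesses and keep the best result) can be sketched as follows. This is a minimal illustration, not dynamo's implementation: it uses scipy.optimize.least_squares as a stand-in optimizer, draws initial guesses uniformly within the bounds, and fits a hypothetical one-species model x(t) = alpha / gamma * (1 - exp(-gamma * t)). The names model, residuals, and fit_multistart are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import least_squares


def model(params, t):
    # Hypothetical one-species kinetics model, used only for illustration.
    alpha, gamma = params
    return alpha / gamma * (1.0 - np.exp(-gamma * t))


def residuals(params, t, x):
    # Difference between simulated and observed data.
    return model(params, t) - x


def fit_multistart(t, x, bounds, n_p0=5, seed=0):
    """Run a bounded least squares fit from n_p0 random starts; keep the best."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(bounds[0], dtype=float)
    upper = np.asarray(bounds[1], dtype=float)
    best_params, best_cost = None, np.inf
    for _ in range(n_p0):
        p0 = rng.uniform(lower, upper)  # uniform sampling of an initial guess
        res = least_squares(residuals, p0, bounds=(lower, upper), args=(t, x))
        if res.cost < best_cost:
            best_params, best_cost = res.x, res.cost
    return best_params, best_cost


t = np.linspace(0.0, 10.0, 20)
x = model(np.array([2.0, 0.5]), t)  # noise-free synthetic data
params, cost = fit_multistart(t, x, bounds=([0.01, 0.01], [10.0, 10.0]))
print(params, cost)
```

Note that the cost reported by scipy.optimize.least_squares is half the sum of squared residuals; whether dynamo reports the same quantity is not stated here.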

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

get_param(key)

Get the estimated parameter value by key.

Parameters:: key (str) – The key of parameter.
Return type:: ndarray
Returns:: The estimated parameter value.

guestimate_gamma(x_data, time)

Roughly estimate gamma (the degradation rate) for parameter estimation.

Parameters:

x_data (ndarray) – A matrix representing RNA data.
time (ndarray) – A matrix of time information.

Return type:

Returns:

Estimated gamma.

guestimate_init_cond(x_data)

Roughly estimate initial conditions for parameter estimation.

Parameters:: x_data (ndarray) – A matrix representing RNA data.
Return type:: ndarray
Returns:: Estimated initial conditions.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.
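
For reference, log1p normalization maps each value x to log(1 + x), which compresses large values while keeping zeros at zero. The short sketch below uses NumPy directly; the function name log1p_normalize is hypothetical and only mirrors the behavior described above.

```python
import numpy as np


def log1p_normalize(X):
    """Apply log(1 + x) elementwise (illustrative stand-in)."""
    return np.log1p(np.asarray(X, dtype=float))


X = np.array([[0.0, 9.0, 99.0],
              [1.0, 1.0, 1.0]])
X_norm = log1p_normalize(X)

# np.expm1 is the inverse transform, so the original values can be recovered.
X_back = np.expm1(X_norm)
print(X_norm)
```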

sample_p0(samples=1, method='lhs')

Sample the initial parameters using either Latin hypercube sampling or random sampling.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.
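
To make the difference between the two sampling methods concrete, here is a minimal Latin hypercube sampler written with NumPy only. It is a sketch under stated assumptions, not the sampler dynamo uses: the function name lhs_sample and its lower/upper arguments are hypothetical. Each dimension is split into as many equal strata as there are samples, and exactly one point is drawn per stratum.

```python
import numpy as np


def lhs_sample(lower, upper, samples, seed=None):
    """Latin hypercube sample of shape (samples, n_dim) within [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n_dim = lower.size
    # One random point inside each of the `samples` strata, per dimension.
    u = (np.arange(samples)[:, None] + rng.random((samples, n_dim))) / samples
    # Shuffle strata independently in each dimension to decouple them.
    for j in range(n_dim):
        u[:, j] = rng.permutation(u[:, j])
    return lower + u * (upper - lower)


p0 = lhs_sample([0.0, 1.0], [1.0, 3.0], samples=4, seed=0)
print(p0)
```

Compared with plain uniform sampling, this spreads the initial guesses across the whole range of every parameter, which is useful when only a few starts are affordable.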

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform a Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic, and the degrees of freedom.
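
The statistic above can be computed in a few lines. The sketch below is illustrative only: the function name pearson_chi2 is hypothetical, and the degrees of freedom are taken as (number of data points) - (number of fitted parameters) - 1, which is an assumption rather than a statement about how dynamo counts them. The p-value comes from the upper tail of the chi-square distribution via scipy.stats.chi2.sf.

```python
import numpy as np
from scipy.stats import chi2


def pearson_chi2(observed, expected, n_params):
    """Return (p_value, statistic, degrees_of_freedom) for a one-tailed test."""
    observed = np.ravel(observed).astype(float)
    expected = np.ravel(expected).astype(float)
    # sum_i (O_i - E_i)^2 / E_i
    stat = float(np.sum((observed - expected) ** 2 / expected))
    # Assumed convention for the degrees of freedom (see lead-in).
    df = observed.size - n_params - 1
    p_value = float(chi2.sf(stat, df))
    return p_value, stat, df


obs = np.array([12.0, 8.0, 10.0, 10.0])
exp = np.array([10.0, 10.0, 10.0, 10.0])
p_value, stat, df = pearson_chi2(obs, exp, n_params=1)
print(p_value, stat, df)
```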

class dynamo.est.tsc.Estimation_DeterministicKinNosp(alpha, gamma, x0=0)

An estimation class for kinetics (without splicing) experiments with the deterministic model. Order of species: <unspliced>, <spliced>

Initialize the Estimation_DeterministicKinNosp object.

Parameters:

alpha (ndarray) – Transcription rate.
gamma (ndarray) – Degradation rate.
x0 (Union[int, ndarray]) – The initial condition.

Returns:

An instance of the Estimation_DeterministicKinNosp class.
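
For orientation, a deterministic kinetics model without splicing is commonly written as dx/dt = alpha - gamma * x, with closed-form solution x(t) = alpha / gamma * (1 - exp(-gamma * t)) + x0 * exp(-gamma * t). The sketch below evaluates that solution for a single species; the function name kin_nosp_solution is hypothetical, and treating the class as following exactly this equation is an assumption.

```python
import numpy as np


def kin_nosp_solution(t, alpha, gamma, x0=0.0):
    """Closed-form solution of dx/dt = alpha - gamma * x with x(0) = x0."""
    t = np.asarray(t, dtype=float)
    decay = np.exp(-gamma * t)
    return alpha / gamma * (1.0 - decay) + x0 * decay


t = np.linspace(0.0, 10.0, 6)
x = kin_nosp_solution(t, alpha=2.0, gamma=0.5)
# The curve rises from x0 toward the steady state alpha / gamma.
print(x)
```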

get_alpha()

Get the transcription rate.

Return type:: ndarray
Returns:: The transcription rate.

get_gamma()

Get the degradation rate.

Return type:: ndarray
Returns:: The degradation rate.

calc_half_life(key)

Calculate the half life.

Return type:: ndarray

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model name, kinetic parameters, and initial conditions.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

export_model(reinstantiate=True)

Export the simulator model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model class (default: True).
Return type:: LinearODE
Returns:: Exported simulator model.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

extract_data_from_simulator(t=None, **kwargs)

Extract data from the ODE simulator.

Parameters:

t (Optional[ndarray]) – The time information. If provided, the data will be integrated with time information.
kwargs – Additional keyword arguments.

Return type:

Returns:

The variable from ODE simulator.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method iteratively optimizes the parameters for different initial conditions (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.

sample_p0(samples=1, method='lhs')

Sample the initial parameters using either Latin hypercube sampling or random sampling.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform a Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic, and the degrees of freedom.

class dynamo.est.tsc.Estimation_DeterministicKin(alpha, beta, gamma, x0=array([0., 0.]))

An estimation class for kinetics experiments with the deterministic model. Order of species: <unspliced>, <spliced>

Initialize the Estimation_DeterministicKin object.

Parameters:

alpha (ndarray) – Transcription rate.
beta (ndarray) – Splicing rate.
gamma (ndarray) – Degradation rate.
x0 (Union[int, ndarray]) – The initial condition.

Returns:

An instance of the Estimation_DeterministicKin class.
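
With the species ordered as unspliced then spliced, the usual deterministic model is du/dt = alpha - beta * u and ds/dt = beta * u - gamma * s, which can be written in linear form dx/dt = K x + p. The sketch below builds K and p and solves for the steady state. It is a minimal illustration under that assumed model; the names kin_matrix and steady_state are hypothetical and do not refer to dynamo functions.

```python
import numpy as np


def kin_matrix(alpha, beta, gamma):
    """Return (K, p) such that dx/dt = K @ x + p for x = [unspliced, spliced]."""
    K = np.array([[-beta, 0.0],
                  [beta, -gamma]])
    p = np.array([alpha, 0.0])
    return K, p


def steady_state(K, p):
    """Solve K @ x + p = 0 for x."""
    return np.linalg.solve(K, -p)


K, p = kin_matrix(alpha=2.0, beta=1.0, gamma=0.5)
x_ss = steady_state(K, p)
# Under this model the steady state is [alpha / beta, alpha / gamma].
print(x_ss)
```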

get_alpha()

Get the transcription rate.

Return type:: ndarray
Returns:: The transcription rate.

get_beta()

Get the splicing rate.

Return type:: ndarray
Returns:: The splicing rate.

get_gamma()

Get the degradation rate.

Return type:: ndarray
Returns:: The degradation rate.

calc_spl_half_life()

Calculate the half life of splicing.

Return type:: ndarray
Returns:: The half life of splicing.

calc_deg_half_life()

Calculate the half life of degradation.

Return type:: ndarray
Returns:: The half life of degradation.

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model name, kinetic parameters, and initial conditions.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

export_model(reinstantiate=True)

Export the simulator model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model class (default: True).
Return type:: LinearODE
Returns:: Exported simulator model.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

extract_data_from_simulator(t=None, **kwargs)

Extract data from the ODE simulator.

Parameters:

t (Optional[ndarray]) – The time information. If provided, the data will be integrated with time information.
kwargs – Additional keyword arguments.

Return type:

Returns:

The variable from ODE simulator.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method iteratively optimizes the parameters for different initial conditions (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.

sample_p0(samples=1, method='lhs')

Sample the initial parameters using either Latin hypercube sampling or random sampling.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform a Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic, and the degrees of freedom.

Stochastic models via matrix form of moment equations

class dynamo.est.tsc.Estimation_MomentDeg(beta=None, gamma=None, x0=None, include_cov=True)

An estimation class for degradation (with splicing) experiments. Order of species: <unspliced>, <spliced>, <uu>, <ss>, <us>. Order of parameters: beta, gamma.

Initialize the Estimation_MomentDeg object.

Parameters:

beta (Optional[ndarray]) – The splicing rate.
gamma (Optional[ndarray]) – The degradation rate.
x0 (Optional[ndarray]) – The initial conditions.
include_cov (bool) – Whether to consider covariance when estimating.

Returns:

An instance of the Estimation_MomentDeg class.

extract_data_from_simulator()

Get corresponding data from the LinearODE class.

Return type:: ndarray

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

auto_fit(time, x_data, **kwargs)

Estimate the parameters.

Parameters:

time (ndarray) – The time information.
x_data (ndarray) – A matrix representing RNA data.
kwargs – The additional keyword arguments.

Return type:

Returns:

The optimized parameters and the cost.

calc_half_life(key)

Calculate half-life of a parameter.

Parameters:: key (str) – The key of parameter.
Return type:: ndarray
Returns:: The half-life value.

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model name, kinetic parameters, and initial conditions.

export_model(reinstantiate=True)

Export the simulator model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model class (default: True).
Return type:: LinearODE
Returns:: Exported simulator model.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method iteratively optimizes the parameters for different initial conditions (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

get_param(key)

Get the estimated parameter value by key.

Parameters:: key (str) – The key of parameter.
Return type:: ndarray
Returns:: The estimated parameter value.

guestimate_gamma(x_data, time)

Roughly estimate gamma (the degradation rate) for parameter estimation.

Parameters:

x_data (ndarray) – A matrix representing RNA data.
time (ndarray) – A matrix of time information.

Return type:

Returns:

Estimated gamma.
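
One simple way to obtain a rough value of gamma from degradation data is a log-linear fit: if x(t) decays as x0 * exp(-gamma * t), then log x(t) is linear in t with slope -gamma. The sketch below illustrates that idea with numpy.polyfit. It is not dynamo's estimator; the function name rough_gamma and the small offset eps (added to avoid log(0)) are assumptions made for this example.

```python
import numpy as np


def rough_gamma(x, t, eps=1e-3):
    """Estimate a decay rate from the slope of log(x) against t."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    # polyfit with degree 1 returns [slope, intercept].
    slope, _ = np.polyfit(t, np.log(x + eps), 1)
    # A decay rate cannot be negative, so clip at zero.
    return max(-slope, 0.0)


t = np.linspace(0.0, 5.0, 6)
x = 10.0 * np.exp(-0.3 * t)  # synthetic decay curve with gamma = 0.3
gamma0 = rough_gamma(x, t)
print(gamma0)
```

A value obtained this way is only meant as a starting point for the optimizer, not as a final estimate.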

guestimate_init_cond(x_data)

Roughly estimate initial conditions for parameter estimation.

Parameters:: x_data (ndarray) – A matrix representing RNA data.
Return type:: ndarray
Returns:: Estimated initial conditions.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.

sample_p0(samples=1, method='lhs')

Sample the initial parameters using either Latin hypercube sampling or random sampling.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform a Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic, and the degrees of freedom.

class dynamo.est.tsc.Estimation_MomentDegNosp(gamma=None, x0=None)

An estimation class for degradation (without splicing) experiments. Order of species: <r>, <rr>.

Initialize the Estimation_MomentDegNosp object.

Parameters:

gamma (Optional[ndarray]) – The degradation rate.
x0 (Optional[ndarray]) – The initial conditions.

Returns:

An instance of the Estimation_MomentDegNosp class.

auto_fit(time, x_data, sample_method='lhs', method=None, normalize=False)

Estimate the parameters.

Parameters:

time (ndarray) – The time information.
x_data (ndarray) – A matrix representing RNA data.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize the data.

Return type:

Returns:

The optimized parameters and the cost.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

calc_half_life(key)

Calculate half-life of a parameter.

Parameters:: key (str) – The key of parameter.
Return type:: ndarray
Returns:: The half-life value.

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model name, kinetic parameters, and initial conditions.

export_model(reinstantiate=True)

Export the simulator model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model class (default: True).
Return type:: LinearODE
Returns:: Exported simulator model.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

extract_data_from_simulator(t=None, **kwargs)

Extract data from the ODE simulator.

Parameters:

t (Optional[ndarray]) – The time information. If provided, the data will be integrated with time information.
kwargs – Additional keyword arguments.

Return type:

Returns:

The variable from ODE simulator.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method iteratively optimizes the parameters for different initial conditions (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

get_param(key)

Get the estimated parameter value by key.

Parameters:: key (str) – The key of parameter.
Return type:: ndarray
Returns:: The estimated parameter value.

guestimate_gamma(x_data, time)

Roughly estimate gamma (the degradation rate) for parameter estimation.

Parameters:

x_data (ndarray) – A matrix representing RNA data.
time (ndarray) – A matrix of time information.

Return type:

Returns:

Estimated gamma.

guestimate_init_cond(x_data)

Roughly estimate initial conditions for parameter estimation.

Parameters:: x_data (ndarray) – A matrix representing RNA data.
Return type:: ndarray
Returns:: Estimated initial conditions.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.

sample_p0(samples=1, method='lhs')

Sample the initial parameters using either Latin hypercube sampling or random sampling.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform a Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic, and the degrees of freedom.

class dynamo.est.tsc.Estimation_MomentKin(a, b, alpha_a, alpha_i, beta, gamma, include_cov=True)

An estimation class for kinetics experiments. Order of species: <unspliced>, <spliced>, <uu>, <ss>, <us>

Initialize the Estimation_MomentKin object.

Parameters:

a (ndarray) – Switching rate from active promoter state to inactive promoter state.
b (ndarray) – Switching rate from inactive promoter state to active promoter state.
alpha_a (ndarray) – Transcription rate for active promoter.
alpha_i (ndarray) – Transcription rate for inactive promoter.
beta (ndarray) – Splicing rate.
gamma (ndarray) – Degradation rate.
include_cov (bool) – Whether to include the covariance when estimating.

Returns:

An instance of the Estimation_MomentKin class.
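
The switching rates a and b describe a two-state promoter. For such a two-state process, the long-run fraction of time spent in the active state is b / (a + b), so the time-averaged transcription rate is a weighted mix of alpha_a and alpha_i. The sketch below computes both quantities; the function names are hypothetical, and whether get_alpha() returns this average or the individual rates is not something this example asserts.

```python
def active_fraction(a: float, b: float) -> float:
    """Steady-state probability that the promoter is active.

    a: switching rate from active to inactive.
    b: switching rate from inactive to active.
    """
    return b / (a + b)


def mean_transcription_rate(a, b, alpha_a, alpha_i):
    """Time-averaged transcription rate of a two-state promoter."""
    f = active_fraction(a, b)
    return f * alpha_a + (1.0 - f) * alpha_i


# The promoter is active 75% of the time when b is three times a.
rate = mean_transcription_rate(a=1.0, b=3.0, alpha_a=8.0, alpha_i=0.0)
print(rate)
```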

extract_data_from_simulator()

Get corresponding data from the LinearODE class.

Return type:: ndarray
Returns:: The variable from ODE simulator as an array.

get_alpha_a()

Get the transcription rate for active promoter.

Return type:: ndarray
Returns:: The transcription rate for active promoter.

get_alpha_i()

Get the transcription rate for inactive promoter.

Return type:: ndarray
Returns:: The transcription rate for inactive promoter.

get_alpha()

Get all transcription rates.

Return type:: ndarray
Returns:: All transcription rates.

get_beta()

Get the splicing rate.

Return type:: ndarray
Returns:: The splicing rate.

get_gamma()

Get the degradation rate.

Return type:: ndarray
Returns:: The degradation rate.

calc_spl_half_life()

Calculate the half life of splicing.

Return type:: ndarray
Returns:: The half life of splicing.

calc_deg_half_life()

Calculate the half life of degradation.

Return type:: ndarray
Returns:: The half life of degradation.

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model name, kinetic parameters, and initial conditions.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

export_model(reinstantiate=True)

Export the simulator model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model class (default: True).
Return type:: LinearODE
Returns:: Exported simulator model.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).
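As a rough illustration of what such a residual function computes, the sketch below compares simulated and observed values after an optional log1p transform and flattens the result. It assumes the normalization is the log1p transform described under normalize_data; the function name `residuals` and the toy arrays are hypothetical.

```python
import numpy as np

def residuals(x_sim, x_data, normalize=True):
    """Flattened difference between simulated and observed species values."""
    x_sim = np.asarray(x_sim, dtype=float)
    x_data = np.asarray(x_data, dtype=float)
    if normalize:
        # log1p compresses large values so abundant species do not dominate
        x_sim, x_data = np.log1p(x_sim), np.log1p(x_data)
    return (x_sim - x_data).flatten()

# 2 species x 3 time points (toy data)
x_data = np.array([[0.0, 1.0, 3.0], [10.0, 100.0, 1000.0]])
x_sim = x_data.copy()
x_sim[1, 2] = 1100.0   # a single mismatching simulated value

res = residuals(x_sim, x_data)
```

A least squares optimizer minimizes the sum of squares of this flattened vector.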

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method runs the optimization from each initial guess (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.

sample_p0(samples=1, method='lhs')

Sample the initial parameters with either Latin hypercube sampling or random sampling.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.
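The difference between the two sampling methods can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not dynamo's implementation: the standalone function takes an explicit `bounds` array of (lower, upper) pairs and a `seed`, both of which are hypothetical additions.

```python
import numpy as np

def sample_p0(bounds, samples=1, method="lhs", seed=0):
    """Draw initial guesses inside per-parameter (lower, upper) bounds."""
    bounds = np.asarray(bounds, dtype=float)      # shape (n_params, 2)
    lower, upper = bounds[:, 0], bounds[:, 1]
    n_params = len(bounds)
    rng = np.random.default_rng(seed)
    if method == "lhs":
        # Latin hypercube: split each dimension into `samples` equal strata,
        # place one point in every stratum, and shuffle strata per dimension
        u = np.empty((samples, n_params))
        for j in range(n_params):
            strata = rng.permutation(samples)
            u[:, j] = (strata + rng.random(samples)) / samples
    else:
        # plain uniform random sampling in the unit hypercube
        u = rng.random((samples, n_params))
    return lower + u * (upper - lower)

bounds = [(0.0, 1.0), (10.0, 20.0)]
p0 = sample_p0(bounds, samples=4)   # 4 initial guesses for 2 parameters
```

With Latin hypercube sampling every quarter of each parameter range receives exactly one of the four guesses, which spreads starting points more evenly than independent uniform draws.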

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic, and the degrees of freedom.
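The statistic itself is straightforward to reproduce. The sketch below computes sum_i (O_i - E_i)^2 / E_i over all species and time points; the degrees-of-freedom rule (number of data points minus number of fitted parameters minus one) is an assumption made for illustration, and `pearson_chi2` and the toy arrays are hypothetical.

```python
import numpy as np

def pearson_chi2(observed, expected, n_params):
    """Return the Pearson chi-square statistic and an assumed degrees-of-freedom count."""
    observed = np.asarray(observed, dtype=float).flatten()
    expected = np.asarray(expected, dtype=float).flatten()
    stat = np.sum((observed - expected) ** 2 / expected)
    # Assumption: one degree of freedom per data point,
    # minus the number of fitted parameters, minus one
    dof = observed.size - n_params - 1
    return stat, dof

# m = 2 species, k = 3 time points (toy stratified moments)
observed = np.array([[9.0, 21.0, 30.0], [4.0, 6.0, 12.0]])
expected = np.array([[10.0, 20.0, 30.0], [5.0, 5.0, 10.0]])  # hypothetical model prediction

stat, dof = pearson_chi2(observed, expected, n_params=2)
```

A one-tailed p-value is then the upper-tail probability of a chi-square distribution with `dof` degrees of freedom evaluated at `stat`, which can be obtained for example with `scipy.stats.chi2.sf(stat, dof)`.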

class dynamo.est.tsc.Estimation_MomentKinNosp(a, b, alpha_a, alpha_i, gamma)

An estimation class for kinetics experiments. Order of species: <r>, <rr>

Initialize the Estimation_MomentKinNosp object.

Parameters:

a (ndarray) – Switching rate from active promoter state to inactive promoter state.
b (ndarray) – Switching rate from inactive promoter state to active promoter state.
alpha_a (ndarray) – Transcription rate for active promoter.
alpha_i (ndarray) – Transcription rate for inactive promoter.
gamma (ndarray) – Degradation rate.

Returns:

An instance of the Estimation_MomentKinNosp class.

extract_data_from_simulator()

Get corresponding data from the LinearODE class.

Return type:: ndarray
Returns:: The variable from ODE simulator as an array.

get_alpha_a()

Get the transcription rate for active promoter.

Return type:: ndarray
Returns:: The transcription rate for active promoter.

get_alpha_i()

Get the transcription rate for inactive promoter.

Return type:: ndarray
Returns:: The transcription rate for inactive promoter.

get_alpha()

Get all transcription rates.

Return type:: ndarray
Returns:: All transcription rates.

get_gamma()

Get the degradation rate.

Return type:: ndarray
Returns:: The degradation rate.

calc_deg_half_life()

Calculate the half life of degradation.

Return type:: ndarray
Returns:: The half life of degradation.

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model name, kinetic parameters, and initial conditions.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

export_model(reinstantiate=True)

Export the simulator model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model class (default: True).
Return type:: LinearODE
Returns:: Exported simulator model.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method runs the optimization from each initial guess (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.

sample_p0(samples=1, method='lhs')

Sample the initial parameters with either Latin hypercube sampling or random sampling.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic, and the degrees of freedom.

Mixture models for kinetic / degradation experiments

class dynamo.est.tsc.Lambda_NoSwitching(model1, model2, alpha=None, lambd=None, gamma=None, x0=None, beta=None)

An estimation class with the mixture model. If beta is None, it is assumed that the data does not have the splicing process.

Initialize the Lambda_NoSwitching object.

Parameters:

model1 (LinearODE) – The first model to mix.
model2 (LinearODE) – The second model to mix.
alpha (Optional[ndarray]) – Transcription rate.
lambd (Optional[ndarray]) – The lambd value.
gamma (Optional[ndarray]) – Degradation rate.
x0 (Union[int, ndarray, None]) – The initial condition.
beta (Optional[ndarray]) – Splicing rate.

Returns:

An instance of the Lambda_NoSwitching class.

auto_fit(time, x_data, **kwargs)

Estimate the parameters.

Parameters:

time (ndarray) – The time information.
x_data (Union[csr_matrix, ndarray]) – A matrix representing RNA data.
kwargs – The additional keyword arguments.

Return type:

Returns:

The optimized parameters and the cost.

export_model(reinstantiate=True)

Export the mixture model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model.
Return type:: Union[LambdaModels_NoSwitching, LinearODE]
Returns:: LambdaModels_NoSwitching or LinearODE.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model names, kinetic parameters, and initial conditions.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

export_x0()

Export optimized initial conditions for the mixture of models analysis.

Return type:: Optional[ndarray]
Returns:: Exported initial conditions.

extract_data_from_simulator(t=None, **kwargs)

Extract data from the ODE simulator.

Parameters:

t (Optional[ndarray]) – The time information. If provided, the data will be integrated with time information.
kwargs – Additional keyword arguments.

Return type:

Returns:

The variable from ODE simulator.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method runs the optimization from each initial guess (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.

normalize_deg_data(x_data, weight)

Normalize the degradation data while preserving the relative proportions between species. It calculates scaling factors to ensure the data’s range remains within a certain limit.

Parameters:

x_data (Union[csr_matrix, ndarray]) – A matrix representing RNA data.
weight (Union[float, int]) – Weight for scaling.

Return type:

Returns:

A tuple containing the normalized degradation data and the scaling factor.

sample_p0(samples=1, method='lhs')

Sample the initial parameters with either Latin hypercube sampling or random sampling.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic, and the degrees of freedom.

class dynamo.est.tsc.Mixture_KinDeg_NoSwitching(model1, model2, alpha=None, gamma=None, x0=None, beta=None)

An estimation class with the mixture model. If beta is None, it is assumed that the data does not have the splicing process.

Initialize the Mixture_KinDeg_NoSwitching object.

Parameters:

model1 (LinearODE) – The first model to mix.
model2 (LinearODE) – The second model to mix.
alpha (Optional[ndarray]) – Transcription rate.
gamma (Optional[ndarray]) – Degradation rate.
x0 (Union[int, ndarray, None]) – The initial condition.
beta (Optional[ndarray]) – Splicing rate.

Returns:

An instance of the Mixture_KinDeg_NoSwitching class.

normalize_deg_data(x_data, weight)

Normalize the degradation data while preserving the relative proportions between species. It calculates scaling factors to ensure the data’s range remains within a certain limit.

Parameters:

x_data (Union[csr_matrix, ndarray]) – A matrix representing RNA data.
weight (Union[float, int]) – Weight for scaling.

Return type:

Returns:

A tuple containing the normalized degradation data and the scaling factor.

auto_fit(time, x_data, alpha_min=0.1, beta_min=50, gamma_min=10, kin_weight=2, use_p0=True, **kwargs)

Estimate the parameters.

Parameters:

time (ndarray) – The time information.
x_data (Union[csr_matrix, ndarray]) – A matrix representing RNA data.
alpha_min (Union[float, int]) – The minimum limitation on transcription rate.
beta_min (Union[float, int]) – The minimum limitation on splicing rate.
gamma_min (Union[float, int]) – The minimum limitation on degradation rate.
kin_weight (Union[float, int]) – Weight for scaling during normalization.
use_p0 (bool) – Whether to use initial parameters when estimating.
kwargs – The additional keyword arguments.

Return type:

Returns:

The optimized parameters and the cost.

export_model(reinstantiate=True)

Export the mixture model.

Parameters:: reinstantiate (bool) – Whether to reinstantiate the model.
Return type:: Union[MixtureModels, LinearODE]
Returns:: MixtureModels or LinearODE.

export_x0()

Export optimized initial conditions for the mixture of models analysis.

Return type:: Optional[ndarray]
Returns:: Exported initial conditions.

export_dictionary()

Export parameter estimation results as a dictionary.

Return type:: Dict
Returns:: Dictionary containing model names, kinetic parameters, and initial conditions.

assemble_kin_params(unfixed_params)

Assemble the kinetic parameters array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled kinetic parameters.

assemble_x0(unfixed_params)

Assemble the initial conditions array.

Parameters:: unfixed_params (ndarray) – Array of unfixed parameters.
Return type:: ndarray
Returns:: The assembled initial conditions.

export_parameters()

Export the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

extract_data_from_simulator(t=None, **kwargs)

Extract data from the ODE simulator.

Parameters:

t (Optional[ndarray]) – The time information. If provided, the data will be integrated with time information.
kwargs – Additional keyword arguments.

Return type:

Returns:

The variable from ODE simulator.

f_lsq(params, t, x_data, method=None, normalize=True)

Calculate the difference between simulated and observed data for least squares fitting.

Parameters:

params (ndarray) – Array of parameters for the simulation.
t (ndarray) – Array of time values.
x_data (ndarray) – The input array.
method (Optional[str]) – Method for integration.
normalize (bool) – Whether to normalize data.

Return type:

Returns:

Residuals representing the differences between simulated and observed data (flattened).

fit_lsq(t, x_data, p0=None, n_p0=1, bounds=None, sample_method='lhs', method=None, normalize=True)

Fit time-series data using the least squares method.

This method runs the optimization from each initial guess (p0) and returns the optimized parameters and associated cost.

Parameters:

t (ndarray) – A numpy array of n time points.
x_data (ndarray) – An m-by-n numpy array of m species, each having n values for the n time points.
p0 (Optional[ndarray]) – Initial guesses of parameters. If None, a random number is generated within the bounds.
n_p0 (int) – Number of initial guesses.
bounds (Optional[Tuple[Union[float, int], Union[float, int]]]) – Lower and upper bounds for parameters.
sample_method (str) – Method used for sampling initial guesses of parameters: lhs: Latin hypercube sampling; uniform: Uniform random sampling.
method (Optional[str]) – Method used for solving ODEs. See options in simulator classes.
normalize (bool) – Whether to normalize values in x_data across species, so that large values do not dominate the optimizer.

Return type:

Returns:

Optimal parameters and the cost function evaluated at the optimum.

get_SSE()

Get the sum of squared errors (SSE) from the least squares fitting.

Return type:: float
Returns:: Sum of squared errors (SSE).

get_bound(axis)

Get the bounds of the specified axis for all parameters.

Parameters:: axis (int) – The index of axis.
Return type:: ndarray
Returns:: An array containing the bounds of the specified axis for all parameters.

get_opt_kin_params()

Get the optimized kinetic parameters.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized kinetic parameters, or None if not available.

get_opt_x0_params()

Get the optimized initial conditions.

Return type:: Optional[ndarray]
Returns:: Array containing the optimized initial conditions, or None if not available.

normalize_data(X)

Perform log1p normalization on the data.

Parameters:: X (ndarray) – Target data to normalize.
Return type:: ndarray
Returns:: The normalized data.

sample_p0(samples=1, method='lhs')

Sample the initial parameters with either Latin hypercube sampling or random sampling.

Parameters:

samples (int) – The number of samples.
method (str) – The sampling method. Only “lhs” and random sampling are supported.

Return type:

Returns:

The sampled array.

set_params(params)

Set the parameters of the simulator using assembled kinetic parameters.

Parameters:: params (ndarray) – Array of assembled kinetic parameters.
Return type:: None

test_chi2(t, x_data, species=None, method='matrix', normalize=True)

Perform Pearson’s chi-square test. The statistic is computed as: sum_i (O_i - E_i)^2 / E_i, where O_i is the data and E_i is the model prediction.

The data can be either:

stratified moments: ‘t’ is an array of k distinct time points, ‘x_data’ is an m-by-k matrix of data, where m is the number of species.

Or

raw data: ‘t’ is an array of k time points for k cells, ‘x_data’ is an m-by-k matrix of data, where m is the number of species. Note that if the method is ‘numerical’, t has to be monotonically increasing.

If not all species are included in the data, use ‘species’ to specify the species of interest.

Return type:: Tuple[float, float, int]
Returns:: The p-value of a one-tailed chi-square test, the chi-square statistic, and the degrees of freedom.

Vector field

Vector field class

class dynamo.vf.SvcVectorField(X=None, V=None, Grid=None, normalize=None, *args, **kwargs)[source]

Initialize the VectorField class.

Parameters:

X (Optional[ndarray]) – (dimension: n_obs x n_features), Original data.
V (Optional[ndarray]) – (dimension: n_obs x n_features), Velocities of cells in the same order and dimension of X.
Grid (Optional[ndarray]) – The function that returns diffusion matrix which can be dependent on the variables (for example, genes)
normalize (Optional[str]) – Logic flag to determine whether to normalize the data to have zero means and unit covariance. This is often required for raw dataset (for example, raw UMI counts and RNA velocity values in high dimension). But it is normally not required for low dimensional embeddings by PCA or other non-linear dimension reduction methods.
M – int (default: None) The number of basis functions used to approximate the vector field. By default it is calculated as min(len(X), int(1500 * np.log(len(X)) / (np.log(len(X)) + np.log(100)))), so that any dataset with fewer than about 900 data points (cells) will use the full data for vector field reconstruction, while any larger dataset will use at most 1500 data points.
a – float (default 5) Parameter of the model of outliers. We assume the outliers obey a uniform distribution, and the volume of the outliers’ variation space is a.
beta – float (default: None) Parameter of Gaussian Kernel, k(x, y) = exp(-beta*||x-y||^2). If None, a rule-of-thumb bandwidth will be computed automatically.
ecr – float (default: 1e-5) The lower limit on the energy change rate in the iteration process.
gamma – float (default: 0.9) Percentage of inliers in the samples. This is an initial value for the EM iteration, and it is not important.
lambda – float (default: 3) Represents the trade-off between the goodness of data fit and regularization.
minP – float (default: 1e-5) The posterior probability matrix P may be singular for matrix inversion. We set the minimum value of P as minP.
MaxIter – int (default: 500) Maximum number of iterations.
theta – float (default 0.75) Defines the inlier threshold. If the posterior probability that a sample is an inlier is larger than theta, the sample is regarded as an inlier.
div_cur_free_kernels – bool (default: False) A logic flag to determine whether the divergence-free or curl-free kernels will be used for learning the vector field.
sigma – int Bandwidth parameter.
eta – int Combination coefficient for the divergence-free or the curl-free kernels.
seed – int or 1-d array_like, optional (default: 0) Seed for RandomState. Must be convertible to 32 bit unsigned integers. Used in sampling control points. Defaults to 0 to ensure consistency between different runs.
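Two of the formulas in this parameter list can be evaluated directly: the default number of basis functions M and the Gaussian kernel. The sketch below uses only the expressions quoted above; the function names `default_n_basis` and `gaussian_kernel` are illustrative and not part of dynamo.

```python
import math

def default_n_basis(n_obs):
    """Default number of basis functions M for n_obs data points (n_obs > 1)."""
    log_n = math.log(n_obs)
    return min(n_obs, int(1500 * log_n / (log_n + math.log(100))))

def gaussian_kernel(x, y, beta):
    """k(x, y) = exp(-beta * ||x - y||^2) for two equal-length coordinate sequences."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-beta * sq_dist)

# Small datasets keep every point; large ones are capped below 1500
for n in (100, 890, 5000, 10**6):
    print(n, default_n_basis(n))
```

The ratio log(n) / (log(n) + log(100)) approaches 1 only as n grows without bound, which is why the default M stays below 1500 for any finite dataset.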

train(**kwargs)[source]

Learn a vector field function from sparse single cell samples in the entire space robustly. Reference: Regularized vector field learning with sparse approximation for mismatch removal, Ma, Jiayi, et al., Pattern Recognition.

Parameters:: normalize – Logic flag to determine whether to normalize the data to have zero means and unit covariance. This is often required for raw dataset (for example, raw UMI counts and RNA velocity values in high dimension). But it is normally not required for low dimensional embeddings by PCA or other non-linear dimension reduction methods.
Return type:: VecFldDict
Returns:: A dictionary which contains X, Y, beta, V, C, P, VFCIndex, where V = f(X), P is the posterior probability, and VFCIndex is the indexes of inliers found by VFC.

get_Jacobian(method='analytical', input_vector_convention='row', **kwargs)[source]

Get the Jacobian of the vector field function. If method is ‘analytical’: The analytical Jacobian will be returned and it always takes row vectors as input no matter what input_vector_convention is.

If method is ‘numerical’: If the input_vector_convention is ‘row’, it means that fjac takes row vectors as input, otherwise the input should be an array of column vectors. Note that the returned Jacobian would behave exactly the same if the input is a 1d array.

The column vector convention is slightly faster than the row vector convention. So the matrix of row vector convention is converted into column vector convention under the hood.

Return type:: ndarray

No matter the method and input vector convention, the returned Jacobian is of the following format:

df_1/dx_1   df_1/dx_2   df_1/dx_3   …
df_2/dx_1   df_2/dx_2   df_2/dx_3   …
df_3/dx_1   df_3/dx_2   df_3/dx_3   …
…           …           …           …
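The layout described here, with output dimensions along rows and input dimensions along columns, can be checked with a small finite-difference sketch. This is a generic central-difference approximation written for illustration, not dynamo's numerical Jacobian; `numerical_jacobian` and the toy function `f` are hypothetical.

```python
import numpy as np

def numerical_jacobian(f, x, h=1e-6):
    """Central-difference Jacobian with J[i, j] = d f_i / d x_j."""
    x = np.asarray(x, dtype=float)
    n_out = np.asarray(f(x)).size
    J = np.empty((n_out, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        # perturb only x_j; the result fills column j for every output f_i
        J[:, j] = (np.asarray(f(x + step)) - np.asarray(f(x - step))) / (2 * h)
    return J

def f(x):
    # toy two-dimensional vector field, for illustration only
    return np.array([x[0] * x[1], x[0] + x[1] ** 2])

J = numerical_jacobian(f, [2.0, 3.0])
```

For this toy field the analytical Jacobian at (2, 3) has first row (x_2, x_1) = (3, 2) and second row (1, 2 x_2) = (1, 6), which the approximation reproduces.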

get_Hessian(method='analytical', **kwargs)[source]

Get the Hessian of the vector field function. If method is ‘analytical’: The analytical Hessian will be returned and it always takes row vectors as input no matter what input_vector_convention is.

Return type:: ndarray

No matter the method and input vector convention, the returned Hessian is of the following format:

d^2f/dx_1^2        d^2f/(dx_1 dx_2)   d^2f/(dx_1 dx_3)   …
d^2f/(dx_2 dx_1)   d^2f/dx_2^2        d^2f/(dx_2 dx_3)   …
d^2f/(dx_3 dx_1)   d^2f/(dx_3 dx_2)   d^2f/dx_3^2        …
…                  …                  …                  …

get_Laplacian(method='analytical', **kwargs)[source]

Get the Laplacian of the vector field. The Laplacian is defined as the sum of the diagonal of the Hessian matrix. Because the Hessian is originally defined for scalar functions and is extended here to vector functions, the sum of the diagonal is calculated for each output (target) dimension.

A Laplacian filter is an edge detector used to compute the second derivatives of an image, measuring the rate at which the first derivatives change (so it is the derivative of the Jacobian). This determines if a change in adjacent pixel values is from an edge or continuous progression.

Return type:: ndarray
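The per-output sum of the Hessian diagonal can be approximated with central second differences. The sketch below is a generic numerical illustration under that definition, not dynamo's analytical implementation; `numerical_laplacian` and the toy function `f` are hypothetical.

```python
import numpy as np

def numerical_laplacian(f, x, h=1e-3):
    """Sum of unmixed second derivatives of each output of f at x."""
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(f(x), dtype=float)
    lap = np.zeros_like(f0)
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        # central second difference approximates d^2 f / d x_j^2 for every output
        lap += (f(x + step) - 2 * f0 + f(x - step)) / h**2
    return lap

def f(x):
    # toy vector field: first output is curved, second has zero Laplacian
    return np.array([x[0] ** 2 + x[1] ** 2, x[0] * x[1]])

lap = numerical_laplacian(f, [1.0, 2.0])
```

The result has one entry per output dimension: here 2 + 2 = 4 for the first output and 0 for the second, because x_1 x_2 has no unmixed second derivatives.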

evaluate(CorrectIndex, VFCIndex, siz)[source]

Evaluate the precision, recall, corrRate of the sparseVFC algorithm.

Parameters:

CorrectIndex (List) – Ground truth indexes of the correct vector field samples.
VFCIndex (List) – Indexes of the correct vector field samples learned by VFC.
siz (int) – Number of initial matches.

Return type:

Tuple[float, float, float]

Returns:

A tuple of precision, recall, and corrRate, where precision and recall are those of VFC and corrRate is the percentage of initial correct matches.
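The three quantities can be sketched with plain set arithmetic. The definitions below follow the usual meaning of precision and recall and take corrRate as the fraction of correct samples among all initial matches; this is an illustrative reading of the description above, and `evaluate_matches` is a hypothetical name.

```python
def evaluate_matches(correct_index, vfc_index, siz):
    """Return (precision, recall, corr_rate) for a set of predicted inliers."""
    correct = set(correct_index)      # ground-truth correct samples
    predicted = set(vfc_index)        # samples labelled as inliers
    n_hit = len(correct & predicted)  # correctly identified inliers
    precision = n_hit / len(predicted)
    recall = n_hit / len(correct)
    corr_rate = len(correct) / siz    # share of correct samples among all matches
    return precision, recall, corr_rate

precision, recall, corr_rate = evaluate_matches(
    correct_index=[0, 1, 2, 3, 4, 5],
    vfc_index=[0, 1, 2, 3, 8],
    siz=10,
)
```

In this toy case four of the five predicted inliers are correct (precision 0.8), four of the six true inliers are recovered (recall about 0.67), and six of the ten initial matches are correct (corrRate 0.6).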

Predictions

Least action path

class dynamo.pd.GeneTrajectory(adata, X=None, t=None, X_pca=None, PCs='PCs', mean='pca_mean', genes='use_for_pca', expr_func=None, **kwargs)[source]

Class for handling gene expression trajectory data.

Initializes a GeneTrajectory object.

Parameters:

adata (AnnData) – Anndata object containing the gene expression data.
X (Optional[ndarray]) – The gene expression data as a numpy array of shape (n, d). Defaults to None.
t (Optional[ndarray]) – The time data as a numpy array of shape (n,). Defaults to None.
X_pca (Optional[ndarray]) – The PCA-transformed gene expression data as a numpy array of shape (n, d). Defaults to None.
PCs (str) – The key in adata.uns to use for the PCA components. Defaults to “PCs”.
mean (str) – The key in adata.uns to use for the PCA mean. Defaults to “pca_mean”.
genes (str) – The key in adata.var to use for the genes. Defaults to “use_for_pca”.
expr_func (Optional[Callable]) – A function to transform the PCA-transformed gene expression data back to the original space. Defaults to None.
**kwargs – Additional keyword arguments to be passed to the superclass initializer.

from_pca(X_pca, t=None)[source]

Converts PCA-transformed gene expression data to gene expression data.

Parameters:

X_pca (ndarray) – The PCA-transformed gene expression data as a numpy array of shape (n, d).
t (Optional[ndarray]) – The time data as a numpy array of shape (n,). Defaults to None.

Return type:

to_pca(x=None)[source]

Converts gene expression data to PCA-transformed gene expression data.

Parameters:: x (Optional[ndarray]) – The gene expression data as a numpy array of shape (n, d). Defaults to None.
Return type:: ndarray
Returns:: The PCA-transformed gene expression data as a numpy array of shape (n, d).
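
The relationship between to_pca and from_pca can be pictured as a linear projection and its inverse. The sketch below assumes the stored PCs are a genes-by-components matrix with orthonormal columns and that the stored mean is the per-gene mean; both are assumptions for illustration, and the library may additionally apply expr_func. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))      # toy expression matrix (cells x genes)
mean = X.mean(axis=0)             # stands in for the stored PCA mean

# Principal axes from an SVD of the centered data.
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
PCs = vt.T[:, :3]                 # genes x components, stands in for the stored PCs

def to_pca_sketch(x):
    """Project expression data into PCA space."""
    return (x - mean) @ PCs

def from_pca_sketch(x_pca):
    """Map PCA coordinates back to (approximate) expression space."""
    return x_pca @ PCs.T + mean

X_pca = to_pca_sketch(X)
X_back = from_pca_sketch(X_pca)   # rank-3 approximation of X
```

Because only three components are kept, X_back is an approximation of X, but projecting X_back again reproduces X_pca.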

genes_to_mask()[source]

Returns a boolean mask for the genes in the trajectory.

Return type:: ndarray
Returns:: A boolean mask for the genes in the trajectory.

calc_msd(save_key='traj_msd', **kwargs)[source]

Calculate the mean squared displacement (MSD) of the gene expression trajectory.

Parameters:

save_key (str) – The key to save the MSD data to in adata.var. Defaults to “traj_msd”.
**kwargs – Additional keyword arguments to be passed to the superclass method.

Return type:

Returns:

The mean squared displacement of the gene expression trajectory.

save(save_key='gene_trajectory')[source]

Save the gene expression trajectory to adata.var.

Parameters:: save_key (str) – The key to save the gene expression trajectory to in adata.var. Defaults to “gene_trajectory”.
Return type:: None

select_gene(genes, arr=None, axis=None)[source]

Selects the gene expression data for the specified genes.

Parameters:

genes (Union[ndarray, list]) – The genes to select the expression data for.
arr (Optional[ndarray]) – The array to select the genes from. Defaults to None.
axis (Optional[int]) – The axis to select the genes along. Defaults to None.

Return type:

Returns:

The gene expression data for the specified genes.

archlength_sampling(sol, interpolation_num, integration_direction)

Sample the curve using arc-length sampling.

Parameters:

sol (OdeSolution) – The ODE solution from scipy.integrate.solve_ivp.
interpolation_num (int) – The number of points to interpolate the curve at.
integration_direction (str) – The direction to integrate the curve in. Can be “forward”, “backward”, or “both”.

Return type:

calc_arclength()

Calculate the arc length of the trajectory.

Return type:: float
Returns:: arc length of the trajectory

calc_curvature()

Calculate the curvature of the trajectory.

Return type:: ndarray
Returns:: curvature of the trajectory, shape (n_points,)

calc_tangent(normalize=True)

Calculate the tangent vectors of the trajectory.

Parameters:: normalize (bool) – whether to normalize the tangent vectors. Defaults to True.
Returns:: tangent vectors of the trajectory, shape (n_points-1, n_dimensions)

dim()

Returns the number of dimensions in the trajectory.

Return type:: int
Returns:: number of dimensions in the trajectory

integrate(func)

Calculate the integral of a function along the curve.

Parameters:: func (Callable) – A function to integrate along the curve.
Return type:: ndarray
Returns:: The integral of func along the discrete curve.

interp_X(num=100, **interp_kwargs)

Interpolates the curve at num equally spaced points in t.

Parameters:

num (int) – The number of points to interpolate the curve at.
**interp_kwargs – Additional keyword arguments to pass to scipy.interpolate.interp1d.

Return type:

Returns:

The interpolated curve at num equally spaced points in t.

interp_t(num=100)

Interpolates the t parameter linearly.

Parameters:: num (int) – Number of interpolation points.
Return type:: ndarray
Returns:: The array of interpolated t values.

interpolate(t, **interp_kwargs)

Interpolate the curve at new time values.

Parameters:

t (ndarray) – The new time values at which to interpolate the curve.
**interp_kwargs – Additional arguments to pass to scipy.interpolate.interp1d.

Return type:

Returns:

The interpolated values of the curve at the specified time values.

Raises:

Exception – If self.t is None, which is needed for interpolation.

logspace_sampling(sol, interpolation_num, integration_direction)

Sample the curve using logspace sampling.

Parameters:

sol (OdeSolution) – The ODE solution from scipy.integrate.solve_ivp.
interpolation_num (int) – The number of points to interpolate the curve at.
integration_direction (str) – The direction to integrate the curve in. Can be “forward”, “backward”, or “both”.

Return type:

resample(n_points, tol=0.0001, inplace=True)

Resample the curve with the specified number of points.

Parameters:

n_points (int) – An integer specifying the number of points in the resampled curve.
tol (float) – A float specifying the tolerance for removing redundant points. Default is 1e-4.
inplace (bool) – A boolean flag indicating whether to modify the curve object in place. Default is True.

Return type:

Returns:

A tuple containing the resampled curve coordinates and time values (if available).

Raises:

ValueError – If the specified number of points is less than 2.

set_time(t, sort=True)

Set the time stamps for the trajectory. Sorts the time stamps if requested.

Parameters:

t (ndarray) – trajectory times, shape (n_points,)
sort (bool) – whether to sort the time stamps. Defaults to True.

Return type:

class dynamo.pd.Trajectory(X, t=None, sort=True)[source]

Base class for handling trajectory interpolation, resampling, etc.

Initializes a Trajectory object.

Parameters:

X (ndarray) – trajectory positions, shape (n_points, n_dimensions)
t (Optional[ndarray]) – trajectory times, shape (n_points,). Defaults to None.
sort (bool) – whether to sort the time stamps. Defaults to True.

set_time(t, sort=True)[source]

Set the time stamps for the trajectory. Sorts the time stamps if requested.

Parameters:

t (ndarray) – trajectory times, shape (n_points,)
sort (bool) – whether to sort the time stamps. Defaults to True.

Return type:

dim()[source]

Returns the number of dimensions in the trajectory.

Return type:: int
Returns:: number of dimensions in the trajectory

calc_tangent(normalize=True)[source]

Calculate the tangent vectors of the trajectory.

Parameters:: normalize (bool) – whether to normalize the tangent vectors. Defaults to True.
Returns:: tangent vectors of the trajectory, shape (n_points-1, n_dimensions)

calc_arclength()[source]

Calculate the arc length of the trajectory.

Return type:: float
Returns:: arc length of the trajectory
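
For a discrete trajectory, tangents and arc length can both be derived from differences between consecutive points. The following is a minimal sketch of that idea, not the library's code; the function names are hypothetical, and it assumes no two consecutive points coincide (otherwise normalization would divide by zero).

```python
import numpy as np

def tangent_sketch(X, normalize=True):
    """Differences between consecutive points, shape (n_points-1, n_dimensions)."""
    tvec = np.diff(X, axis=0)
    if normalize:
        tvec = tvec / np.linalg.norm(tvec, axis=1, keepdims=True)
    return tvec

def arclength_sketch(X):
    """Total length of the polyline through the points."""
    return float(np.linalg.norm(np.diff(X, axis=0), axis=1).sum())

X = np.array([[0.0, 0.0], [3.0, 4.0], [3.0, 10.0]])
tangents = tangent_sketch(X)   # unit vectors along each segment
length = arclength_sketch(X)   # 5 + 6 for the two segments
```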

calc_curvature()[source]

Calculate the curvature of the trajectory.

Return type:: ndarray
Returns:: curvature of the trajectory, shape (n_points,)

resample(n_points, tol=0.0001, inplace=True)[source]

Resample the curve with the specified number of points.

Parameters:

n_points (int) – An integer specifying the number of points in the resampled curve.
tol (float) – A float specifying the tolerance for removing redundant points. Default is 1e-4.
inplace (bool) – A boolean flag indicating whether to modify the curve object in place. Default is True.

Return type:

Returns:

A tuple containing the resampled curve coordinates and time values (if available).

Raises:

ValueError – If the specified number of points is less than 2.
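
One way to picture resampling is to place the new points at equal spacing in arc length along the polyline. The sketch below does that with np.interp and treats tol as a threshold for dropping near-duplicate consecutive points; that reading of tol, the function name, and the choice to return only coordinates (no time values) are assumptions made for illustration.

```python
import numpy as np

def resample_sketch(X, n_points, tol=1e-4):
    """Resample a polyline at n_points equally spaced in arc length."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    X = np.asarray(X, dtype=float)
    seg = np.linalg.norm(np.diff(X, axis=0), axis=1)
    # Assumed meaning of tol: drop points that barely move from their predecessor.
    X = X[np.concatenate(([True], seg > tol))]
    # Cumulative arc length at each remaining point.
    s = np.concatenate(([0.0], np.cumsum(np.linalg.norm(np.diff(X, axis=0), axis=1))))
    s_new = np.linspace(0.0, s[-1], n_points)
    # Interpolate every coordinate against arc length.
    return np.column_stack([np.interp(s_new, s, X[:, d]) for d in range(X.shape[1])])

X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 3.0]])
X_new = resample_sketch(X, 5)
```

The duplicated point is removed first, and the remaining path of length 4 is then sampled at unit spacing.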

archlength_sampling(sol, interpolation_num, integration_direction)[source]

Sample the curve using arc-length sampling.

Parameters:

sol (OdeSolution) – The ODE solution from scipy.integrate.solve_ivp.
interpolation_num (int) – The number of points to interpolate the curve at.
integration_direction (str) – The direction to integrate the curve in. Can be “forward”, “backward”, or “both”.

Return type:

logspace_sampling(sol, interpolation_num, integration_direction)[source]

Sample the curve using logspace sampling.

Parameters:

sol (OdeSolution) – The ODE solution from scipy.integrate.solve_ivp.
interpolation_num (int) – The number of points to interpolate the curve at.
integration_direction (str) – The direction to integrate the curve in. Can be “forward”, “backward”, or “both”.

Return type:

interpolate(t, **interp_kwargs)[source]

Interpolate the curve at new time values.

Parameters:

t (ndarray) – The new time values at which to interpolate the curve.
**interp_kwargs – Additional arguments to pass to scipy.interpolate.interp1d.

Return type:

Returns:

The interpolated values of the curve at the specified time values.

Raises:

Exception – If self.t is None, which is needed for interpolation.
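
The behaviour described above can be sketched without the class itself. The documented method forwards its keyword arguments to scipy.interpolate.interp1d; the example below uses np.interp per dimension as a stand-in, which matches only the linear case. `interpolate_sketch` is a hypothetical helper.

```python
import numpy as np

def interpolate_sketch(X, t, t_new):
    """Linearly interpolate each column of X (n_points, n_dims) at t_new."""
    if t is None:
        # Mirrors the documented requirement that time stamps must be set.
        raise Exception("time stamps are required for interpolation")
    return np.column_stack([np.interp(t_new, t, X[:, d]) for d in range(X.shape[1])])

t = np.array([0.0, 1.0, 2.0])
X = np.array([[0.0, 0.0], [2.0, 1.0], [4.0, 4.0]])
X_new = interpolate_sketch(X, t, np.array([0.5, 1.5]))
```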

interp_t(num=100)[source]

Interpolates the t parameter linearly.

Parameters:: num (int) – Number of interpolation points.
Return type:: ndarray
Returns:: The array of interpolated t values.

interp_X(num=100, **interp_kwargs)[source]

Interpolates the curve at num equally spaced points in t.

Parameters:

num (int) – The number of points to interpolate the curve at.
**interp_kwargs – Additional keyword arguments to pass to scipy.interpolate.interp1d.

Return type:

Returns:

The interpolated curve at num equally spaced points in t.

integrate(func)[source]

Calculate the integral of a function along the curve.

Parameters:: func (Callable) – A function to integrate along the curve.
Return type:: ndarray
Returns:: The integral of func along the discrete curve.
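
A discrete line integral of this kind is commonly approximated with the trapezoidal rule over the segments of the curve. The sketch below assumes integration with respect to arc length and that func accepts a single point; both are assumptions, and `integrate_sketch` is a hypothetical name rather than the library's implementation.

```python
import numpy as np

def integrate_sketch(X, func):
    """Trapezoidal integral of func with respect to arc length along polyline X."""
    values = np.array([func(x) for x in X])           # function value at each point
    ds = np.linalg.norm(np.diff(X, axis=0), axis=1)   # segment lengths
    mid = 0.5 * (values[:-1] + values[1:])            # average value on each segment
    return np.tensordot(ds, mid, axes=(0, 0))

# Integrate f(x) = x_0 along a straight path from 0 to 3 on the first axis.
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
total = integrate_sketch(X, lambda x: x[0])
```

For a function that is linear along the path, the trapezoidal rule is exact, so the result matches the analytic integral of x from 0 to 3.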

calc_msd(decomp_dim=True, ref=0)[source]

Calculate the mean squared displacement (MSD) of the curve with respect to a reference point.

Parameters:

decomp_dim (bool) – If True, return the MSD of each dimension separately. If False, return the total MSD.
ref (int) – Index of the reference point. Default is 0.

Return type:

Returns:

The MSD of the curve with respect to the reference point.
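
The effect of decomp_dim can be seen in a small sketch: squared displacements from the reference point are averaged over the points of the curve, either per dimension or summed into a single number. This is an illustration of the description above with a hypothetical function name, not a copy of the library's code.

```python
import numpy as np

def msd_sketch(X, decomp_dim=True, ref=0):
    """Mean squared displacement of the points in X relative to X[ref]."""
    msd = np.mean((X - X[ref]) ** 2, axis=0)   # one value per dimension
    return msd if decomp_dim else msd.sum()    # or the total over dimensions

X = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]])
msd_per_dim = msd_sketch(X)
msd_total = msd_sketch(X, decomp_dim=False)
```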

class dynamo.pd.LeastActionPath(X, vf_func, D=1, dt=1)[source]

A class for computing the Least Action Path for a given function and initial conditions.

Parameters:

X (ndarray) – The initial conditions as a 2D array of shape (n, m), where n is the number of points in the trajectory and m is the dimension of the system.
vf_func (Callable) – The vector field function that governs the system.
D (float) – The diffusion constant of the system. Defaults to 1.
dt (float) – The time step for the simulation. Defaults to 1.

func: The vector field function that governs the system.

D: The diffusion constant of the system.

_action: The Least Action Path action values for each point in the trajectory.

get_t()[source]: Returns the time points of the least action path.

get_dt()[source]: Returns the time step of the least action path.

action(t=None, **interp_kwargs): Returns the Least Action Path action values at time t. If t is None, returns the action values for all time points. **interp_kwargs are passed to the interp1d function.

mfpt(action=None)[source]: Returns the mean first passage time using the action values. If action is None, uses the action values stored in the _action attribute.

optimize_dt()[source]: Optimizes the time step of the simulation to minimize the Least Action Path action.

Initializes the LeastActionPath class instance with the given initial conditions, vector field function, diffusion constant and time step.

Parameters:

X (ndarray) – The initial conditions as a 2D array of shape (n, m), where n is the number of points in the trajectory and m is the dimension of the system.
vf_func (Callable) – The vector field function that governs the system.
D (float) – The diffusion constant of the system. Defaults to 1.
dt (float) – The time step for the simulation. Defaults to 1.

get_t()[source]

Returns the time points of the trajectory.

Return type:: ndarray
Returns:: The time points of the trajectory.

get_dt()[source]

Returns the time step of the trajectory.

Return type:: float
Returns:: The time step of the trajectory.

action_t(t=None, **interp_kwargs)[source]

Returns the Least Action Path action values at time t.

Parameters:

t (Optional[float]) – The time point(s) to return the action value(s) for. If None, returns the action values for all time points. Defaults to None.
**interp_kwargs – Additional keyword arguments to pass to the interp1d function.

Return type:

Returns:

The Least Action Path action value(s).
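
To give a sense of what an action value measures, the sketch below computes one common discretization of the path action: the squared mismatch between the path's finite-difference velocity and the vector field at segment midpoints, scaled by dt and D. The prefactor convention, the midpoint evaluation, and the names `action_sketch` and `constant_drift` are assumptions for illustration; the function returns a single total rather than per-time values.

```python
import numpy as np

def action_sketch(path, vf_func, D=1.0, dt=1.0):
    """Total discretized action of a path under the vector field vf_func."""
    x_mid = 0.5 * (path[:-1] + path[1:])   # segment midpoints
    v = np.diff(path, axis=0) / dt         # finite-difference velocities
    resid = v - vf_func(x_mid)             # deviation from the vector field
    return 0.5 * np.sum(resid ** 2) * dt / D

def constant_drift(x):
    """Hypothetical vector field pushing every point along +x at unit speed."""
    f = np.zeros_like(x)
    f[:, 0] = 1.0
    return f

with_flow = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
against_flow = with_flow[::-1].copy()

s_with = action_sketch(with_flow, constant_drift)
s_against = action_sketch(against_flow, constant_drift)
```

A path that follows the field exactly costs no action, while the reversed path moves against the field on every segment and accumulates a positive action.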

mfpt(action=None)[source]

Computes the mean first passage time following Eqn. 7 of "Epigenetics as a first exit problem".

Parameters:: action (Optional[ndarray]) – The action values. If None, uses the action values stored in the _action attribute.
Return type:: ndarray
Returns:: The mean first passage time.

optimize_dt()[source]

Optimizes the time step of the simulation to minimize the Least Action Path action.

Return type:: float
Returns:: Optimal time step

archlength_sampling(sol, interpolation_num, integration_direction)

Sample the curve using arc-length sampling.

Parameters:

sol (OdeSolution) – The ODE solution from scipy.integrate.solve_ivp.
interpolation_num (int) – The number of points to interpolate the curve at.
integration_direction (str) – The direction to integrate the curve in. Can be “forward”, “backward”, or “both”.

Return type:

calc_arclength()

Calculate the arc length of the trajectory.

Return type:: float
Returns:: arc length of the trajectory

calc_curvature()

Calculate the curvature of the trajectory.

Return type:: ndarray
Returns:: curvature of the trajectory, shape (n_points,)

calc_msd(decomp_dim=True, ref=0)

Calculate the mean squared displacement (MSD) of the curve with respect to a reference point.

Parameters:

decomp_dim (bool) – If True, return the MSD of each dimension separately. If False, return the total MSD.
ref (int) – Index of the reference point. Default is 0.

Return type:

Returns:

The MSD of the curve with respect to the reference point.

calc_tangent(normalize=True)

Calculate the tangent vectors of the trajectory.

Parameters:: normalize (bool) – whether to normalize the tangent vectors. Defaults to True.
Returns:: tangent vectors of the trajectory, shape (n_points-1, n_dimensions)

dim()

Returns the number of dimensions in the trajectory.

Return type:: int
Returns:: number of dimensions in the trajectory

integrate(func)

Calculate the integral of a function along the curve.

Parameters:: func (Callable) – A function to integrate along the curve.
Return type:: ndarray
Returns:: The integral of func along the discrete curve.

interp_X(num=100, **interp_kwargs)

Interpolates the curve at num equally spaced points in t.

Parameters:

num (int) – The number of points to interpolate the curve at.
**interp_kwargs – Additional keyword arguments to pass to scipy.interpolate.interp1d.

Return type:

Returns:

The interpolated curve at num equally spaced points in t.

interp_t(num=100)

Interpolates the t parameter linearly.

Parameters:: num (int) – Number of interpolation points.
Return type:: ndarray
Returns:: The array of interpolated t values.

interpolate(t, **interp_kwargs)

Interpolate the curve at new time values.

Parameters:

t (ndarray) – The new time values at which to interpolate the curve.
**interp_kwargs – Additional arguments to pass to scipy.interpolate.interp1d.

Return type:

Returns:

The interpolated values of the curve at the specified time values.

Raises:

Exception – If self.t is None, which is needed for interpolation.

logspace_sampling(sol, interpolation_num, integration_direction)

Sample the curve using logspace sampling.

Parameters:

sol (OdeSolution) – The ODE solution from scipy.integrate.solve_ivp.
interpolation_num (int) – The number of points to interpolate the curve at.
integration_direction (str) – The direction to integrate the curve in. Can be “forward”, “backward”, or “both”.

Return type:

resample(n_points, tol=0.0001, inplace=True)

Resample the curve with the specified number of points.

Parameters:

n_points (int) – An integer specifying the number of points in the resampled curve.
tol (float) – A float specifying the tolerance for removing redundant points. Default is 1e-4.
inplace (bool) – A boolean flag indicating whether to modify the curve object in place. Default is True.

Return type:

Returns:

A tuple containing the resampled curve coordinates and time values (if available).

Raises:

ValueError – If the specified number of points is less than 2.

set_time(t, sort=True)

Set the time stamps for the trajectory. Sorts the time stamps if requested.

Parameters:

t (ndarray) – trajectory times, shape (n_points,)
sort (bool) – whether to sort the time stamps. Defaults to True.

Return type:

class dynamo.pd.GeneLeastActionPath(adata, lap=None, X_pca=None, vf_func=None, D=1, dt=1, **kwargs)[source]

A class for computing the least action path trajectory and action for a gene expression dataset. Inherits from GeneTrajectory class.

adata: AnnData object containing the gene expression dataset.

X: Expression data.

to_pca: Transformation matrix from gene expression space to PCA space.

from_pca: Transformation matrix from PCA space to gene expression space.

PCs: Principal components from PCA analysis.

func: Vector field function reconstructed within the PCA space.

D: Diffusivity value.

t: Array of time values.

action: Array of action values.

Initializes the GeneLeastActionPath class instance.

Parameters:

adata (AnnData) – AnnData object containing the gene expression dataset.
lap (Optional[LeastActionPath]) – LeastActionPath object. Defaults to None.
X_pca (Optional[ndarray]) – PCA transformed expression data. Defaults to None.
vf_func (Optional[Callable]) – Vector field function. Defaults to None.
D (float) – Diffusivity value. Defaults to 1.
dt (float) – Time step size. Defaults to 1.
**kwargs – Additional keyword arguments passed to the GeneTrajectory class.

from_lap(adata, lap, **kwargs)[source]

Initializes class from a LeastActionPath object.

Parameters:

adata (AnnData) – AnnData object containing the gene expression dataset.
lap (LeastActionPath) – LeastActionPath object.
**kwargs – Additional keyword arguments passed to the GeneTrajectory class.

get_t()[source]

Returns the array of time values.

Return type:: ndarray
Returns:: Array of time values.

get_dt()[source]

Returns the average time step size.

Return type:: float
Returns:: Average time step size.

genewise_action()[source]

Calculates the genewise action values.

Return type:: ndarray
Returns:: Array of genewise action values.

select_genewise_action(genes)[source]

Returns the genewise action values for the specified genes.

Parameters:: genes (Union[str, List[str]]) – List of gene names or a single gene name.
Return type:: ndarray
Returns:: Array of genewise action values.

archlength_sampling(sol, interpolation_num, integration_direction)

Sample the curve using arc-length sampling.

Parameters:

sol (OdeSolution) – The ODE solution from scipy.integrate.solve_ivp.
interpolation_num (int) – The number of points to interpolate the curve at.
integration_direction (str) – The direction to integrate the curve in. Can be “forward”, “backward”, or “both”.

Return type:

calc_arclength()

Calculate the arc length of the trajectory.

Return type:: float
Returns:: arc length of the trajectory

calc_curvature()

Calculate the curvature of the trajectory.

Return type:: ndarray
Returns:: curvature of the trajectory, shape (n_points,)

calc_msd(save_key='traj_msd', **kwargs)

Calculate the mean squared displacement (MSD) of the gene expression trajectory.

Parameters:

save_key (str) – The key to save the MSD data to in adata.var. Defaults to “traj_msd”.
**kwargs – Additional keyword arguments to be passed to the superclass method.

Return type:

Returns:

The mean squared displacement of the gene expression trajectory.

calc_tangent(normalize=True)

Calculate the tangent vectors of the trajectory.

Parameters:: normalize (bool) – whether to normalize the tangent vectors. Defaults to True.
Returns:: tangent vectors of the trajectory, shape (n_points-1, n_dimensions)

dim()

Returns the number of dimensions in the trajectory.

Return type:: int
Returns:: number of dimensions in the trajectory

from_pca(X_pca, t=None)

Converts PCA-transformed gene expression data to gene expression data.

Parameters:

X_pca (ndarray) – The PCA-transformed gene expression data as a numpy array of shape (n, d).
t (Optional[ndarray]) – The time data as a numpy array of shape (n,). Defaults to None.

Return type:

genes_to_mask()

Returns a boolean mask for the genes in the trajectory.

Return type:: ndarray
Returns:: A boolean mask for the genes in the trajectory.

integrate(func)

Calculate the integral of a function along the curve.

Parameters:: func (Callable) – A function to integrate along the curve.
Return type:: ndarray
Returns:: The integral of func along the discrete curve.

interp_X(num=100, **interp_kwargs)

Interpolates the curve at num equally spaced points in t.

Parameters:

num (int) – The number of points to interpolate the curve at.
**interp_kwargs – Additional keyword arguments to pass to scipy.interpolate.interp1d.

Return type:

Returns:

The interpolated curve at num equally spaced points in t.

interp_t(num=100)

Interpolates the t parameter linearly.

Parameters:: num (int) – Number of interpolation points.
Return type:: ndarray
Returns:: The array of interpolated t values.

interpolate(t, **interp_kwargs)

Interpolate the curve at new time values.

Parameters:

t (ndarray) – The new time values at which to interpolate the curve.
**interp_kwargs – Additional arguments to pass to scipy.interpolate.interp1d.

Return type:

Returns:

The interpolated values of the curve at the specified time values.

Raises:

Exception – If self.t is None, which is needed for interpolation.

logspace_sampling(sol, interpolation_num, integration_direction)

Sample the curve using logspace sampling.

Parameters:

sol (OdeSolution) – The ODE solution from scipy.integrate.solve_ivp.
interpolation_num (int) – The number of points to interpolate the curve at.
integration_direction (str) – The direction to integrate the curve in. Can be “forward”, “backward”, or “both”.

Return type:

resample(n_points, tol=0.0001, inplace=True)

Resample the curve with the specified number of points.

Parameters:

n_points (int) – An integer specifying the number of points in the resampled curve.
tol (float) – A float specifying the tolerance for removing redundant points. Default is 1e-4.
inplace (bool) – A boolean flag indicating whether to modify the curve object in place. Default is True.

Return type:

Returns:

A tuple containing the resampled curve coordinates and time values (if available).

Raises:

ValueError – If the specified number of points is less than 2.

save(save_key='gene_trajectory')

Save the gene expression trajectory to adata.var.

Parameters:: save_key (str) – The key to save the gene expression trajectory to in adata.var. Defaults to “gene_trajectory”.
Return type:: None

select_gene(genes, arr=None, axis=None)

Selects the gene expression data for the specified genes.

Parameters:

genes (Union[ndarray, list]) – The genes to select the expression data for.
arr (Optional[ndarray]) – The array to select the genes from. Defaults to None.
axis (Optional[int]) – The axis to select the genes along. Defaults to None.

Return type:

Returns:

The gene expression data for the specified genes.

set_time(t, sort=True)

Set the time stamps for the trajectory. Sorts the time stamps if requested.

Parameters:

t (ndarray) – trajectory times, shape (n_points,)
sort (bool) – whether to sort the time stamps. Defaults to True.

Return type: