Symbolic Stoichiometries

Catalyst supports stoichiometric coefficients that involve parameters, species, or even general expressions. In this tutorial we show several examples of how to use symbolic stoichiometries, and discuss several caveats to be aware of.

Using symbolic stoichiometry

Let's first consider a simple reversible reaction where the number of reactants is a parameter, and the number of products is the product of two parameters.

using Catalyst, Latexify, OrdinaryDiffEqTsit5, ModelingToolkit, Plots
revsys = @reaction_network revsys begin
    @parameters m::Int64 n::Int64
    k₊, m*A --> (m*n)*B
    k₋, B --> A
end
reactions(revsys)

2-element Vector{Reaction}:
 k₊, m*A --> (m*n)*B
 k₋, B --> A

Notice, as described in the Reaction rate laws used in simulations section, the default rate laws involve factorials in the stoichiometric coefficients. For this reason we explicitly specify m and n as integers (as otherwise ModelingToolkit will implicitly assume they are floating point).

As always the @reaction_network macro defaults to setting all symbols neither used as a reaction substrate nor a product to be parameters. Hence, in this example we have two species (A and B) and four parameters (k₊, k₋, m, and n). In addition, the stoichiometry is applied to the rightmost symbol in a given term, i.e. in the first equation the substrate A has stoichiometry m and the product B has stoichiometry m*n. For example, in

rn = @reaction_network begin
    k, A*C --> 2B
    end
reactions(rn)

1-element Vector{Reaction}:
 k, A*C --> 2*B

we see two species, (B,C), with A treated as a parameter representing the stoichiometric coefficient of C, i.e.

rx = reactions(rn)[1]
rx.substrates[1],rx.substoich[1]

(C(t), A)

We could have equivalently specified our systems directly via the Catalyst API. For example, for revsys we would could use

t = default_t()
@parameters k₊ k₋ m::Int n::Int
@species A(t), B(t)
rxs = [Reaction(k₊, [A], [B], [m], [m*n]),
       Reaction(k₋, [B], [A])]
revsys2 = ReactionSystem(rxs,t; name=:revsys)
revsys2 == revsys

false

rxs2 = [(@reaction k₊, $m*A --> ($m*$n)*B),
        (@reaction k₋, B --> A)]
revsys3 = ReactionSystem(rxs2,t; name=:revsys)
revsys3 == revsys

false

Here we interpolate in the pre-declared m and n symbolic variables using $m and $n to ensure the parameter is known to be integer-valued. The @reaction macro again assumes all symbols are parameters except the substrates or reactants (i.e. A and B). For example, in @reaction k, F*A + 2(H*G+B) --> D, the substrates are (A,G,B) with stoichiometries (F,2*H,2).

Let's now convert revsys to ODEs and look at the resulting equations:

osys = convert(ODESystem, revsys)
osys = complete(osys)
equations(osys)

2-element Vector{Equation}:
 Differential(t)(A(t)) ~ (-k₊*m*(A(t)^m)) / factorial(m) + k₋*B(t)
 Differential(t)(B(t)) ~ (k₊*m*n*(A(t)^m)) / factorial(m) - k₋*B(t)

Specifying the parameter and initial condition values,

p  = (revsys.k₊ => 1.0, revsys.k₋ => 1.0, revsys.m => 2, revsys.n => 2)
u₀ = [revsys.A => 1.0, revsys.B => 1.0]
oprob = ODEProblem(osys, u₀, (0.0, 1.0), p)

we can now solve and plot the system

sol = solve(oprob, Tsit5())
plot(sol)

An alternative approach to avoid the issues of using mixed floating point and integer variables is to disable the rescaling of rate laws as described in Reaction rate laws used in simulations section. This requires passing the combinatoric_ratelaws=false keyword to convert or to ODEProblem (if directly building the problem from a ReactionSystem instead of first converting to an ODESystem). For the previous example this gives the following (different) system of ODEs where we now let m and n be floating point valued parameters (the default):

revsys = @reaction_network revsys begin
    k₊, m*A --> (m*n)*B
    k₋, B --> A
end
osys = convert(ODESystem, revsys; combinatoric_ratelaws = false)
osys = complete(osys)
equations(osys)

2-element Vector{Equation}:
 Differential(t)(A(t)) ~ k₋*B(t) - k₊*m*(A(t)^m)
 Differential(t)(B(t)) ~ -k₋*B(t) + k₊*m*n*(A(t)^m)

Since we no longer have factorial functions appearing, our example will now run with m and n treated as floating point parameters:

p  = (revsys.k₊ => 1.0, revsys.k₋ => 1.0, revsys.m => 2.0, revsys.n => 2.0)
oprob = ODEProblem(osys, u₀, (0.0, 1.0), p)
sol = solve(oprob, Tsit5())
plot(sol)

Gene expression with randomly produced amounts of protein

As a second example, let's build the negative feedback model from MomentClosure.jl that involves a bursty reaction that produces a random amount of protein.

In our model G₋ will denote the repressed state, and G₊ the active state where the gene can transcribe. P will denote the protein product of the gene. We will assume that proteins are produced in bursts that produce m proteins, where m is a (shifted) geometric random variable with mean b. To define m we must register the Distributions.Geometric distribution from Distributions.jl with Symbolics.jl, after which we can use it in symbolic expressions:

using Distributions: Geometric
@register_symbolic Geometric(b)
@parameters b
m = rand(Geometric(1/b)) + 1

Note, as we require the shifted geometric distribution, we add one to Distributions.jl's Geometric random variable (which includes zero).

We can now define our model

burstyrn = @reaction_network burstyrn begin
    k₊, G₋ --> G₊
    k₋*P^2, G₊ --> G₋
    kₚ, G₊ --> G₊ + $m*P
    γₚ, P --> ∅
end
reactions(burstyrn)

4-element Vector{Reaction}:
 k₊, G₋ --> G₊
 k₋*(P(t)^2), G₊ --> G₋
 kₚ, G₊ --> G₊ + (1 + rand(Distributions.Geometric(1 / b)))*P
 γₚ, P --> ∅

The parameter b does not need to be explicitly declared in the @reaction_network macro as it is detected when the expression rand(Geometric(1/b)) + 1 is substituted for m.

We next convert our network to a jump process representation

using JumpProcesses
jsys = convert(JumpSystem, burstyrn; combinatoric_ratelaws = false)
jsys = complete(jsys)
equations(jsys)

(JumpProcesses.MassActionJump[JumpProcesses.MassActionJump{Num, Vector{Pair{Any, Int64}}, Vector{Pair{Any, Int64}}, Nothing}(k₊, Pair{Any, Int64}[G₋(t) => 1], Pair{Any, Int64}[G₊(t) => 1, G₋(t) => -1], nothing), JumpProcesses.MassActionJump{Num, Vector{Pair{Any, Int64}}, Vector{Pair{Any, Int64}}, Nothing}(γₚ, Pair{Any, Int64}[P(t) => 1], Pair{Any, Int64}[P(t) => -1], nothing)], JumpProcesses.ConstantRateJump[JumpProcesses.ConstantRateJump{SymbolicUtils.BasicSymbolic{Real}, Vector{Equation}}(k₋*G₊(t)*(P(t)^2), Equation[G₊(t) ~ -1 + G₊(t), G₋(t) ~ 1 + G₋(t)]), JumpProcesses.ConstantRateJump{SymbolicUtils.BasicSymbolic{Real}, Vector{Equation}}(kₚ*G₊(t), Equation[P(t) ~ 1 + rand(Distributions.Geometric(1 / b)) + P(t)])], JumpProcesses.VariableRateJump[], Equation[])

Notice, the equations of jsys have three MassActionJumps for the first three reactions, and one ConstantRateJump for the last reaction. If we examine the ConstantRateJump more closely we can see the generated rate and affect! functions for the bursty reaction that makes protein

equations(jsys)[4].rate

kₚ*G₊(t)

equations(jsys)[4].affect!

1-element Vector{Equation}:
 P(t) ~ 1 + rand(Distributions.Geometric(1 / b)) + P(t)

Finally, we can now simulate our JumpSystem

pmean = 200
bval = 70
γₚval = 1
k₋val = 0.001
k₊val = 0.05
kₚval = pmean * γₚval * (k₋val * pmean^2 + k₊val) / (k₊val * bval)
p = (:k₊ => k₊val, :k₋ => k₋val, :kₚ => kₚval, :γₚ => γₚval, :b => bval)
u₀ = [:G₊ => 1, :G₋ => 0, :P => 1]
tspan = (0., 6.0)   # time interval to solve over
dprob = DiscreteProblem(jsys, u₀, tspan, p)
jprob = JumpProblem(jsys, dprob, Direct())
sol = solve(jprob)
plot(sol.t, sol[jsys.P], legend = false, xlabel = "time", ylabel = "P(t)")

To double check our results are consistent with MomentClosure.jl, let's calculate and plot the average amount of protein (which is also plotted in the MomentClosure.jl tutorial).

t = default_t()
function getmean(jprob, Nsims, tv)
    Pmean = zeros(length(tv))
    @variables P(t)
    for n in 1:Nsims
        sol = solve(jprob)
        Pmean .+= sol(tv, idxs=P)
    end
    Pmean ./= Nsims
end
tv = range(tspan[1],tspan[2],step=.1)
psim_mean = getmean(jprob, 20000, tv)
plot(tv, psim_mean; ylabel = "average of P(t)", xlabel = "time",
                    xlim = (0.0,6.0), legend = false)

Comparing, we see similar averages for P(t).