Initialization of ODESystems

While for simple numerical ODEs choosing an initial condition can be an easy affair, with ModelingToolkit's more general differential-algebraic equation (DAE) system there is more care needed due to the flexibility of the solver state. In this tutorial we will walk through the functionality involved in initialization of ODESystem and the diagnostics to better understand and debug the initialization problem.

Primer on Initialization of Differential-Algebraic Equations

Before getting started, let's do a brief walkthrough of the mathematical principles of initialization of DAE systems. Take a DAE written in semi-explicit form:

\[x' = f(x,y,t)\\ 0 = g(x,y,t)\]

where $x$ are the differential variables and $y$ are the algebraic variables. An initial condition $u0 = [x(t_0) y(t_0)]$ is said to be consistent if $g(x(t_0),y(t_0),t_0) = 0$.

For ODEs, this is trivially satisfied. However, for more complicated systems it may not be easy to know how to choose the variables such that all of the conditions are satisfied. This is even more complicated when taking into account ModelingToolkit's simplification engine, given that variables can be eliminated and equations can be changed. If this happens, how do you know how to initialize the system?

Initialization By Example: The Cartesian Pendulum

To illustrate how to perform the initialization, let's take a look at the Cartesian pendulum:

using ModelingToolkit, OrdinaryDiffEq, Plots
using ModelingToolkit: t_nounits as t, D_nounits as D

@parameters g
@variables x(t) y(t) [state_priority = 10] λ(t)
eqs = [D(D(x)) ~ λ * x
       D(D(y)) ~ λ * y - g
       x^2 + y^2 ~ 1]
@mtkbuild pend = ODESystem(eqs, t)

\[ \begin{align} \frac{\mathrm{d} yˍt\left( t \right)}{\mathrm{d}t} =& - g + y\left( t \right) \lambda\left( t \right) \\ \frac{\mathrm{d} y\left( t \right)}{\mathrm{d}t} =& yˍt\left( t \right) \\ 0 =& 1 - \left( x\left( t \right) \right)^{2} - \left( y\left( t \right) \right)^{2} \\ 0 =& - 2 xˍt\left( t \right) x\left( t \right) - 2 y\left( t \right) yˍt\left( t \right) \\ 0 =& - 2 xˍtt\left( t \right) x\left( t \right) - 2 \left( xˍt\left( t \right) \right)^{2} - 2 \left( yˍt\left( t \right) \right)^{2} - 2 y\left( t \right) \left( - g + y\left( t \right) \lambda\left( t \right) \right) \end{align} \]

While we defined the system using second derivatives and a length constraint, the structural simplification system improved the numerics of the system to be solvable using the dummy derivative technique, which results in 3 algebraic equations and 2 differential equations. In this case, the differential equations with respect to y and D(y), though it could have just as easily have been x and D(x). How do you initialize such a system if you don't know in advance what variables may defined the equation's state?

To see how the system works, let's start the pendulum in the far right position, i.e. x(0) = 1 and y(0) = 0. We can do this by:

prob = ODEProblem(pend, [x => 1, y => 0], (0.0, 1.5), [g => 1], guesses = [λ => 1])
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
timespan: (0.0, 1.5)
u0: 5-element Vector{Float64}:
 0.0
 0.0
 0.0
 1.0
 0.0

This solves via:

sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Example block output

and we can check it satisfies our conditions via:

conditions = getfield.(equations(pend)[3:end], :rhs)
3-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
 1 - (x(t)^2) - (y(t)^2)
 -2xˍt(t)*x(t) - 2y(t)*yˍt(t)
 -2xˍtt(t)*x(t) - 2(xˍt(t)^2) - 2(yˍt(t)^2) - 2y(t)*(-g + y(t)*λ(t))
[sol[conditions][1]; sol[x][1] - 1; sol[y][1]]
5-element Vector{Float64}:
  2.220446049250313e-16
  3.422513820163176e-16
 -8.881784197001251e-16
 -1.1102230246251565e-16
  0.0

Notice that we set [x => 1, y => 0] as our initial conditions and [λ => 1] as our guess. The difference is that the initial conditions are required to be satisfied, while the guesses are simply a guess for what the initial value might be. Every variable must have either an initial condition or a guess, and thus since we did not know what λ would be we set it to 1 and let the initialization scheme find the correct value for λ. Indeed, the value for λ at the initial time is not 1:

sol[λ][1]
4.440892098500626e-16

We can similarly choose λ = 0 and solve for y to start the system:

prob = ODEProblem(pend, [x => 1, λ => 0], (0.0, 1.5), [g => 1], guesses = [y => 1])
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Example block output

or choose to satisfy derivative conditions:

prob = ODEProblem(
    pend, [x => 1, D(y) => 0], (0.0, 1.5), [g => 1], guesses = [λ => 0, y => 1])
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Example block output

Notice that since a derivative condition is given, we are required to give a guess for y.

We can also directly give equations to be satisfied at the initial point by using the initialization_eqs keyword argument, for example:

prob = ODEProblem(pend, [x => 1], (0.0, 1.5), [g => 1], guesses = [λ => 0, y => 1],
    initialization_eqs = [y ~ 1])
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Example block output

Additionally, note that the initial conditions are allowed to be functions of other variables and parameters:

prob = ODEProblem(
    pend, [x => 1, D(y) => g], (0.0, 3.0), [g => 1], guesses = [λ => 0, y => 1])
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Example block output

Determinability: Underdetermined and Overdetermined Systems

For this system we have 3 conditions to satisfy:

conditions = getfield.(equations(pend)[3:end], :rhs)
3-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
 1 - (x(t)^2) - (y(t)^2)
 -2xˍt(t)*x(t) - 2y(t)*yˍt(t)
 -2xˍtt(t)*x(t) - 2(xˍt(t)^2) - 2(yˍt(t)^2) - 2y(t)*(-g + y(t)*λ(t))

when we initialize with

prob = ODEProblem(pend, [x => 1, y => 0], (0.0, 1.5), [g => 1], guesses = [y => 0, λ => 1])
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
timespan: (0.0, 1.5)
u0: 5-element Vector{Float64}:
 0.0
 0.0
 0.0
 1.0
 0.0

we have two extra conditions to satisfy, x ~ 1 and y ~ 0 at the initial point. That gives 5 equations for 5 variables and thus the system is well-formed. What happens if that's not the case?

prob = ODEProblem(pend, [x => 1], (0.0, 1.5), [g => 1], guesses = [y => 0, λ => 1])
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
timespan: (0.0, 1.5)
u0: 5-element Vector{Float64}:
 0.0
 0.0
 0.0
 1.0
 0.0

Here we have 4 equations for 5 unknowns (note: the warning is post-simplification of the nonlinear system, which solves the trivial x ~ 1 equation analytical and thus says 3 equations for 4 unknowns). This warning thus lets you know the system is underdetermined and thus the solution is not necessarily unique. It can still be solved:

sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Example block output

and the found initial condition satisfies all constraints which were given. In the opposite direction, we may have an overdetermined system:

prob = ODEProblem(
    pend, [x => 1, y => 0.0, D(y) => 0], (0.0, 1.5), [g => 1], guesses = [λ => 1])
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
timespan: (0.0, 1.5)
u0: 5-element Vector{Float64}:
 0.0
 0.0
 0.0
 1.0
 0.0

Can that be solved?

sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Example block output

Indeed since we saw D(y) = 0 at the initial point above, it turns out that this solution is solvable with the chosen initial conditions. However, for overdetermined systems we often aren't that lucky. If the set of initial conditions cannot be satisfied, then you will get a SciMLBase.ReturnCode.InitialFailure:

prob = ODEProblem(
    pend, [x => 1, y => 0.0, D(y) => 2.0, λ => 1], (0.0, 1.5), [g => 1], guesses = [λ => 1])
sol = solve(prob, Rodas5P())
retcode: InitialFailure
Interpolation: specialized 4rd order "free" stiffness-aware interpolation
t: 1-element Vector{Float64}:
 0.0
u: 1-element Vector{Vector{Float64}}:
 [2.0, 0.0, 0.0, 1.4047352827081332, 1.0]

What this means is that the initial condition finder failed to find an initial condition. While this can be sometimes due to numerical error (which is then helped by picking guesses closer to the correct value), most circumstances of this come from ill-formed models. Especially if your system is overdetermined and you receive an InitialFailure, the initial conditions may not be analytically satisfiable!. In our case here, if you sit down with a pen and paper long enough you will see that λ = 0 is required for this equation, but since we chose λ = 1 we end up with a set of equations that are impossible to satisfy.

Diving Deeper: Constructing the Initialization System

To get a better sense of the initialization system and to help debug it, you can construct the initialization system directly. The initialization system is a NonlinearSystem which requires the system-level information and the additional nonlinear equations being tagged to the system.

isys = generate_initializesystem(pend, u0map = [x => 1.0, y => 0.0], guesses = [λ => 1])

\[ \begin{align} 0 =& 1 - \left( x\left( t \right) \right)^{2} - \left( y\left( t \right) \right)^{2} \\ 0 =& - 2 xˍt\left( t \right) x\left( t \right) - 2 y\left( t \right) yˍt\left( t \right) \\ 0 =& - 2 xˍtt\left( t \right) x\left( t \right) - 2 \left( xˍt\left( t \right) \right)^{2} - 2 \left( yˍt\left( t \right) \right)^{2} - 2 y\left( t \right) \left( - g + y\left( t \right) \lambda\left( t \right) \right) \\ 0 =& - y\left( t \right) \\ 0 =& 1 - x\left( t \right) \\ 0 =& - xˍtt\left( t \right) + x\left( t \right) \lambda\left( t \right) \end{align} \]

We can inspect what its equations and unknown values are:

equations(isys)

\[ \begin{align} 0 =& 1 - \left( x\left( t \right) \right)^{2} - \left( y\left( t \right) \right)^{2} \\ 0 =& - 2 xˍt\left( t \right) x\left( t \right) - 2 y\left( t \right) yˍt\left( t \right) \\ 0 =& - 2 xˍtt\left( t \right) x\left( t \right) - 2 \left( xˍt\left( t \right) \right)^{2} - 2 \left( yˍt\left( t \right) \right)^{2} - 2 y\left( t \right) \left( - g + y\left( t \right) \lambda\left( t \right) \right) \\ 0 =& - y\left( t \right) \\ 0 =& 1 - x\left( t \right) \\ 0 =& - xˍtt\left( t \right) + x\left( t \right) \lambda\left( t \right) \end{align} \]

unknowns(isys)
6-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
 yˍt(t)
 y(t)
 xˍt(t)
 x(t)
 λ(t)
 xˍtt(t)

Notice that all initial conditions are treated as initial equations. Additionally, for systems with observables, those observables are too treated as initial equations. We can see the resulting simplified system via the command:

isys = structural_simplify(isys; fully_determined = false)

\[ \begin{align} 0 =& 1 - \left( x\left( t \right) \right)^{2} - \left( y\left( t \right) \right)^{2} \\ 0 =& - 2 xˍt\left( t \right) x\left( t \right) - 2 y\left( t \right) yˍt\left( t \right) \\ 0 =& - 2 xˍtt\left( t \right) x\left( t \right) - 2 \left( xˍt\left( t \right) \right)^{2} - 2 \left( yˍt\left( t \right) \right)^{2} - 2 y\left( t \right) \left( - g + y\left( t \right) \lambda\left( t \right) \right) \\ 0 =& 1 - x\left( t \right) \end{align} \]

Note fully_determined=false allows for the simplification to occur when the number of equations does not match the number of unknowns, which we can use to investigate our overdetermined system:

isys = ModelingToolkit.generate_initializesystem(
    pend, u0map = [x => 1, y => 0.0, D(y) => 2.0, λ => 1], guesses = [λ => 1])

\[ \begin{align} 0 =& 1 - \left( x\left( t \right) \right)^{2} - \left( y\left( t \right) \right)^{2} \\ 0 =& - 2 xˍt\left( t \right) x\left( t \right) - 2 y\left( t \right) yˍt\left( t \right) \\ 0 =& - 2 xˍtt\left( t \right) x\left( t \right) - 2 \left( xˍt\left( t \right) \right)^{2} - 2 \left( yˍt\left( t \right) \right)^{2} - 2 y\left( t \right) \left( - g + y\left( t \right) \lambda\left( t \right) \right) \\ 0 =& 2 - yˍt\left( t \right) \\ 0 =& - y\left( t \right) \\ 0 =& 1 - x\left( t \right) \\ 0 =& 1 - \lambda\left( t \right) \\ 0 =& - xˍtt\left( t \right) + x\left( t \right) \lambda\left( t \right) \end{align} \]

isys = structural_simplify(isys; fully_determined = false)

\[ \begin{align} 0 =& 1 - \left( x\left( t \right) \right)^{2} - \left( y\left( t \right) \right)^{2} \\ 0 =& - 2 xˍt\left( t \right) x\left( t \right) - 2 y\left( t \right) yˍt\left( t \right) \\ 0 =& - 2 xˍtt\left( t \right) x\left( t \right) - 2 \left( xˍt\left( t \right) \right)^{2} - 2 \left( yˍt\left( t \right) \right)^{2} - 2 y\left( t \right) \left( - g + y\left( t \right) \lambda\left( t \right) \right) \\ 0 =& 1 - x\left( t \right) \\ 0 =& - xˍtt\left( t \right) + x\left( t \right) \lambda\left( t \right) \end{align} \]

equations(isys)

\[ \begin{align} 0 =& 1 - \left( x\left( t \right) \right)^{2} - \left( y\left( t \right) \right)^{2} \\ 0 =& - 2 xˍt\left( t \right) x\left( t \right) - 2 y\left( t \right) yˍt\left( t \right) \\ 0 =& - 2 xˍtt\left( t \right) x\left( t \right) - 2 \left( xˍt\left( t \right) \right)^{2} - 2 \left( yˍt\left( t \right) \right)^{2} - 2 y\left( t \right) \left( - g + y\left( t \right) \lambda\left( t \right) \right) \\ 0 =& 1 - x\left( t \right) \\ 0 =& - xˍtt\left( t \right) + x\left( t \right) \lambda\left( t \right) \end{align} \]

unknowns(isys)
3-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
 x(t)
 xˍt(t)
 xˍtt(t)
observed(isys)

\[ \begin{align} y\left( t \right) =& 0 \\ yˍt\left( t \right) =& 2 \\ \lambda\left( t \right) =& 1 \end{align} \]

After simplification we see that we have 5 equatinos to solve with 3 variables, and the system that is given is not solvable.

Numerical Isolation: InitializationProblem

To inspect the numerics of the initialization problem, we can use the InitializationProblem constructor which acts just like an ODEProblem or NonlinearProblem constructor, but creates the special initialization system for a given sys. This is done as follows:

iprob = ModelingToolkit.InitializationProblem(pend, 0.0,
    [x => 1, y => 0.0, D(y) => 2.0, λ => 1], [g => 1], guesses = [λ => 1])
NonlinearLeastSquaresProblem with uType Vector{Float64}. In-place: true
u0: 3-element Vector{Float64}:
 1.0
 0.0
 0.0

We can see that because the system is overdetermined we receive a NonlinearLeastSquaresProblem, solvable by NonlinearSolve.jl. Using NonlinearSolve we can recreate the initialization solve directly:

using NonlinearSolve
sol = solve(iprob)
retcode: MaxIters
u: 3-element Vector{Float64}:
  1.4047352827081332
  0.0
 -2.3483631700017935
Note

For more information on solving NonlinearProblems and NonlinearLeastSquaresProblems, check out the NonlinearSolve.jl tutorials!.

We can see that the default solver stalls

sol.stats
NonlinearSolve.ImmutableNLStats
Number of function evaluations:                    390
Number of Jacobians created:                       390
Number of factorizations:                          1000
Number of linear solves:                           2000
Number of nonlinear solver iterations:             1000

after doing many iterations, showing that it tried to compute but could not find a valid solution. Trying other solvers:

sol = solve(iprob, GaussNewton())
retcode: MaxIters
u: 3-element Vector{Float64}:
  1.2497048341002366
  0.0
 -2.618161342987903

gives the same issue, indicating that the chosen initialization system is unsatisfiable. We can check the residuals:

sol.resid
5-element Vector{Float64}:
 -0.5617621723734998
 -0.0
 -1.4561422264272998
 -0.24970483410023658
  3.86786617708814

to see the problem is not equation 2 but other equations in the system. Meanwhile, changing some of the conditions:

iprob = ModelingToolkit.InitializationProblem(pend, 0.0,
    [x => 1, y => 0.0, D(y) => 0.0, λ => 0], [g => 1], guesses = [λ => 1])
NonlinearLeastSquaresProblem with uType Vector{Float64}. In-place: true
u0: 3-element Vector{Float64}:
 1.0
 0.0
 0.0

gives a NonlinearLeastSquaresProblem which can be solved:

sol = solve(iprob)
retcode: Success
u: 3-element Vector{Float64}:
 1.0
 0.0
 0.0
sol.resid
5-element Vector{Float64}:
  0.0
 -0.0
  0.0
  0.0
  0.0

In comparison, if we have a well-conditioned system:

iprob = ModelingToolkit.InitializationProblem(pend, 0.0,
    [x => 1, y => 0.0], [g => 1], guesses = [λ => 1])
NonlinearProblem with uType Vector{Float64}. In-place: true
u0: 4-element Vector{Float64}:
 1.0
 0.0
 0.0
 1.0

notice that we instead obtained a NonlinearSystem. In this case we have to use different solvers which can take advantage of the fact that the Jacobian is square.

sol = solve(iprob)
retcode: Stalled
u: 4-element Vector{Float64}:
 1.0
 0.0
 0.0
 1.0
sol = solve(iprob, TrustRegion())
retcode: Stalled
u: 4-element Vector{Float64}:
 1.0
 0.0
 0.0
 1.0

More Features of the Initialization System: Steady-State and Observable Initialization

Let's take a Lotka-Volterra system:

@variables x(t) y(t) z(t)
@parameters α=1.5 β=1.0 γ=3.0 δ=1.0

eqs = [D(x) ~ α * x - β * x * y
       D(y) ~ -γ * y + δ * x * y
       z ~ x + y]

@named sys = ODESystem(eqs, t)
simpsys = structural_simplify(sys)
tspan = (0.0, 10.0)
(0.0, 10.0)

Using the derivative initializations, we can set the ODE to start at the steady state by initializing the derivatives to zero:

prob = ODEProblem(simpsys, [D(x) => 0.0, D(y) => 0.0], tspan, guesses = [x => 1, y => 1])
sol = solve(prob, Tsit5(), abstol = 1e-16)
retcode: Success
Interpolation: specialized 4th order "free" interpolation
t: 20-element Vector{Float64}:
  0.0
  0.1201147449058721
  0.3033677012595945
  0.5226445376056614
  0.8096545291701382
  1.1567042588607341
  1.5820106601878692
  2.0622600122778
  2.587349981311638
  3.151568202355532
  3.7515824198060446
  4.383799169018585
  5.0448096404806435
  5.731241885792306
  6.439957041279295
  7.168062853507668
  7.913012158560047
  8.672505206322072
  9.444540328352346
 10.0
u: 20-element Vector{Vector{Float64}}:
 [-1.1634600742357695e-14, -6.066200752526776e-15]
 [-1.3931543678584736e-14, -4.230789503183478e-15]
 [-1.833909332738822e-14, -2.4415588380621904e-15]
 [-2.5481437876937855e-14, -1.2646960643646614e-15]
 [-3.919170251296605e-14, -5.347222159111305e-16]
 [-6.595936396370302e-14, -1.8893680940222076e-16]
 [-1.248352795238849e-13, -5.295919439039544e-17]
 [-2.565597058329055e-13, -1.2673065184685261e-17]
 [-5.63959400555243e-13, -2.6815076764227088e-18]
 [-1.3145988817119287e-12, -5.136290018620361e-19]
 [-3.2333503713724103e-12, -9.072919654030204e-20]
 [-8.34611501499083e-12, -1.507789875079988e-20]
 [-2.2493989566270448e-11, -2.402949167872516e-21]
 [-6.297988493310315e-11, -3.736693077018153e-22]
 [-1.8232462337351928e-10, -5.751602131358583e-23]
 [-5.433912549156384e-10, -8.859376342565196e-24]
 [-1.660904274132437e-9, -1.3762280318545326e-24]
 [-5.1885183248563635e-9, -2.1667597177797248e-25]
 [-1.6516018467495274e-8, -3.467283732272943e-26]
 [-3.7996716090602424e-8, -6.785326728891389e-27]

Notice that this is a "numerical zero", not an exact zero, and thus the solution will leave the steady state in this instance because it's an unstable steady state.

Additionally, notice that in this setup we have an observable z ~ x + y. If we instead know the initial condition for the observable we can use that directly:

prob = ODEProblem(simpsys, [D(x) => 0.0, z => 2.0], tspan, guesses = [x => 1, y => 1])
sol = solve(prob, Tsit5())
retcode: Success
Interpolation: specialized 4th order "free" interpolation
t: 40-element Vector{Float64}:
  0.0
  0.07205457770130536
  0.19555442269989384
  0.34017814505280075
  0.5197315103056518
  0.7296409259418426
  0.9859475408467189
  1.3125687678101077
  1.6001372056226688
  1.9779707269850744
  ⋮
  7.638683900582808
  7.922192252177743
  8.211758012199246
  8.566118746236766
  8.911012923620799
  9.277839507710738
  9.722952509888755
  9.995366972228988
 10.0
u: 40-element Vector{Vector{Float64}}:
 [0.5, 1.5]
 [0.5046075911240547, 1.252874833326255]
 [0.5315611606386604, 0.9219106789320841]
 [0.590222611475832, 0.6476528764221151]
 [0.7028731294549116, 0.4241579141170538]
 [0.8969640418246153, 0.26693506789552807]
 [1.2488272566229397, 0.16238847150562624]
 [1.9552798314187778, 0.10189943055027023]
 [2.931616153894773, 0.08599023296027664]
 [4.983887223177531, 0.11950411906957735]
 ⋮
 [0.5449166247077523, 0.8423728527272625]
 [0.701342428151023, 0.4282597751625187]
 [0.9881818042209071, 0.22863436777692955]
 [1.5844147971140083, 0.12337299769259767]
 [2.5660310409135683, 0.08841670433558013]
 [4.304643693656804, 0.10096494793121592]
 [7.736269630802351, 0.3691622780681822]
 [9.214156892625953, 1.7501409322891812]
 [9.202400785713033, 1.8012129176885996]

We can check that indeed the solution does satisfy that D(x) = 0 at the start:

sol[α * x - β * x * y]
40-element Vector{Float64}:
  0.0
  0.1247012350613691
  0.3072898304596765
  0.5030745451620529
  0.7561804937038463
  1.106014905332622
  1.6704457355568458
  2.7336778457401647
  4.145333874818638
  6.880235782618343
  ⋮
  0.358351965408048
  0.7516568916346424
  1.256340384274648
  2.1811481925625484
  3.6221665535099765
  6.022347414092711
  8.748465525546836
 -2.304837795380241
 -2.7718819904044842
plot(sol)
Example block output