Symbolic Regression of a Linear Time Continuous System
Symbolic regression uses regularized evolution, simulated annealing, and gradient-free optimization to find suitable equations. Hence, performance may vary and depends strongly on the hyperparameters of the optimization. This example might not recover the ground truth, but it showcases the usage within DataDrivenDiffEq.jl.
DataDrivenDiffEq offers an interface to SymbolicRegression.jl to infer more complex functions.
using DataDrivenDiffEq
using LinearAlgebra
using OrdinaryDiffEq
using DataDrivenSR
using Plots
A = [-0.9 0.2; 0.0 -0.5]
B = [0.0; 1.0]
u0 = [10.0; -10.0]
tspan = (0.0, 20.0)
f(u, p, t) = A * u .+ B .* sin(0.5 * t)
sys = ODEProblem(f, u0, tspan)
sol = solve(sys, Tsit5(), saveat = 0.01);
We will use the data provided by our problem, but add the control signal U = sin(0.5*t) to it. Instead of passing the control as a function, as in other examples, we supply it here as an array of evaluated values (a variant using a function is sketched after the problem definition below).
X = Array(sol)
t = sol.t
U = permutedims(sin.(0.5 * t))
prob = ContinuousDataDrivenProblem(X, t, U = U)
Continuous DataDrivenProblem{Float64} ##DDProblem#393 in 2 dimensions and 2001 samples with controls
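As noted above, the control could also be supplied as a callable instead of a precomputed array. The following is a minimal sketch, assuming the constructor accepts a function of (u, p, t) for the keyword U:
# Sketch: pass the control input as a function instead of an array.
# control_signal is a hypothetical name; it returns the control as a vector.
control_signal(u, p, t) = [sin(0.5 * t)]
prob_fun = ContinuousDataDrivenProblem(X, t, U = control_signal)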
And plot the problem's data.
plot(prob)
To solve our problem, we will use EQSearch, which provides a wrapper for the symbolic regression interface. We will stick to simple operations, use an L1DistLoss, and limit the verbosity of the algorithm.
eqsearch_options = SymbolicRegression.Options(binary_operators = [+, *],
                                              loss = L1DistLoss(),
                                              verbosity = -1, progress = false, npop = 30,
                                              timeout_in_seconds = 60.0)
alg = EQSearch(eq_options = eqsearch_options)
EQSearch
weights: Nothing nothing
numprocs: Nothing nothing
procs: Nothing nothing
addprocs_function: Nothing nothing
parallelism: Symbol serial
runtests: Bool true
eq_options: Options{SymbolicRegression.CoreModule.OptionsStructModule.ComplexityMapping{Int64, Int64}, OperatorEnum, Node, Expression, @NamedTuple{}, MutationWeights, false, false, nothing, Nothing, 5}
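The printout above also lists fields such as parallelism and numprocs. As a hedged sketch, assuming EQSearch accepts these fields as keyword arguments, the search could be run multithreaded instead of serially:
# Sketch: request multithreaded execution of the equation search.
# The parallelism keyword is assumed based on the field shown above.
alg_mt = EQSearch(eq_options = eqsearch_options, parallelism = :multithreading)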
Again, we solve the problem to obtain a DataDrivenSolution. Note that any additional keyword arguments are passed on to SymbolicRegression's EquationSearch, with the exception of niterations, which is controlled via maxiters.
res = solve(prob, alg, options = DataDrivenCommonOptions(maxiters = 100))
println(res)
"DataDrivenSolution{Float64}" with 2 equations and 0 parameters.
Returncode: Success
Residual sum of squares: 263685.7920194242
We can inspect the system's metrics, here the loglikelihood of the result.
loglikelihood(res)
-8380.115916799206
Currently, the parameters of the result found by EQSearch are not turned into symbolic parameters. This affects some functions like dof, aicc, and bic.
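Metrics that depend only on the residuals remain informative. As a sketch, assuming the StatsBase-style accessor rss is defined for the solution (the printed summary already reports the residual sum of squares):
# Sketch: residual-based metrics do not rely on symbolic parameters.
rss(res)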
system = get_basis(res)
Model ##Basis#400 with 2 equations
States : x₁ x₂
Independent variable: t
Equations
Differential(t)(x₁) = -u₁ - x₁ + x₂
Differential(t)(x₂) = 1.0 + u₁ + x₁ - x₂ + x₁*(-1.0 + x₂)
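To get a visual impression of the fit, the recovered result can be compared against the data. This sketch assumes a plot recipe exists for the solution, analogous to the one used for the problem above:
# Sketch: plot the problem data next to the fitted result side by side.
plot(plot(prob), plot(res), layout = (1, 2))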
Copy-Pasteable Code
using DataDrivenDiffEq
using LinearAlgebra
using OrdinaryDiffEq
using DataDrivenSR
A = [-0.9 0.2; 0.0 -0.5]
B = [0.0; 1.0]
u0 = [10.0; -10.0]
tspan = (0.0, 20.0)
f(u, p, t) = A * u .+ B .* sin(0.5 * t)
sys = ODEProblem(f, u0, tspan)
sol = solve(sys, Tsit5(), saveat = 0.01);
X = Array(sol)
t = sol.t
U = permutedims(sin.(0.5 * t))
prob = ContinuousDataDrivenProblem(X, t, U = U)
eqsearch_options = SymbolicRegression.Options(binary_operators = [+, *],
                                              loss = L1DistLoss(),
                                              verbosity = -1, progress = false, npop = 30,
                                              timeout_in_seconds = 60.0)
alg = EQSearch(eq_options = eqsearch_options)
res = solve(prob, alg, options = DataDrivenCommonOptions(maxiters = 100))
loglikelihood(res)
system = get_basis(res)
# This file was generated using Literate.jl, https://github.com/fredrikekre/Literate.jl
This page was generated using Literate.jl.