Lorenz System Forecasting

This example expands on the Lorenz system forecasting example from the README to showcase how to use the models and methods provided by the library for Echo State Networks.

Generating the data

Starting off the workflow, the first step is to obtain the data. We use OrdinaryDiffEq to generate the Lorenz system data by solving the system of ODEs. The data is passed to the model as a matrix, where the columns represent the time steps.

using OrdinaryDiffEq

#define lorenz system
function lorenz!(du, u, p, t)
    du[1] = 10.0 * (u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end

#solve and take data
prob = ODEProblem(lorenz!, [1.0, 0.0, 0.0], (0.0, 200.0))
data = solve(prob, ABM54(); dt=0.02)
data = reduce(hcat, data.u)
3×10001 Matrix{Float64}:
 1.0  0.868014    0.846922   0.912746   …  -5.41248   -4.48696   -3.72246
 0.0  0.511611    0.972162   1.43659       -0.380216  -0.265298  -0.291495
 0.0  0.00464843  0.0167238  0.0363891     29.841     28.3209    26.8714

Now we split the data into training and testing sets. For an autoregressive forecast we want the model to be trained to predict the next step, so we shift the target data by one. Additionally, we discard the initial transient period.

#determine shift length, training length and prediction length
shift = 300
train_len = 5000
predict_len = 1250

#split the data accordingly
input_data = data[:, shift:(shift + train_len - 1)]
target_data = data[:, (shift + 1):(shift + train_len)]
test_data = data[:, (shift + train_len + 1):(shift + train_len + predict_len)]
3×1250 Matrix{Float64}:
  4.2235    3.83533   3.55846   3.38262  …   9.44212  10.0034  10.5085
  1.98782   2.18137   2.43519   2.7396      12.3382   12.6956  12.8396
 25.6556   24.4858   23.3794   22.3394      24.3255   25.4374  26.6729

It is important to note that the data needs to be formatted as a matrix with the features as rows and the time steps as columns, as in this example. This is required even if the time series consists of single values.
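For instance, a univariate series can be reshaped into a 1×N matrix before being passed to the model (a minimal sketch; the series below is only illustrative):

#illustrative univariate series
single_series = sin.(0.0:0.02:10.0)
#reshape into a 1×N matrix: one feature row, one column per time step
single_series = reshape(single_series, 1, :)
size(single_series) #(1, 501)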

Building the Echo State Network

Once the data is ready, it is possible to define the parameters for the ESN and the ESN struct itself. In this example, the values from (Pathak et al., 2017) are loosely followed as general guidelines.

using ReservoirComputing

#define ESN parameters
res_size = 300
in_size = 3
res_radius = 1.2
res_sparsity = 6 / 300
input_scaling = 0.1

#build ESN struct
esn = ESN(in_size, res_size, in_size; #autoregressive so in_size = out_size
    init_reservoir = rand_sparse(; radius = res_radius, sparsity = res_sparsity),
    init_input = weighted_init(; scaling = input_scaling),
    state_modifiers = NLAT2
)
ESN(
    reservoir = StatefulLayer(ESNCell(3 => 300, use_bias=false)),
    state_modifiers = (WrappedFunction(NLAT2)),
    readout = LinearReadout(300 => 3, use_bias=false, include_collect=true)
)

In this case, a size of 300 has been chosen, so the reservoir matrix will be 300 × 300. However, this is not always the case, since some input layer constructions can modify the dimensions of the reservoir. Please make sure to read the API documentation of the initializer you intend to use if you suspect this is the cause of errors.

The res_radius determines the scaling of the spectral radius of the reservoir matrix; a proper scaling is necessary to ensure the Echo State Property. The default value in the rand_sparse method is 1.0, in accordance with the most commonly followed guidelines found in the literature (see (Lukoševičius, 2012) and references therein).
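To see the effect of this keyword, one can build a reservoir matrix directly with rand_sparse and inspect its spectral radius. A minimal sketch, assuming the initializer follows the usual rng-then-dimensions calling convention:

using LinearAlgebra, Random

#build a 300×300 reservoir matrix and check its spectral radius
W = rand_sparse(MersenneTwister(17), res_size, res_size;
    radius=res_radius, sparsity=res_sparsity)
maximum(abs, eigvals(Matrix(W))) #expected to be close to res_radius = 1.2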

The value of input_scaling determines the upper and lower bounds of the uniform distribution of the weights in weighted_init; the value of 0.1 used here is also the default. The default input layer is scaled_rand, a dense matrix. The details of the weighted version can be found in (Lu et al., 2017); for this example, this version returns the best results.
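The bounds induced by input_scaling can be checked in the same way by building the input matrix directly (again a sketch under the same calling-convention assumption; the input matrix has res_size rows and in_size columns):

#build the 300×3 input matrix and inspect the range of its entries
Win = weighted_init(MersenneTwister(17), res_size, in_size; scaling=input_scaling)
extrema(Win) #entries fall within (-0.1, 0.1)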

Training and Prediction

Training an ESN usually means solving a linear regression problem. The library supports solvers from MLJLinearModels.jl, in addition to a custom implementation of ridge regression. In this example we will use the latter.
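Conceptually, the ridge readout solves a regularized least-squares problem: given the collected reservoir states X (one column per time step) and the targets Y, the output weights are W_out = Y Xᵀ (X Xᵀ + λI)⁻¹. A minimal standalone sketch of this computation, not the library's internal implementation:

using LinearAlgebra

#ridge regression readout: X are the states (features × steps), Y the targets
ridge_readout(X, Y, λ) = (Y * X') / (X * X' + λ * I)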

Since ReservoirComputing.jl builds on LuxCore.jl, we first need to set up the states and parameters:

using Random
Random.seed!(42)
rng = MersenneTwister(17)

ps, st = setup(rng, esn)
((reservoir = (input_matrix = Float32[0.07099056 0.0 0.0; 0.044342138 0.0 0.0; … ; 0.0 0.0 0.04174912; 0.0 0.0 -0.017985344], reservoir_matrix = Float32[0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0]), states_modifiers = (NamedTuple(),), readout = (weight = Float32[0.14745247 0.31642652 … 0.79039073 0.70270693; 0.50063944 0.76482 … 0.04451108 0.991169; 0.54582596 0.87618923 … 0.6872307 0.841046],)), (reservoir = (cell = (rng = Random.MersenneTwister(17, (0, 222892, 221442, 206)),), carry = nothing), states_modifiers = (NamedTuple(),), readout = NamedTuple()))

Now we can proceed with training the ESN model. Usually an initial transient is discarded to allow the dynamics of the ESN to settle. This can be done by passing the washout keyword argument to train!, as shown further below.

#define training method
training_method = StandardRidge(0.0)

ps, st = train!(esn, input_data, target_data, ps, st, training_method)
((reservoir = (input_matrix = Float32[0.07099056 0.0 0.0; 0.044342138 0.0 0.0; … ; 0.0 0.0 0.04174912; 0.0 0.0 -0.017985344], reservoir_matrix = Float32[0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0]), states_modifiers = (NamedTuple(),), readout = (weight = [1.248736621278714 0.3608088980889189 … -4.157544427654357 -1.0415952644203579; -5.331389418381854 1.1506756403163947 … -3.5291204282706916 2.4630478845661146; -2.992302934735261 -5.052152248931234 … -28.267005934702695 -7.0141298550094975],)), (reservoir = (cell = (rng = Random.MersenneTwister(17, (0, 222892, 221442, 511)),), carry = ([0.8277185641321425; 0.41136255864703053; … ; 0.95331859880098; -0.7840850804516452;;],)), states_modifiers = (NamedTuple(),), readout = NamedTuple()))

ps now contains the trained parameters for the ESN.
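As mentioned above, an initial transient of the collected states can be discarded by passing the washout keyword argument; the call has the same form as before, with one extra keyword (the value 100 here is only illustrative):

ps, st = train!(esn, input_data, target_data, ps, st, training_method;
    washout=100)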

Returning training states

The ESN states are used internally during training; however, they are not returned by default. To inspect the states, set the boolean keyword argument return_states to true in the train! call.
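A hedged sketch of such a call (the exact structure of the returned values should be checked against the API documentation):

res = train!(esn, input_data, target_data, ps, st, training_method;
    return_states=true)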

ReservoirComputing.jl provides additional utility functions for autoregressive forecasting:

output, st = predict(esn, predict_len, ps, st; initialdata=test_data[:, 1])
([3.961247248312024 3.674253796184255 … 5.148664444430722 4.98447074623876; 2.5445960824137828 3.0314970938102324 … 4.101082429068434 4.387860521730957; 24.290769758470894 23.162757704157237 … 24.768017355307535 23.899297598634565], (reservoir = (cell = (rng = Random.MersenneTwister(17, (0, 222892, 221442, 511)),), carry = ([0.7948723956509747; 0.3459219805126575; … ; 0.9353124759158002; -0.7474323152875876;;],)), states_modifiers = (NamedTuple(),), readout = NamedTuple()))

To inspect the results, they can easily be plotted using an external library. In this case, we will use Plots.jl:

using Plots, Plots.PlotMeasures

ts = 0.0:0.02:200.0
lorenz_maxlyap = 0.9056
predict_ts = ts[(shift + train_len + 1):(shift + train_len + predict_len)]
lyap_time = (predict_ts .- predict_ts[1]) * (1 / lorenz_maxlyap)

p1 = plot(lyap_time, [test_data[1, :] output[1, :]]; label=["actual" "predicted"],
    ylabel="x(t)", linewidth=2.5, xticks=false, yticks=-15:15:15);
p2 = plot(lyap_time, [test_data[2, :] output[2, :]]; label=["actual" "predicted"],
    ylabel="y(t)", linewidth=2.5, xticks=false, yticks=-20:20:20);
p3 = plot(lyap_time, [test_data[3, :] output[3, :]]; label=["actual" "predicted"],
    ylabel="z(t)", linewidth=2.5, xlabel="max(λ)*t", yticks=10:15:40);

plot(p1, p2, p3; plot_title="Lorenz System Coordinates",
    layout=(3, 1), xtickfontsize=12, ytickfontsize=12, xguidefontsize=15,
    yguidefontsize=15,
    legendfontsize=12, titlefontsize=20)
[Plot: Lorenz System Coordinates, actual vs. predicted x(t), y(t), and z(t) over Lyapunov time]

References