Next Generation Reservoir Computing
This tutorial shows how to use next-generation reservoir computing (NGRC) in ReservoirComputing.jl to model the chaotic Lorenz system.
NGRC works differently from traditional reservoir computing: the reservoir is replaced with:
- A delay embedding of the input
- A nonlinear feature map
The model is then trained with ridge regression, just like a standard RC.
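To make this structure concrete, here is a minimal hand-rolled sketch of the NGRC idea on a generic data matrix, assuming a single delay and quadratic monomials. The names (build_features, ridge_readout, X) are hypothetical and only illustrate the mechanics that ReservoirComputing.jl wraps for us below.
using LinearAlgebra
# Hand-rolled NGRC sketch (illustrative only; the tutorial uses the library API).
# X is a d × T data matrix; we build features from [x_t; x_{t-1}] and fit a
# linear readout with ridge regression to predict the next-step increment.
function build_features(xt, xprev)
    z = vcat(xt, xprev)                                   # delay embedding
    quad = [z[i] * z[j] for i in eachindex(z) for j in i:length(z)]
    return vcat(one(eltype(z)), z, quad)                  # constant + linear + quadratic
end
function ridge_readout(X, lambda)
    d, T = size(X)
    Phi = reduce(hcat, build_features(X[:, t], X[:, t - 1]) for t in 2:(T - 1))
    Y = X[:, 3:T] .- X[:, 2:(T - 1)]                      # one-step increments
    return Y * Phi' / (Phi * Phi' + lambda * I)           # ridge solution for W_out
end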
In this tutorial we will:
- simulate the Lorenz system,
- build an NGRC model with delayed inputs and polynomial features, following the original paper,
- train it on one-step increments,
- roll it out generatively and compare with the true trajectory.
1. Setup and imports
First we need to load the necessary packages. We are going to use the following:
using OrdinaryDiffEq
using Random
using ReservoirComputing
using Plots
using Statistics
2. Define Lorenz system and generate data
We define the Lorenz system and integrate it to generate a long trajectory:
function lorenz!(du, u, p, t)
    σ, ρ, β = p
    du[1] = σ * (u[2] - u[1])
    du[2] = u[1] * (ρ - u[3]) - u[2]
    du[3] = u[1] * u[2] - β * u[3]
end
prob = ODEProblem(
    lorenz!,
    Float32[1.0, 0.0, 0.0],
    (0.0, 200.0),
    (10.0f0, 28.0f0, 8.0f0 / 3.0f0),
)
data = Array(solve(prob, ABM54(); dt = 0.025)) # size: (3, T)
3×8001 Matrix{Float32}:
 1.0  0.853731    0.870095   1.01244   …   0.773916  0.851587   0.970214
 0.0  0.628894    1.20067    1.81407       1.00348   1.24317    1.5298
 0.0  0.00700958  0.0254547  0.057893     15.2662   14.3036    13.4116
We then split the time series into training and testing segments:
shift = 300
train_len = 500
predict_len = 900
input_data = data[:, shift:(shift + train_len - 1)]
target_data = data[:, (shift + 1):(shift + train_len)]
test_data = data[:, (shift + train_len):(shift + train_len + predict_len - 1)]
3×900 Matrix{Float32}:
 -2.80319  -3.30617  -3.805    -4.33585  …  -4.03219  -3.6886   -3.50096
 -4.87985  -5.28285  -5.84278  -6.56417     -2.32134  -2.63856  -3.03897
 21.6032   20.5856   19.7363   19.0738      24.739    23.3732   22.1111
3. Normalization
It is good practice to normalize the data, especially for polynomial features:
in_mean = mean(input_data; dims = 2)
in_std = std(input_data; dims = 2)
train_norm_x = (input_data .- in_mean) ./ in_std
train_norm_y = (target_data .- in_mean) ./ in_std
test_norm_x = (test_data .- in_mean) ./ in_std
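As a quick optional sanity check (a sketch using only the Statistics functions already imported), the normalized training inputs should now have per-variable mean close to 0 and standard deviation close to 1:
# Each row of train_norm_x should have mean ≈ 0 and std ≈ 1 after z-scoring.
mean(train_norm_x; dims = 2), std(train_norm_x; dims = 2)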
# We train an increment (residual) model: Δy = y_{t+1} − y_t
train_delta_y = train_norm_y .- train_norm_x
3×500 Matrix{Float32}:
  0.00105339  -0.0136562  -0.0277029  …  -0.129007    -0.102032   -0.0856874
 -0.0517914   -0.0628898  -0.0724724      0.00232842  -0.0147804  -0.0343755
 -0.0859328   -0.0697899  -0.0504105     -0.220857    -0.194563   -0.170794
4. Build the NGRC model
Now that we have the data we can start building the model. Following the approach of the paper we first define two feature functions:
- a constant feature
- the polynomial monomials of degrees 1 and 2
const_feature = x -> Float32[1.0]
poly_feature = x -> polynomial_monomials(x; degrees = 1:2)
Finally, we can construct the NGRC model.
We set the following:
in_dims = 3
out_dims = 3
num_delays = 1
With in_dims = 3 and num_delays = 1 the delayed input vector has length 6. The polynomial monomials of degrees 1 and 2 contribute these 6 linear terms plus 21 quadratic terms, and the constant feature adds 1 more. In total we have 28 features.
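As a quick way to double-check this count, the arithmetic can be reproduced directly (a sketch; it assumes the delay layer stacks the current input with num_delays past copies, which matches the length of 6 quoted above):
d = in_dims * (num_delays + 1)      # 3 * 2 = 6 entries in the delayed input
n_deg1 = d                          # 6 degree-1 monomials
n_deg2 = binomial(d + 1, 2)         # 21 degree-2 monomials (with repetition)
n_features = 1 + n_deg1 + n_deg2    # constant + 6 + 21 = 28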
We can pass the number of features to ro_dims to initialize the LinearReadout with the correct dimensions. However, unless one plans to dry run the model without training, the train! function will take care of adjusting the dimensions.
Now we build the NGRC:
rng = MersenneTwister(0)
ngrc = NGRC(in_dims, out_dims;
    num_delays = num_delays, stride = 1,
    features = (const_feature, poly_feature),
    include_input = false, # we already encode everything in the features
    ro_dims = 28,
    readout_activation = identity)
ps, st = setup(rng, ngrc)
((reservoir = NamedTuple(), states_modifiers = (NamedTuple(),), readout = (weight = Float32[0.40952146 0.18945241 … 0.49384677 0.2657143; 0.9514209 0.5826452 … 0.36021674 0.11498523; 0.9727092 0.9875822 … 0.7525219 0.34456646],)), (reservoir = (history = nothing, clock = 0, rng = Random.MersenneTwister(0, (0, 1002, 0, 42))), states_modifiers = (NamedTuple(),), readout = NamedTuple()))
At this point, ngrc is a fully specified model with:
- a DelayLayer that builds a 6-dimensional delayed vector from the 3D input,
- a NonlinearFeaturesLayer that maps that vector to 28 polynomial features,
- a LinearReadout(28 => 3).
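It can also help to peek at what setup returned (a quick inspection based on the printed output above): ps holds only the readout weights, since the delay and feature layers have no trainable parameters, while st carries the delay history and a step counter.
keys(ps)                  # (:reservoir, :states_modifiers, :readout)
size(ps.readout.weight)   # (3, 28) expected — randomly initialized until we train
st.reservoir.history      # nothing: no inputs have been seen yet
st.reservoir.clock        # 0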
5. Training the NGRC readout
We now train the linear readout using ridge regression on the increment train_delta_y:
ps, st = train!(ngrc, train_norm_x, train_delta_y, ps, st;
    train_method = StandardRidge(2.5e-6))
((reservoir = NamedTuple(), states_modifiers = (NamedTuple(),), readout = (weight = [0.01256861507860827 -0.3130767720322529 … 0.01673971106242916 0.00017023951877271227; 0.11887351736169234 -1.0723469973814994 … 0.1915114531838363 -0.035128558380638705; -0.09375234415761531 3.3096494712679454 … 0.03954448961534765 0.10612467302178624],)), (reservoir = (history = Float32[0.4247307; 0.044810098; -0.3092905;;], clock = 500, rng = Random.MersenneTwister(0, (0, 1002, 0, 42))), states_modifiers = (NamedTuple(),), readout = NamedTuple()))
where StandardRidge is the ridge regression provided natively by ReservoirComputing.jl.
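Training replaces the random readout with the ridge-regression solution and also advances the internal state, as a quick inspection shows (based on the printed output above):
size(ps.readout.weight)   # still (3, 28) expected, now the ridge solution
st.reservoir.clock        # 500 — the model has consumed the training sequence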
6. Generative prediction
We now perform generative prediction: at each step the trained NGRC predicts the increment for the current normalized state, we add it to obtain the next state, and that state is fed back as the next input. Rolling this forward for predict_len steps yields the predicted time series:
single_step = copy(test_norm_x[:, 1]) # normalized initial condition
traj_norm = similar(test_norm_x, 3, predict_len)
for step in 1:predict_len
    global st
    delta_step, st = ngrc(single_step, ps, st)
    single_step .= single_step .+ delta_step # increment update in normalized space
    traj_norm[:, step] .= single_step
end
Finally, we unscale back to the original coordinates:
traj = traj_norm .* in_std .+ in_mean # size: (3, predict_len)
3×900 Matrix{Float32}:
 -3.30616  -3.80449  -4.33391  -4.92376  …   7.20849   4.8467    2.90777
 -5.28191  -5.83858  -6.55609  -7.43775     -2.96956  -3.78466  -3.99589
 20.5876   19.7407   19.0804   18.637       35.4879   32.7212   30.2604
7. Visualization
We can now compare the predicted trajectory with the true Lorenz data on the test segment:
plot(transpose(test_data)[:, 1], transpose(test_data)[:, 2], transpose(test_data)[:, 3]; label="actual");
plot!(transpose(traj)[:, 1], transpose(traj)[:, 2], transpose(traj)[:, 3]; label="predicted")
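Beyond the visual comparison, a simple quantitative view is the per-step error between the predicted and true trajectories; the snippet below is a sketch using only packages already loaded in this tutorial.
# Euclidean error per prediction step, measured in normalized (z-scored) units.
err = vec(sqrt.(sum(abs2, (traj .- test_data) ./ in_std; dims = 1)))
plot(err; xlabel = "prediction step", ylabel = "normalized error", label = "")
For a chaotic system like Lorenz, the error is expected to stay small for a number of steps and then grow as the predicted and true trajectories diverge.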