Next Generation Reservoir Computing
This tutorial shows how to use next-generation reservoir computing (NGRC) in ReservoirComputing.jl to model the chaotic Lorenz system.
NGRC works differently from traditional reservoir computing: the reservoir is replaced with:
- A delay embedding of the input
- A nonlinear feature map
The model is then trained with ridge regression, just like a standard RC.
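To make this structure concrete, here is a minimal hand-rolled sketch of the NGRC idea on a generic data matrix, assuming a single delay and quadratic monomials. The names (build_features, ridge_readout, X) are hypothetical and only illustrate the mechanics that ReservoirComputing.jl wraps for us below.
using LinearAlgebra
# Hand-rolled NGRC sketch (illustrative only; the tutorial uses the library API).
# X is a d × T data matrix; we build features from [x_t; x_{t-1}] and fit a
# linear readout with ridge regression to predict the next-step increment.
function build_features(xt, xprev)
    z = vcat(xt, xprev)                                   # delay embedding
    quad = [z[i] * z[j] for i in eachindex(z) for j in i:length(z)]
    return vcat(one(eltype(z)), z, quad)                  # constant + linear + quadratic
end
function ridge_readout(X, lambda)
    d, T = size(X)
    Phi = reduce(hcat, build_features(X[:, t], X[:, t - 1]) for t in 2:(T - 1))
    Y = X[:, 3:T] .- X[:, 2:(T - 1)]                      # one-step increments
    return Y * Phi' / (Phi * Phi' + lambda * I)           # ridge solution for W_out
end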
In this tutorial we will:
- simulate the Lorenz system,
- build an NGRC model with delayed inputs and polynomial features, following the original paper,
- train it on one-step increments,
- roll it out generatively and compare with the true trajectory.
1. Setup and imports
First we need to load the necessary packages. We are going to use the following:
using OrdinaryDiffEq
using Random
using ReservoirComputing
using Plots
using Statistics
2. Define Lorenz system and generate data
We define the Lorenz system and integrate it to generate a long trajectory:
function lorenz!(du, u, p, t)
    σ, ρ, β = p
    du[1] = σ * (u[2] - u[1])
    du[2] = u[1] * (ρ - u[3]) - u[2]
    du[3] = u[1] * u[2] - β * u[3]
end
prob = ODEProblem(
    lorenz!,
    Float32[1.0, 0.0, 0.0],
    (0.0, 200.0),
    (10.0f0, 28.0f0, 8.0f0 / 3.0f0),
)
data = Array(solve(prob, ABM54(); dt = 0.025)) # size: (3, T)
3×8001 Matrix{Float32}:
 1.0  0.853731    0.870095   1.01244   …   0.773916  0.851587   0.970214
 0.0  0.628894    1.20067    1.81407       1.00348   1.24317    1.5298
 0.0  0.00700958  0.0254547  0.057893     15.2662   14.3036    13.4116
We then split the time series into training and testing segments:
shift = 300
train_len = 500
predict_len = 900
input_data = data[:, shift:(shift + train_len - 1)]
target_data = data[:, (shift + 1):(shift + train_len)]
test_data = data[:, (shift + train_len):(shift + train_len + predict_len - 1)]
3×900 Matrix{Float32}:
 -2.80319  -3.30617  -3.805    -4.33585  …  -4.03219  -3.6886   -3.50096
 -4.87985  -5.28285  -5.84278  -6.56417     -2.32134  -2.63856  -3.03897
 21.6032   20.5856   19.7363   19.0738      24.739    23.3732   22.1111
3. Normalization
It is good practice to normalize the data, especially for polynomial features:
in_mean = mean(input_data; dims = 2)
in_std = std(input_data; dims = 2)
train_norm_x = (input_data .- in_mean) ./ in_std
train_norm_y = (target_data .- in_mean) ./ in_std
test_norm_x = (test_data .- in_mean) ./ in_std
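As a quick optional sanity check (a sketch using only the Statistics functions already imported), the normalized training inputs should now have per-variable mean close to 0 and standard deviation close to 1:
# Each row of train_norm_x should have mean ≈ 0 and std ≈ 1 after z-scoring.
mean(train_norm_x; dims = 2), std(train_norm_x; dims = 2)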
# We train an increment (residual) model: Δy = y_{t+1} − y_t
train_delta_y = train_norm_y .- train_norm_x
3×500 Matrix{Float32}:
  0.00105339  -0.0136562  -0.0277029  …  -0.129007    -0.102032   -0.0856874
 -0.0517914   -0.0628898  -0.0724724      0.00232842  -0.0147804  -0.0343755
 -0.0859328   -0.0697899  -0.0504105     -0.220857    -0.194563   -0.170794
4. Build the NGRC model
Now that we have the data we can start building the model. Following the approach of the paper we first define two feature functions:
- a constant feature
- the polynomial monomials of degrees 1 and 2
const_feature = x -> Float32[1.0]
poly_feature = x -> polynomial_monomials(x; degrees = 1:2)
Finally, we can construct the NGRC model.
We set the following:
in_dims = 3
out_dims = 3
num_delays = 1
With in_dims = 3 and num_delays = 1 the delayed input vector has length 6. The polynomial monomials of degrees 1 and 2 contribute these 6 linear terms plus 21 quadratic terms, and the constant feature adds 1 more. In total we have 28 features.
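As a quick way to double-check this count, the arithmetic can be reproduced directly (a sketch; it assumes the delay layer stacks the current input with num_delays past copies, which matches the length of 6 quoted above):
d = in_dims * (num_delays + 1)      # 3 * 2 = 6 entries in the delayed input
n_deg1 = d                          # 6 degree-1 monomials
n_deg2 = binomial(d + 1, 2)         # 21 degree-2 monomials (with repetition)
n_features = 1 + n_deg1 + n_deg2    # constant + 6 + 21 = 28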
We can pass the number of features to ro_dims to initialize the LinearReadout with the correct dimensions. However, unless one plans to dry run the model without training, the train! function will take care of adjusting the dimensions.
Now we build the NGRC:
rng = MersenneTwister(0)
ngrc = NGRC(in_dims, out_dims;
    num_delays = num_delays, stride = 1,
    features = (const_feature, poly_feature),
    include_input = false, # we already encode everything in the features
    ro_dims = 28,
    readout_activation = identity)
ps, st = setup(rng, ngrc)
((reservoir = NamedTuple(), states_modifiers = (NamedTuple(),), readout = (weight = Float32[0.40952146 0.18945241 … 0.49384677 0.2657143; 0.9514209 0.5826452 … 0.36021674 0.11498523; 0.9727092 0.9875822 … 0.7525219 0.34456646],)), (reservoir = (history = nothing, clock = 0, rng = Random.MersenneTwister(0, (0, 1002, 0, 42))), states_modifiers = (NamedTuple(),), readout = NamedTuple()))
At this point, ngrc is a fully specified model with:
- a DelayLayer that builds a 6-dimensional delayed vector from the 3D input,
- a NonlinearFeaturesLayer that maps that vector to 28 polynomial features,
- a LinearReadout(28 => 3).
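It can also help to peek at what setup returned (a quick inspection based on the printed output above): ps holds only the readout weights, since the delay and feature layers have no trainable parameters, while st carries the delay history and a step counter.
keys(ps)                  # (:reservoir, :states_modifiers, :readout)
size(ps.readout.weight)   # (3, 28) expected — randomly initialized until we train
st.reservoir.history      # nothing: no inputs have been seen yet
st.reservoir.clock        # 0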
5. Training the NGRC readout
We now train the linear readout using ridge regression on the increment train_delta_y:
ps, st = train!(ngrc, train_norm_x, train_delta_y, ps, st;
    train_method = StandardRidge(2.5e-6))
((reservoir = NamedTuple(), states_modifiers = (NamedTuple(),), readout = (weight = [0.01256861507860827 -0.3130767720322529 … 0.01673971106242916 0.00017023951877271227; 0.11887351736169234 -1.0723469973814994 … 0.1915114531838363 -0.035128558380638705; -0.09375234415761531 3.3096494712679454 … 0.03954448961534765 0.10612467302178624],)), (reservoir = (history = Float32[0.4247307; 0.044810098; -0.3092905;;], clock = 500, rng = Random.MersenneTwister(0, (0, 1002, 0, 42))), states_modifiers = (NamedTuple(),), readout = NamedTuple()))
where StandardRidge is the ridge regression provided natively by ReservoirComputing.jl.
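Training replaces the random readout with the ridge-regression solution and also advances the internal state, as a quick inspection shows (based on the printed output above):
size(ps.readout.weight)   # still (3, 28) expected, now the ridge solution
st.reservoir.clock        # 500 — the model has consumed the training sequence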
6. Generative prediction
We now perform generative prediction: at each step the trained NGRC predicts the increment for the current normalized state, we add it to obtain the next state, and that state is fed back as the next input. Rolling this forward for predict_len steps yields the predicted time series:
single_step = copy(test_norm_x[:, 1]) # normalized initial condition
traj_norm = similar(test_norm_x, 3, predict_len)
for step in 1:predict_len
    global st
    delta_step, st = ngrc(single_step, ps, st)
    single_step .= single_step .+ delta_step # increment update in normalized space
    traj_norm[:, step] .= single_step
end
Finally, we unscale back to the original coordinates:
traj = traj_norm .* in_std .+ in_mean # size: (3, predict_len)
3×900 Matrix{Float32}:
 -3.30616  -3.80449  -4.33391  -4.92376  …   7.20849   4.8467    2.90777
 -5.28191  -5.83858  -6.55609  -7.43775     -2.96956  -3.78466  -3.99589
 20.5876   19.7407   19.0804   18.637       35.4879   32.7212   30.2604
7. Visualization
We can now compare the predicted trajectory with the true Lorenz data on the test segment:
plot(transpose(test_data)[:, 1], transpose(test_data)[:, 2], transpose(test_data)[:, 3]; label="actual");
plot!(transpose(traj)[:, 1], transpose(traj)[:, 2], transpose(traj)[:, 3]; label="predicted")
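Beyond the visual comparison, a simple quantitative view is the per-step error between the predicted and true trajectories; the snippet below is a sketch using only packages already loaded in this tutorial.
# Euclidean error per prediction step, measured in normalized (z-scored) units.
err = vec(sqrt.(sum(abs2, (traj .- test_data) ./ in_std; dims = 1)))
plot(err; xlabel = "prediction step", ylabel = "normalized error", label = "")
For a chaotic system like Lorenz, the error is expected to stay small for a number of steps and then grow as the predicted and true trajectories diverge.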