Deep Echo State Networks
Deep Echo State Network architectures have recently gained traction. In this guide, we illustrate how to use ReservoirComputing.jl to build a deep ESN.
The network implemented in this library follows (Gallicchio and Micheli, 2017). It stacks reservoirs on top of each other, feeding the output of one into the next, and obtains the final states by merging the inner states of all stacked reservoirs. For a more in-depth explanation, refer to the paper linked above.
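The layer-to-layer state flow can be sketched in plain Julia. This is a minimal illustration under assumed sizes and a plain `tanh` update, not the library's implementation:

```julia
# Three stacked reservoirs of size 50, driven by a 3-dimensional input.
res_size, in_size, depth = 50, 3, 3
Ws = [0.1 .* randn(res_size, res_size) for _ in 1:depth]
# the first layer sees the raw input; deeper layers see the previous layer's state
Wins = [0.1 .* randn(res_size, l == 1 ? in_size : res_size) for l in 1:depth]

# one update step: each layer is driven by the layer below it
function deep_step(states, u, Ws, Wins)
    new_states = similar(states)
    layer_input = u
    for l in eachindex(Ws)
        new_states[l] = tanh.(Ws[l] * states[l] .+ Wins[l] * layer_input)
        layer_input = new_states[l]   # feed this layer's state into the next one
    end
    return new_states
end

states = [zeros(res_size) for _ in 1:depth]
states = deep_step(states, [1.0, 0.0, 0.0], Ws, Wins)
merged = reduce(vcat, states)   # merged state seen by the readout: length 150
```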
Lorenz Example
For this example, we are going to reuse the Lorenz data used in the Lorenz System Forecasting example.
```julia
using OrdinaryDiffEq

# define the Lorenz system
function lorenz!(du, u, p, t)
    du[1] = 10.0 * (u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end

# solve and collect the data
prob = ODEProblem(lorenz!, [1.0, 0.0, 0.0], (0.0, 200.0))
data = solve(prob, ABM54(); dt=0.02)
data = reduce(hcat, data.u)

# determine shift length, training length and prediction length
shift = 300
train_len = 5000
predict_len = 1250

# split the data accordingly
input_data = data[:, shift:(shift + train_len - 1)]
target_data = data[:, (shift + 1):(shift + train_len)]
test_data = data[:, (shift + train_len + 1):(shift + train_len + predict_len)]
```
```
3×1250 Matrix{Float64}:
  4.2235   3.83533  3.55846  3.38262  …   9.44212  10.0034  10.5085
  1.98782  2.18137  2.43519  2.7396      12.3382   12.6956  12.8396
 25.6556  24.4858  23.3794  22.3394      24.3255   25.4374  26.6729
```
Note again that the data must be arranged as a matrix with features as rows and time steps as columns, as in this example. This holds even when the time series consists of single values.
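For instance, a scalar time series stored as a plain vector can be reshaped into the expected 1×N matrix (the data here is made up for illustration):

```julia
series = sin.(0.0:0.02:1.0)           # a scalar time series as a plain vector
input_matrix = reshape(series, 1, :)  # 1 feature (row) × 51 time steps (columns)
size(input_matrix)                    # (1, 51)
```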
Constructing a deep ESN is very similar to constructing a standard ESN; the only difference is that the reservoir can be passed as a vector of reservoirs.
```julia
using ReservoirComputing

reservoirs = [rand_sparse(; radius=1.1, sparsity=0.1),
    rand_sparse(; radius=1.2, sparsity=0.1),
    rand_sparse(; radius=1.4, sparsity=0.1)]

esn = DeepESN(input_data, 3, 200;
    reservoir=reservoirs,
    reservoir_driver=RNN(),
    nla_type=NLADefault(),
    states_type=StandardStates())
```
```
DeepESN{Int64, Matrix{Float64}, NLADefault, Vector{Matrix{Float64}}, RNN{typeof(NNlib.tanh_fast), Float64}, Vector{Matrix{Float64}}, Vector{Vector{Float32}}, StandardStates, Int64, Matrix{Float64}}(200, [-9.903289241214775 -9.691839850827593 … 6.13395766702855 5.3690197189757916; -9.018088588651105 -8.473011482849701 … 1.9972030630363637 1.862732065250123; 29.81517777140388 29.93605690750639 … 29.48232612752881 28.165385874059083], NLADefault(), [[0.013907174655313104 -0.05716939898613221 -0.006155524219557851; -0.08855201872557712 -0.006872478885491318 -0.024412371250722355; … ; -0.05567417608791839 0.020958240616840065 -0.05463514734412738; 0.08647650301013335 -0.004830822042974714 0.09832851499374105], [0.04213248726958181 -0.08713980785528797 … 0.09277984251585046 0.07165217731267358; 0.0594666227164502 -0.017200683557866192 … 0.001515976012656517 -0.016257560549243967; … ; -0.061682611611786814 -0.05278295296902507 … -0.0013732916261887996 -0.05357622749813545; -0.04561198555057036 -0.055659463829441695 … -0.037538777812366565 -0.06064975265589079]], RNN{typeof(NNlib.tanh_fast), Float64}(NNlib.tanh_fast, 1.0), [[0.0 0.0 … 0.0 0.0; 0.0 0.03679667105415618 … 0.0 -0.1504413499781463; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], [0.1378192372786613 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0]], Vector{Float32}[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]], StandardStates(), 0, [0.191895004462442 -0.8727291477054525 … -0.9275994264160081 -0.9284136807209246; 0.207993968483316 0.966678677968091 … 0.12853425853116815 0.2223888777182524; … ; 0.0 -0.659747033987273 … -0.9023082828245332 -0.9005549127379939; 0.0 -0.12214704901669225 … 0.01667419825698542 -0.04947580062109448])
```
The input layer and bias can also be given as vectors, which must then have the same length as the vector of reservoirs. If a single value is passed instead of a vector, it will be used for all layers of the deep ESN.
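To illustrate the shapes involved, one input matrix and one bias vector per layer can also be pre-built by hand. Only the first layer sees the raw input; deeper layers are driven by the previous reservoir's 200-dimensional state, so their input matrices must match that size. The scalings below are illustrative, and passing these via the `input_layer` and `bias` keyword arguments is assumed to mirror how `reservoir` accepts a vector:

```julia
res_size, in_size = 200, 3
# one input matrix per layer: the first maps the 3-dimensional input,
# the others map the 200-dimensional state of the previous reservoir
input_layers = [0.1 .* (2 .* rand(res_size, in_size) .- 1),
    0.1 .* (2 .* rand(res_size, res_size) .- 1),
    0.1 .* (2 .* rand(res_size, res_size) .- 1)]
biases = [zeros(res_size) for _ in 1:3]   # one bias vector per layer
size.(input_layers)                       # [(200, 3), (200, 200), (200, 200)]
```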
In addition to using the provided functions for the construction of the layers, the user can also build their own matrix, or vector of matrices, and feed it into the ESN in the same way.
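A hand-built vector of reservoir matrices could be constructed like this. This is only a sketch of the common recipe (a sparse random matrix rescaled to a target spectral radius); the helper name `my_reservoir` is made up and the rescaling mirrors, but is not, the library's `rand_sparse`:

```julia
using LinearAlgebra, SparseArrays

# build a sparse random reservoir matrix and rescale it
# so that its spectral radius equals `radius`
function my_reservoir(res_size; radius=1.1, sparsity=0.1)
    W = Matrix(sprandn(res_size, res_size, sparsity))
    rho = maximum(abs, eigvals(W))   # current spectral radius
    return W .* (radius / rho)
end

reservoirs = [my_reservoir(200; radius=r) for r in (1.1, 1.2, 1.4)]
```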
The training and prediction follow the usual framework:
```julia
training_method = StandardRidge(0.0)
output_layer = train(esn, target_data, training_method)
output = esn(Generative(predict_len), output_layer)
```
```
3×1250 Matrix{Float64}:
  4.22356  3.83532  3.5583   3.38233  …   -7.58927   -8.00124   -8.44168
  1.98757  2.18088  2.43452  2.73873      -9.55775  -10.1418   -10.6934
 25.6556  24.4857  23.3794  22.3395      -22.8437   -23.1538   -23.6205
```
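Under the hood, `StandardRidge` fits a linear readout on the merged states. The closed-form solution can be sketched in plain Julia on toy data (a small regularization λ is used here where the example above uses 0.0; this is an illustration, not the library's exact code path):

```julia
using LinearAlgebra

# ridge regression readout: W_out = Y * X' * (X * X' + λI)^(-1)
states = randn(150, 400)    # merged deep-ESN states, one column per time step
targets = randn(3, 400)     # training targets, aligned column by column
λ = 1e-6
W_out = targets * states' / (states * states' + λ * I)
prediction = W_out * states[:, 1]   # one-step readout: a 3-element vector
```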
Plotting the results:
```julia
using Plots

ts = 0.0:0.02:200.0
lorenz_maxlyap = 0.9056
predict_ts = ts[(shift + train_len + 1):(shift + train_len + predict_len)]
lyap_time = (predict_ts .- predict_ts[1]) * (1 / lorenz_maxlyap)

p1 = plot(lyap_time, [test_data[1, :] output[1, :]]; label=["actual" "predicted"],
    ylabel="x(t)", linewidth=2.5, xticks=false, yticks=-15:15:15);
p2 = plot(lyap_time, [test_data[2, :] output[2, :]]; label=["actual" "predicted"],
    ylabel="y(t)", linewidth=2.5, xticks=false, yticks=-20:20:20);
p3 = plot(lyap_time, [test_data[3, :] output[3, :]]; label=["actual" "predicted"],
    ylabel="z(t)", linewidth=2.5, xlabel="max(λ)*t", yticks=10:15:40);
plot(p1, p2, p3; plot_title="Lorenz System Coordinates",
    layout=(3, 1), xtickfontsize=12, ytickfontsize=12, xguidefontsize=15,
    yguidefontsize=15, legendfontsize=12, titlefontsize=20)
```
References
- Gallicchio, C. and Micheli, A. (2017). Deep Echo State Network (DeepESN): A brief survey. arXiv preprint arXiv:1712.04323.