Deep Echo State Network architectures have recently started to gain traction. In this guide we illustrate how to use ReservoirComputing.jl to build a deep ESN.
The network implemented in this library is taken from [1]. It works by stacking reservoirs on top of each other, feeding the output of one into the next. The states are obtained by merging all the inner states of the stacked reservoirs. For a more in-depth explanation, refer to the paper linked above. The full script for this example can be found here. This example was run on Julia v1.7.2.
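To make the stacking concrete, here is a minimal sketch of a deep reservoir state update in plain Julia. This is not the library's implementation: the matrix scalings, sizes, and the `deep_step!` helper are all illustrative assumptions. Layer 1 is driven by the external input, each following layer is driven by the state of the layer below, and the merged state is the concatenation of all layer states.

```julia
using LinearAlgebra, Random

Random.seed!(42)
sizes = [30, 40, 50]          # one reservoir size per layer (illustrative)
in_dim = 3                    # external input dimension
W = [0.1 * randn(n, n) for n in sizes]            # internal reservoir matrices
W_in = [0.1 * randn(sizes[1], in_dim),            # layer 1 sees the input
        0.1 * randn(sizes[2], sizes[1]),          # layer 2 sees layer 1's state
        0.1 * randn(sizes[3], sizes[2])]          # layer 3 sees layer 2's state
x = [zeros(n) for n in sizes] # one state vector per layer

function deep_step!(x, u, W, W_in)
    inp = u
    for l in eachindex(x)
        x[l] = tanh.(W[l] * x[l] + W_in[l] * inp)
        inp = x[l]            # the next layer is fed this layer's state
    end
    return vcat(x...)         # merged state, used by the readout
end

state = deep_step!(x, rand(in_dim), W, W_in)
length(state)  # 120 = 30 + 40 + 50
```

The readout is then trained on this concatenated state, exactly as for a single-reservoir ESN.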
For this example we are going to reuse the Lorenz data used in the Lorenz System Forecasting example.
```julia
using OrdinaryDiffEq

#define lorenz system
function lorenz!(du, u, p, t)
    du[1] = 10.0 * (u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end

#solve and take data
prob = ODEProblem(lorenz!, [1.0, 0.0, 0.0], (0.0, 200.0))
data = solve(prob, ABM54(), dt=0.02)

#determine shift length, training length and prediction length
shift = 300
train_len = 5000
predict_len = 1250

#split the data accordingly
input_data = data[:, shift:shift+train_len-1]
target_data = data[:, shift+1:shift+train_len]
test_data = data[:, shift+train_len+1:shift+train_len+predict_len]
```
Again, note that the data needs to be formatted as a matrix with the features as rows and the time steps as columns, as is done in this example. This is required even if the time series consists of single values.
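For instance, a scalar time series stored as a plain vector can be brought into the expected 1×T layout with a `reshape` (the `series` below is a made-up example, not part of the Lorenz data):

```julia
# Hypothetical single-feature series: a plain vector of 1000 time steps
series = sin.(range(0, 20π, length=1000))

# Reshape into the expected 1×T matrix (features as rows, time as columns)
input_matrix = reshape(series, 1, :)

size(input_matrix)  # (1, 1000)
```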
The construction of the ESN is also very similar. The only difference is that the reservoir can be passed as an array of reservoirs.
```julia
reservoirs = [RandSparseReservoir(99, radius=1.1, sparsity=0.1),
    RandSparseReservoir(100, radius=1.2, sparsity=0.1),
    RandSparseReservoir(200, radius=1.4, sparsity=0.1)]

esn = ESN(input_data;
    variation = Default(),
    reservoir = reservoirs,
    input_layer = DenseLayer(),
    reservoir_driver = RNN(),
    nla_type = NLADefault(),
    states_type = StandardStates())
```
As can be seen, different sizes can be chosen for the different reservoirs. The input layers and biases can also be given as vectors, which must then be of the same length as the reservoirs vector. If a single value is passed instead of a vector, it is used for all the layers in the deep ESN.
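As a sketch of the vector form, one input layer can be supplied per reservoir. The specific constructors and keyword values below are illustrative assumptions; the point is only that the vector of input layers matches the vector of reservoirs in length.

```julia
# Sketch: one input layer per reservoir (assumed constructors shown
# for illustration; the vector must match the reservoirs vector in length)
input_layers = [DenseLayer(),
    DenseLayer(scaling=0.5),
    DenseLayer(scaling=0.1)]

esn = ESN(input_data;
    reservoir = reservoirs,
    input_layer = input_layers)
```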
In addition to using the provided functions for the construction of the layers, the user can also choose to build their own matrix, or array of matrices, and feed that into the ESN in the same way.
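A sketch of the hand-built variant is given below. The random matrices are placeholders, not a recommended initialization: each reservoir matrix must be square, with sizes chosen freely per layer as before.

```julia
# Sketch: hand-built reservoir matrices instead of the provided constructors.
# The uniform random entries here are illustrative only; in practice the
# matrices would be scaled (e.g. by spectral radius) as the constructors do.
custom_reservoirs = [rand(99, 99) .- 0.5,
    rand(100, 100) .- 0.5,
    rand(200, 200) .- 0.5]

esn = ESN(input_data;
    reservoir = custom_reservoirs)
```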
The training and prediction follows the usual framework:
```julia
training_method = StandardRidge(0.0)
output_layer = train(esn, target_data, training_method)
output = esn(Generative(predict_len), output_layer)
```
Note that there is currently a known bug when using WeightedLayer as the input layer with the deep ESN; we are in the process of investigating and solving it. In addition, with the current implementation the leak coefficient must be the same for all reservoirs. This is also something we are actively looking into lifting.
- [1] Gallicchio, Claudio, and Alessio Micheli. "Deep echo state network (DeepESN): A brief survey." arXiv preprint arXiv:1712.04323 (2017).