Problems

DataDrivenDiffEq.DataDrivenProblem - Type
struct DataDrivenProblem{dType, cType, probType} <: DataDrivenDiffEq.AbstractDataDrivenProblem{dType, cType, probType}

The DataDrivenProblem defines a general estimation problem given measurements, inputs and (in the near future) observations. Three construction methods are available:

  • DirectDataDrivenProblem for direct mappings
  • DiscreteDataDrivenProblem for time discrete systems
  • ContinuousDataDrivenProblem for systems continuous in time

where all are aliases for constructing a problem.

Fields

  • X: State measurements

  • t: Time measurements (optional)

  • DX: Differential state measurements (optional); used for time continuous problems

  • Y: Output measurements (optional); used for direct problems

  • U: Input measurements (optional); used for non-autonomous problems

  • p: Parameters associated with the problem (optional)

  • name: Name of the problem

Example

X, DX, t = data...

# Define a discrete time problem
prob = DiscreteDataDrivenProblem(X)

# Define a continuous time problem without explicit time points
prob = ContinuousDataDrivenProblem(X, DX)

# Define a continuous time problem without explicit derivatives
prob = ContinuousDataDrivenProblem(X, t)

# Define a discrete time problem with the input given as a function
input_signal(u, p, t) = t^2
prob = DiscreteDataDrivenProblem(X, t, input_signal)

Defining a Problem

Problems of identification, estimation, or inference are defined by data. These data contain at least measurements of the states X, which are sufficient to describe a DiscreteDataDrivenProblem with unit time steps, similar to the first example on dynamic mode decomposition. This can be extended to include time points t and control signals, given either as measurements U or as a function u(x,p,t). Additionally, any parameters p known a priori can be included in the problem. In practice, this looks like:

problem = DiscreteDataDrivenProblem(X)
problem = DiscreteDataDrivenProblem(X, t)
problem = DiscreteDataDrivenProblem(X, t, U)
problem = DiscreteDataDrivenProblem(X, t, U, p = p)
problem = DiscreteDataDrivenProblem(X, t, (x, p, t) -> u(x, p, t))
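
As a rough illustration of why state measurements alone can define a discrete problem, the following plain-Julia sketch (it does not call DataDrivenDiffEq) treats consecutive columns of X as input/output pairs with unit time steps and fits a linear map by least squares, in the spirit of dynamic mode decomposition. The matrix A_true, the initial state, and the helper name simulate_linear are assumptions made for this example.

```julia
using LinearAlgebra

# Hypothetical helper: iterate x[k+1] = A * x[k] and store the states as columns.
function simulate_linear(A::AbstractMatrix, x0::AbstractVector, n::Integer)
    X = zeros(length(x0), n)
    X[:, 1] = x0
    for k in 1:(n - 1)
        X[:, k + 1] = A * X[:, k]
    end
    return X
end

A_true = [0.9 -0.2; 0.1 0.8]   # assumed system matrix
X = simulate_linear(A_true, [1.0, 0.5], 20)

# With unit time steps, column k is mapped to column k + 1.
X_now = X[:, 1:(end - 1)]
X_next = X[:, 2:end]

# Least-squares solution of A_est * X_now ≈ X_next.
A_est = X_next / X_now
residual = norm(A_est * X_now - X_next)
```

On noise-free data such as this, A_est should agree with A_true up to floating-point error; with real measurements it would only be an approximation.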

Similarly, a ContinuousDataDrivenProblem would need at least measurements and time-derivatives (X and DX) or measurements, time information, and a way to derive the time derivatives (X, t, and a Collocation method). Again, this can be extended by including a control input as measurements or a function and possible parameters:

# Using available data
problem = ContinuousDataDrivenProblem(X, DX)
problem = ContinuousDataDrivenProblem(X, t, DX)
problem = ContinuousDataDrivenProblem(X, t, DX, U, p = p)
problem = ContinuousDataDrivenProblem(X, t, DX, (x, p, t) -> u(x, p, t))

# Using collocation
problem = ContinuousDataDrivenProblem(X, t, InterpolationMethod())
problem = ContinuousDataDrivenProblem(X, t, GaussianKernel())
problem = ContinuousDataDrivenProblem(X, t, U, InterpolationMethod())
problem = ContinuousDataDrivenProblem(X, t, U, GaussianKernel(), p = p)
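
When only X and t are available, the derivatives have to be estimated from the data. The package does this with the collocation or interpolation methods listed above; the sketch below deliberately swaps in plain finite differences, only to show the shapes involved. The function name finite_difference and the test signal are assumptions for this example, not part of DataDrivenDiffEq.

```julia
# Hypothetical helper: estimate dX/dt column by column.
# Central differences are used in the interior, one-sided differences at the ends.
function finite_difference(X::AbstractMatrix, t::AbstractVector)
    n = length(t)
    size(X, 2) == n || throw(DimensionMismatch("X needs one column per time point"))
    n >= 2 || throw(ArgumentError("at least two time points are required"))
    DX = similar(X, float(eltype(X)))
    DX[:, 1] = (X[:, 2] - X[:, 1]) / (t[2] - t[1])
    DX[:, n] = (X[:, n] - X[:, n - 1]) / (t[n] - t[n - 1])
    for k in 2:(n - 1)
        DX[:, k] = (X[:, k + 1] - X[:, k - 1]) / (t[k + 1] - t[k - 1])
    end
    return DX
end

# Assumed test signal: x1 = sin(t), x2 = cos(t), stored as (n_states, n_timepoints).
t = collect(range(0.0, 2π; length = 201))
X = [sin.(t)'; cos.(t)']
DX = finite_difference(X, t)
```

The resulting DX has the same shape as X, so it could be passed together with X and t in the same way as measured derivatives, for example ContinuousDataDrivenProblem(X, t, DX).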

You can also directly use a DESolution as an input to your DataDrivenProblem:

problem = DataDrivenProblem(sol; kwargs...)

This evaluates the function of the original problem at the specific time points t, using its parameters p, instead of using the interpolation. To use the interpolated data instead, pass the keyword use_interpolation = true.
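
A minimal sketch of this workflow, assuming OrdinaryDiffEq.jl is installed alongside DataDrivenDiffEq. The linear system, its matrix A, the initial condition, and the save interval are illustrative choices, not prescribed by the package.

```julia
using DataDrivenDiffEq
using OrdinaryDiffEq

# Illustrative linear system dx/dt = A * x (assumed for this example).
A = [-0.9 0.2; 0.0 -0.2]
f(u, p, t) = A * u

u0 = [10.0, -10.0]
tspan = (0.0, 10.0)
sol = solve(ODEProblem(f, u0, tspan), Tsit5(), saveat = 0.05)

# Build the data-driven problem from the saved time points of the solution.
ddprob = DataDrivenProblem(sol)

# Variant described in the text, using the interpolated data instead:
# ddprob = DataDrivenProblem(sol; use_interpolation = true)
```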

An additional type of problem is the DirectDataDrivenProblem, which does not assume any kind of causal relationship. It is defined by X and an observed output Y in addition to the usual arguments:

problem = DirectDataDrivenProblem(X, Y)
problem = DirectDataDrivenProblem(X, t, Y)
problem = DirectDataDrivenProblem(X, t, Y, U)
problem = DirectDataDrivenProblem(X, t, Y, p = p)
problem = DirectDataDrivenProblem(X, t, Y, (x, p, t) -> u(x, p, t), p = p)

Working with Real Data

When working with experimental data from files (e.g., CSV), the data must be formatted correctly before creating a problem. The key points are:

  • States X: A matrix of shape (n_states, n_timepoints) where each column is a measurement at a time point
  • Times t: A vector of length n_timepoints
  • Controls U: Either a matrix of shape (n_controls, n_timepoints) or a function (x, p, t) -> u_vector
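
To make the two control formats concrete, the plain-Julia sketch below evaluates a control function at every measurement and stores the results column-wise, which yields a matrix with the (n_controls, n_timepoints) shape described above. The data, the parameter vector, and the control law named control are assumptions made up for this example.

```julia
# Assumed example data: 2 states, 5 time points, 1 parameter.
t = collect(0.0:0.5:2.0)
X = [cos.(t)'; sin.(t)']
p = [0.5]

# Hypothetical control law returning a vector with n_controls = 2 entries.
control(x, p, t) = [p[1] * x[1], sin(t)]

# Evaluate the function at every measurement and store the results as columns.
U = reduce(hcat, [control(X[:, k], p, t[k]) for k in eachindex(t)])

size(X)  # (2, 5)
size(U)  # (2, 5)
```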

Loading Data from CSV

using CSV, DataFrames

# Load your experimental data
df = CSV.read("experiment.csv", DataFrame)

# Extract time points
t = Vector(df.time)

# Extract state measurements (transpose so columns are time points)
X = permutedims(Matrix(df[:, [:x1, :x2]]))

# Extract control measurements
U = permutedims(Matrix(df[:, [:u1]]))

# Create the problem
prob = ContinuousDataDrivenProblem(X, t, U = U)

Time-Varying Controls from Data

Control inputs can be specified in two ways:

  1. As measured data (matrix): Use this when you have control values recorded at each time point

    U = [u1_at_t1 u1_at_t2 ... u1_at_tn;
         u2_at_t1 u2_at_t2 ... u2_at_tn]  # Shape: (n_controls, n_timepoints)
    prob = ContinuousDataDrivenProblem(X, t, U = U)
  2. As a function: Use this when controls can be computed analytically or when you want to interpolate measured data

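    # Using DataInterpolations.jl to create a continuous function from discrete data
    using DataInterpolations
    # vec(U) flattens the 1 x n_timepoints control matrix (assumes a single control channel)
    u_interp = LinearInterpolation(vec(U), t)
    # Return a vector so the result has the same form as one column of U
    control_func(x, p, t) = [u_interp(t)]
    prob = ContinuousDataDrivenProblem(X, t, U = control_func)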

For a complete example, see Using Real Data with Time-Varying Controls.

Concrete Types

DataDrivenDiffEq.ContinuousDataDrivenProblem - Function

A time continuous DataDrivenProblem useable for problems of the form f(x,p,t,u) ↦ dx/dt.

ContinuousDataDrivenProblem(X, DX; kwargs...)

When no derivative measurements DX are supplied, the derivatives are constructed automatically via an additional collocation method, which can be either a collocation (such as GaussianKernel()) or an interpolation from DataInterpolations.jl wrapped by an InterpolationMethod.


Datasets

DataDrivenDiffEq.DataDrivenDataset - Type
struct DataDrivenDataset{N, U, C} <: DataDrivenDiffEq.AbstractDataDrivenProblem{N, U, C}

A collection of DataDrivenProblems used to concatenate different trajectories or experiments.

Can be called with either an NTuple of problems or a NamedTuple of NamedTuples. Similar to the DataDrivenProblem, it has three constructors available:

  • DirectDataset for direct problems
  • DiscreteDataset for discrete problems
  • ContinuousDataset for continuous problems

Fields

  • name: Name of the dataset

  • probs: The problems

  • sizes: The length of each problem - for internal use

A DataDrivenDataset collects several DataDrivenProblems of the same type but treats them as a union for system identification.
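
Conceptually, treating several trajectories as a union means joining their measurements while remembering where each one starts and ends. The sketch below is plain Julia and only illustrates that idea with made-up data; it is an assumption about the concept, not a description of how DataDrivenDataset is implemented.

```julia
# Two assumed trajectories with the same number of states but different lengths.
X1 = [1.0 2.0 3.0; 4.0 5.0 6.0]   # 2 states, 3 measurements
X2 = [7.0 8.0; 9.0 10.0]          # 2 states, 2 measurements
trajectories = [X1, X2]

# Record the length of each trajectory, then join them column-wise.
sizes = [size(X, 2) for X in trajectories]
X_all = reduce(hcat, trajectories)

# Column ranges that recover each trajectory from the joined data.
stops = cumsum(sizes)
ranges = [(stop - len + 1):stop for (len, stop) in zip(sizes, stops)]

X_all[:, ranges[2]] == X2  # true
```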

Concrete Types

DataDrivenDiffEq.ContinuousDataset - Function

A time continuous DataDrivenDataset useable for problems of the form f(x,p,t,u) ↦ dx/dt.

ContinuousDataset(s; name, collocation, kwargs...)

Automatically constructs derivatives via a collocation method supplied through the collocation keyword argument. This can be either a collocation or an interpolation from DataInterpolations.jl wrapped by an InterpolationMethod.


API

These methods are defined for DataDrivenProblems and are mainly of interest to developers.

DataDrivenDiffEq.is_valid - Function
is_valid(x)

Checks whether a DataDrivenProblem is valid: the data must not contain NaN or Inf, and the number of measurements must be consistent.

Example

is_valid(problem)
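
The checks named above can be sketched in plain Julia as follows. This is a hypothetical re-implementation for illustration, operating on raw arrays; the names all_finite and sketch_is_valid are made up here, and the package's own is_valid may differ in what it checks and how it reports failures.

```julia
# Hypothetical re-implementation for illustration; not the package's is_valid.
all_finite(A) = all(isfinite, A)

function sketch_is_valid(X::AbstractMatrix, t::AbstractVector; DX = nothing, U = nothing)
    n = size(X, 2)                    # number of measurements
    (all_finite(X) && all_finite(t)) || return false
    length(t) == n || return false
    for A in (DX, U)
        A === nothing && continue     # optional data not provided
        (size(A, 2) == n && all_finite(A)) || return false
    end
    return true
end

X = [1.0 2.0 3.0; 4.0 5.0 6.0]
t = [0.0, 1.0, 2.0]

sketch_is_valid(X, t)                           # true
sketch_is_valid([1.0 NaN 3.0; 4.0 5.0 6.0], t)  # false: contains NaN
sketch_is_valid(X, [0.0, 1.0])                  # false: 3 measurements, 2 time points
sketch_is_valid(X, t; U = [0.1 0.2])            # false: U has 2 columns, X has 3
```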