EnsembleGPUKernel
API
DiffEqGPU.EnsembleGPUKernel
— TypeEnsembleGPUKernel(backend,cpu_offload = 0.2)
A massively-parallel ensemble algorithm which generates a unique GPU kernel for the entire ODE which is then executed. This leads to a very low overhead GPU code generation, but imparts some extra limitations on the use.
Positional Arguments
backend
: the KernelAbstractions backend for performing the computation.cpu_offload
: the percentage of trajectories to offload to the CPU. Default is 0.0 or 0% of trajectories.
Limitations
Not all standard Julia
f
functions are allowed. Only Juliaf
functions which are capable of being compiled into a GPU kernel are allowed. This notably means that certain features of Julia can cause issues inside a kernel, like:- Allocating memory (building arrays)
- Linear algebra (anything that calls BLAS)
- Broadcast
Only out-of-place
f
definitions are allowed. Coupled with the requirement of not allowing for memory allocations, this means that the ODE must be defined withStaticArray
initial conditions.Only specific ODE solvers are allowed. This includes:
- GPUTsit5
- GPUVern7
- GPUVern9
To use multiple GPUs over clusters, one must manually set up one process per GPU. See the multi-GPU tutorial for more details.
Example
using DiffEqGPU, CUDA, OrdinaryDiffEq, StaticArrays
function lorenz(u, p, t)
σ = p[1]
ρ = p[2]
β = p[3]
du1 = σ * (u[2] - u[1])
du2 = u[1] * (ρ - u[3]) - u[2]
du3 = u[1] * u[2] - β * u[3]
return SVector{3}(du1, du2, du3)
end
u0 = @SVector [1.0f0; 0.0f0; 0.0f0]
tspan = (0.0f0, 10.0f0)
p = @SVector [10.0f0, 28.0f0, 8 / 3.0f0]
prob = ODEProblem{false}(lorenz, u0, tspan, p)
prob_func = (prob, i, repeat) -> remake(prob, p = (@SVector rand(Float32, 3)) .* p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy = false)
@time sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(), trajectories = 10_000,
adaptive = false, dt = 0.1f0)
Specialized Solvers
DiffEqGPU.GPUTsit5
— TypeGPUTsit5()
A specialized implementation of the 5th order Tsit5
method specifically for kernel generation with EnsembleGPUKernel. For a similar CPU implementation, see SimpleATsit5 from SimpleDiffEq.jl.
DiffEqGPU.GPUVern7
— TypeGPUVern7()
A specialized implementation of the 7th order Vern7
method specifically for kernel generation with EnsembleGPUKernel.
DiffEqGPU.GPUVern9
— TypeGPUVern9()
A specialized implementation of the 9th order Vern9
method specifically for kernel generation with EnsembleGPUKernel.
DiffEqGPU.GPUEM
— TypeGPUEM()
A specialized implementation of the Euler-Maruyama GPUEM
method with weak order 1.0. Made specifically for kernel generation with EnsembleGPUKernel.
DiffEqGPU.GPUSIEA
— TypeGPUSIEA()
A specialized implementation of the weak order 2.0 for Ito SDEs GPUSIEA
method specifically for kernel generation with EnsembleGPUKernel.
DiffEqGPU.GPURosenbrock23
— TypeGPURosenbrock23()
A specialized implementation of the W-method Rosenbrock23
method specifically for kernel generation with EnsembleGPUKernel.
DiffEqGPU.GPURodas4
— TypeGPURodas4()
A specialized implementation of the Rodas4
method specifically for kernel generation with EnsembleGPUKernel.
DiffEqGPU.GPURodas5P
— TypeGPURodas5P()
A specialized implementation of the Rodas5P
method specifically for kernel generation with EnsembleGPUKernel.
DiffEqGPU.GPUKvaerno3
— TypeGPUKvaerno3()
A specialized implementation of the Kvaerno3
method specifically for kernel generation with EnsembleGPUKernel.
DiffEqGPU.GPUKvaerno5
— TypeGPUKvaerno5()
A specialized implementation of the Kvaerno5
method specifically for kernel generation with EnsembleGPUKernel.
Lower Level API
DiffEqGPU.vectorized_solve
— Functionvectorized_solve(probs, prob::Union{ODEProblem, SDEProblem}alg;
dt, saveat = nothing,
save_everystep = true,
debug = false, callback = CallbackSet(nothing), tstops = nothing)
A lower level interface to the kernel generation solvers of EnsembleGPUKernel with fixed time-stepping.
Arguments
probs
: the GPU-setup problems generated by the ensemble.prob
: the quintessential problem form. Can be justprobs[1]
alg
: the kernel-based differential equation solver. Must be one of the EnsembleGPUKernel specialized methods.
Keyword Arguments
Only a subset of the common solver arguments are supported.
DiffEqGPU.vectorized_asolve
— Functionvectorized_asolve(probs, prob::ODEProblem, alg;
dt = 0.1f0, saveat = nothing,
save_everystep = false,
abstol = 1.0f-6, reltol = 1.0f-3,
callback = CallbackSet(nothing), tstops = nothing)
A lower level interface to the kernel generation solvers of EnsembleGPUKernel with adaptive time-stepping.
Arguments
probs
: the GPU-setup problems generated by the ensemble.prob
: the quintessential problem form. Can be justprobs[1]
alg
: the kernel-based differential equation solver. Must be one of the EnsembleGPUKernel specialized methods.
Keyword Arguments
Only a subset of the common solver arguments are supported.
DiffEqGPU.vectorized_map_solve
— FunctionLower level API for EnsembleArrayAlgorithm
. Avoids conversion of solution to CPU arrays.
vectorized_map_solve(probs, alg,
ensemblealg::Union{EnsembleArrayAlgorithm}, I,
adaptive)
Arguments
probs
: the GPU-setup problems generated by the ensemble.alg
: the kernel-based differential equation solver. Most of the solvers from OrdinaryDiffEq.jl are supported.ensemblealg
: TheEnsembleGPUArray()
algorithm.I
: The iterator argument. Can be set to for e.g. 1:10_000 to simulate 10,000 trajectories.adaptive
: The Boolean argument for time-stepping. Usetrue
to enable adaptive time-stepping.
Keyword Arguments
Only a subset of the common solver arguments are supported.