PETSc SNES Example 2
This implements src/snes/examples/tutorials/ex2.c
from PETSc and examples/SNES_ex2.jl
from PETSc.jl using automatic sparsity detection and automatic differentiation using NonlinearSolve.jl
.
This solves the equations sequentially. Newton method to solve u'' + u^{2} = f
, sequentially.
import NonlinearSolve as NLS
import PETSc
import LinearAlgebra
import SparseConnectivityTracer
import BenchmarkTools
u0 = fill(0.5, 128)
function form_residual!(resid, x, _)
n = length(x)
xp = LinRange(0.0, 1.0, n)
F = 6xp .+ (xp .+ 1e-12) .^ 6
dx = 1 / (n - 1)
resid[1] = x[1]
for i in 2:(n - 1)
resid[i] = (x[i - 1] - 2x[i] + x[i + 1]) / dx^2 + x[i] * x[i] - F[i]
end
resid[n] = x[n] - 1
return
end
form_residual! (generic function with 1 method)
To use automatic sparsity detection, we need to specify sparsity
keyword argument to NonlinearFunction
. See Automatic Sparsity Detection for more details.
nlfunc_dense = NLS.NonlinearFunction(form_residual!)
nlfunc_sparse = NLS.NonlinearFunction(form_residual!; sparsity = SparseConnectivityTracer.TracerSparsityDetector())
nlprob_dense = NLS.NonlinearProblem(nlfunc_dense, u0)
nlprob_sparse = NLS.NonlinearProblem(nlfunc_sparse, u0)
NonlinearProblem with uType Vector{Float64}. In-place: true
u0: 128-element Vector{Float64}:
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
⋮
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
Now we can solve the problem using PETScSNES
or with one of the native NonlinearSolve.jl
solvers.
sol_dense_nr = NLS.solve(nlprob_dense, NLS.NewtonRaphson(); abstol = 1e-8)
sol_dense_snes = NLS.solve(nlprob_dense, NLS.PETScSNES(); abstol = 1e-8)
sol_dense_nr .- sol_dense_snes
128-element Vector{Float64}:
4.1386421557519005e-19
-2.1053427420479262e-17
-4.252105395216588e-17
-6.398825696737886e-17
-8.545545998259185e-17
-1.0692266299780484e-16
-1.2839664227659586e-16
-1.4986384529180885e-16
-1.713039432527097e-16
-1.927711462679227e-16
⋮
-6.661338147750939e-16
-6.661338147750939e-16
-5.551115123125783e-16
-4.440892098500626e-16
-3.3306690738754696e-16
-2.220446049250313e-16
-1.1102230246251565e-16
-1.1102230246251565e-16
0.0
sol_sparse_nr = NLS.solve(nlprob_sparse, NLS.NewtonRaphson(); abstol = 1e-8)
sol_sparse_snes = NLS.solve(nlprob_sparse, NLS.PETScSNES(); abstol = 1e-8)
sol_sparse_nr .- sol_sparse_snes
128-element Vector{Float64}:
-1.9281866869109483e-42
-6.886907256769496e-18
-1.3773602755302178e-17
-2.0660827649426894e-17
-2.755228770828788e-17
-3.4436971503570835e-17
-4.132165529885379e-17
-4.819278656698067e-17
-5.51316804708879e-17
-6.196215415754658e-17
⋮
-5.551115123125783e-16
-4.440892098500626e-16
-4.440892098500626e-16
-3.3306690738754696e-16
-2.220446049250313e-16
-2.220446049250313e-16
-1.1102230246251565e-16
0.0
0.0
As expected the solutions are the same (upto floating point error). Now let's compare the runtimes.
Runtimes
Dense Jacobian
BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.NewtonRaphson()); abstol = 1e-8)
BenchmarkTools.Trial: 4177 samples with 1 evaluation per sample.
Range (min … max): 1.135 ms … 2.819 ms ┊ GC (min … max): 0.00% … 53.61%
Time (median): 1.172 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.192 ms ± 116.481 μs ┊ GC (mean ± σ): 0.67% ± 3.94%
▁▄▇██▇▅▅▃▂
▂▃▄▅▇███████████▇▆▆▅▅▄▄▄▃▃▃▃▃▃▃▃▄▄▃▃▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂ ▄
1.13 ms Histogram: frequency by time 1.32 ms <
Memory estimate: 739.89 KiB, allocs estimate: 209.
BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.PETScSNES()); abstol = 1e-8)
BenchmarkTools.Trial: 1049 samples with 1 evaluation per sample.
Range (min … max): 4.563 ms … 30.845 ms ┊ GC (min … max): 0.00% … 7.16%
Time (median): 4.698 ms ┊ GC (median): 0.00%
Time (mean ± σ): 4.763 ms ± 842.843 μs ┊ GC (mean ± σ): 0.04% ± 0.22%
▂▄▂▃▂█▇▃▃▁ ▁ ▂ ▂▃
▃███████████▇▅▅███▇▇▇██▇▇▇▆▅▄▄▂▃▂▃▂▂▂▃▂▁▁▂▂▂▂▂▂▁▁▁▁▁▁▁▂▂▁▁▂ ▄
4.56 ms Histogram: frequency by time 5.25 ms <
Memory estimate: 219.96 KiB, allocs estimate: 280.
Sparse Jacobian
BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.NewtonRaphson()); abstol = 1e-8)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 417.950 μs … 8.344 ms ┊ GC (min … max): 0.00% … 45.48%
Time (median): 439.861 μs ┊ GC (median): 0.00%
Time (mean ± σ): 472.873 μs ± 472.348 μs ┊ GC (mean ± σ): 2.86% ± 2.70%
▂▇█▇▆▂ ▁▃▅▄▃▁
▁▁▁▂▅███████▆███████▇▆▄▄▃▃▂▂▂▂▂▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
418 μs Histogram: frequency by time 516 μs <
Memory estimate: 540.27 KiB, allocs estimate: 1963.
BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.PETScSNES()); abstol = 1e-8)
BenchmarkTools.Trial: 9730 samples with 1 evaluation per sample.
Range (min … max): 400.112 μs … 60.441 ms ┊ GC (min … max): 0.00% … 13.98%
Time (median): 431.685 μs ┊ GC (median): 0.00%
Time (mean ± σ): 512.442 μs ± 2.070 ms ┊ GC (mean ± σ): 2.21% ± 0.54%
▂▄▆▇██▆▅▅▅▄▂
▂▃▄▆████████████▇▆▄▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▁▂▂▂▂▂▂▁▂▂▁▂▂▂ ▄
400 μs Histogram: frequency by time 585 μs <
Memory estimate: 179.79 KiB, allocs estimate: 1496.