PETSc SNES Example 2
This implements src/snes/examples/tutorials/ex2.c from PETSc and examples/SNES_ex2.jl from PETSc.jl using automatic sparsity detection and automatic differentiation using NonlinearSolve.jl.
This solves the equations sequentially. Newton method to solve u'' + u^{2} = f, sequentially.
import NonlinearSolve as NLS
import PETSc
import LinearAlgebra
import SparseConnectivityTracer
import BenchmarkTools: @benchmark
u0 = fill(0.5, 128)
function form_residual!(resid, x, _)
n = length(x)
xp = LinRange(0.0, 1.0, n)
F = 6xp .+ (xp .+ 1e-12) .^ 6
dx = 1 / (n - 1)
resid[1] = x[1]
for i in 2:(n - 1)
resid[i] = (x[i - 1] - 2x[i] + x[i + 1]) / dx^2 + x[i] * x[i] - F[i]
end
resid[n] = x[n] - 1
return
endform_residual! (generic function with 1 method)To use automatic sparsity detection, we need to specify sparsity keyword argument to NonlinearFunction. See Automatic Sparsity Detection for more details.
nlfunc_dense = NLS.NonlinearFunction(form_residual!)
nlfunc_sparse = NLS.NonlinearFunction(
form_residual!; sparsity = SparseConnectivityTracer.TracerSparsityDetector())
nlprob_dense = NLS.NonlinearProblem(nlfunc_dense, u0)
nlprob_sparse = NLS.NonlinearProblem(nlfunc_sparse, u0)NonlinearProblem with uType Vector{Float64}. In-place: true
u0: 128-element Vector{Float64}:
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
⋮
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5Now we can solve the problem using PETScSNES or with one of the native NonlinearSolve.jl solvers.
sol_dense_nr = NLS.solve(nlprob_dense, NLS.NewtonRaphson(); abstol = 1e-8)
sol_dense_snes = NLS.solve(nlprob_dense, NLS.PETScSNES(); abstol = 1e-8)
sol_dense_nr .- sol_dense_snes128-element Vector{Float64}:
-1.1693707601222758e-18
-2.060615846059191e-12
-4.121188469085875e-12
-6.1815928843923654e-12
-8.241618880953407e-12
-1.0300972287614549e-11
-1.235927521079077e-11
-1.4416066349562731e-11
-1.647080155641259e-11
-1.8522854419982665e-11
⋮
-1.530120474768637e-11
-1.3394396702892664e-11
-1.148503514514232e-11
-9.57367518594765e-12
-7.660760914518505e-12
-5.74651437545981e-12
-3.831490680283878e-12
-1.91580085129317e-12
0.0sol_sparse_nr = NLS.solve(nlprob_sparse, NLS.NewtonRaphson(); abstol = 1e-8)
sol_sparse_snes = NLS.solve(nlprob_sparse, NLS.PETScSNES(); abstol = 1e-8)
sol_sparse_nr .- sol_sparse_snes128-element Vector{Float64}:
-2.0178697886277366e-42
9.553578732962847e-18
1.9107369224162507e-17
2.8660206803296506e-17
3.8211350316535997e-17
4.777265822514254e-17
5.732718987017105e-17
6.686816898804349e-17
7.643625316022806e-17
8.59772322781005e-17
⋮
-2.220446049250313e-16
-1.1102230246251565e-16
-1.1102230246251565e-16
-1.1102230246251565e-16
0.0
-1.1102230246251565e-16
0.0
0.0
0.0As expected the solutions are the same (upto floating point error). Now let's compare the runtimes.
Runtimes
Dense Jacobian
@benchmark NLS.solve($(nlprob_dense), $(NLS.NewtonRaphson()); abstol = 1e-8)BenchmarkTools.Trial: 259 samples with 1 evaluation per sample.
Range (min … max): 18.760 ms … 52.298 ms ┊ GC (min … max): 0.00% … 62.18%
Time (median): 19.096 ms ┊ GC (median): 0.00%
Time (mean ± σ): 19.335 ms ± 2.448 ms ┊ GC (mean ± σ): 1.07% ± 5.04%
▂ ▄▃ █▇▆▁▃▃▄▂ ▄
▃▁▄▁▃▃▃█▅▆▆▆▆███████████▅▇█▆██▇▇▇▇▇▆▄▁▃▁▅▃▄▅▅▃▅▃▁▁▃▁▁▁▃▁▃▃▃ ▄
18.8 ms Histogram: frequency by time 19.7 ms <
Memory estimate: 1.32 MiB, allocs estimate: 1122.@benchmark NLS.solve($(nlprob_dense), $(NLS.PETScSNES()); abstol = 1e-8)BenchmarkTools.Trial: 542 samples with 1 evaluation per sample.
Range (min … max): 8.795 ms … 11.431 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 9.224 ms ┊ GC (median): 0.00%
Time (mean ± σ): 9.234 ms ± 167.749 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▃▃▄▄▄▆█▅▂▅▇▅█▃▃▇▃▂
▃▂▁▁▁▁▁▁▁▁▂▁▃▃▅▅▇███████████████████▇██▆▆▅▅▄▄▃▁▁▂▁▂▁▃▂▃▁▁▁▂ ▄
8.8 ms Histogram: frequency by time 9.7 ms <
Memory estimate: 229.66 KiB, allocs estimate: 452.Sparse Jacobian
@benchmark NLS.solve($(nlprob_sparse), $(NLS.NewtonRaphson()); abstol = 1e-8)BenchmarkTools.Trial: 4335 samples with 1 evaluation per sample.
Range (min … max): 1.013 ms … 75.266 ms ┊ GC (min … max): 0.00% … 74.79%
Time (median): 1.087 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.153 ms ± 1.727 ms ┊ GC (mean ± σ): 3.99% ± 2.81%
▂▄██▇▅▆▄▄▄▂▄▄▃▂
▂▂▂▃▃▃▄▃▅▇█████████████████▆▆▆▅▄▄▃▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂ ▄
1.01 ms Histogram: frequency by time 1.25 ms <
Memory estimate: 253.53 KiB, allocs estimate: 2685.@benchmark NLS.solve($(nlprob_sparse), $(NLS.PETScSNES()); abstol = 1e-8)BenchmarkTools.Trial: 5336 samples with 1 evaluation per sample.
Range (min … max): 722.749 μs … 542.361 ms ┊ GC (min … max): 0.00% … 28.89%
Time (median): 841.937 μs ┊ GC (median): 0.00%
Time (mean ± σ): 936.381 μs ± 7.414 ms ┊ GC (mean ± σ): 3.14% ± 0.40%
▁ ▃ ▂▃▆█▅▅▂▄▂▁
▁▂▄▆▆█▇███▇▇▇██▇▆▅▆▅▇▇████████████▆▆▅▅▄▃▂▂▂▁▁▁▁▁▁▁▁▁▁▂▁▁▁▂▂▁▁ ▄
723 μs Histogram: frequency by time 1.02 ms <
Memory estimate: 185.41 KiB, allocs estimate: 2392.