PETSc SNES Example 2
This implements src/snes/examples/tutorials/ex2.c from PETSc and examples/SNES_ex2.jl from PETSc.jl using automatic sparsity detection and automatic differentiation using NonlinearSolve.jl.
This solves the equations sequentially. Newton method to solve u'' + u^{2} = f, sequentially.
import NonlinearSolve as NLS
import PETSc
import LinearAlgebra
import SparseConnectivityTracer
import BenchmarkTools
u0 = fill(0.5, 128)
function form_residual!(resid, x, _)
n = length(x)
xp = LinRange(0.0, 1.0, n)
F = 6xp .+ (xp .+ 1e-12) .^ 6
dx = 1 / (n - 1)
resid[1] = x[1]
for i in 2:(n - 1)
resid[i] = (x[i - 1] - 2x[i] + x[i + 1]) / dx^2 + x[i] * x[i] - F[i]
end
resid[n] = x[n] - 1
return
endform_residual! (generic function with 1 method)To use automatic sparsity detection, we need to specify sparsity keyword argument to NonlinearFunction. See Automatic Sparsity Detection for more details.
nlfunc_dense = NLS.NonlinearFunction(form_residual!)
nlfunc_sparse = NLS.NonlinearFunction(
form_residual!; sparsity = SparseConnectivityTracer.TracerSparsityDetector())
nlprob_dense = NLS.NonlinearProblem(nlfunc_dense, u0)
nlprob_sparse = NLS.NonlinearProblem(nlfunc_sparse, u0)NonlinearProblem with uType Vector{Float64}. In-place: true
u0: 128-element Vector{Float64}:
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
⋮
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5Now we can solve the problem using PETScSNES or with one of the native NonlinearSolve.jl solvers.
sol_dense_nr = NLS.solve(nlprob_dense, NLS.NewtonRaphson(); abstol = 1e-8)
sol_dense_snes = NLS.solve(nlprob_dense, NLS.PETScSNES(); abstol = 1e-8)
sol_dense_nr .- sol_dense_snes128-element Vector{Float64}:
-5.522508045665029e-19
5.589962171669233e-17
1.123521441897049e-16
1.68803501992415e-16
2.252565538610196e-16
2.817096057296242e-16
3.3816265759822883e-16
3.946224857304115e-16
4.510823138625941e-16
5.075150369404646e-16
⋮
9.992007221626409e-16
8.881784197001252e-16
7.771561172376096e-16
6.661338147750939e-16
5.551115123125783e-16
3.3306690738754696e-16
2.220446049250313e-16
1.1102230246251565e-16
0.0sol_sparse_nr = NLS.solve(nlprob_sparse, NLS.NewtonRaphson(); abstol = 1e-8)
sol_sparse_snes = NLS.solve(nlprob_sparse, NLS.PETScSNES(); abstol = 1e-8)
sol_sparse_nr .- sol_sparse_snes128-element Vector{Float64}:
-1.1350517561031018e-43
-1.5183700854243774e-17
-3.0367825224961176e-17
-4.5550043771547255e-17
-6.073565044992235e-17
-7.591448086471941e-17
-9.110008754309451e-17
-1.0627891795789157e-16
-1.214306433183765e-16
-1.3660947373317356e-16
⋮
-3.3306690738754696e-16
-2.220446049250313e-16
-2.220446049250313e-16
-2.220446049250313e-16
-2.220446049250313e-16
-1.1102230246251565e-16
-1.1102230246251565e-16
-1.1102230246251565e-16
0.0As expected the solutions are the same (upto floating point error). Now let's compare the runtimes.
Runtimes
Dense Jacobian
BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.NewtonRaphson()); abstol = 1e-8)BenchmarkTools.Trial: 5592 samples with 1 evaluation per sample.
Range (min … max): 759.413 μs … 12.050 ms ┊ GC (min … max): 0.00% … 91.93%
Time (median): 864.042 μs ┊ GC (median): 0.00%
Time (mean ± σ): 884.992 μs ± 348.348 μs ┊ GC (mean ± σ): 2.26% ± 5.16%
▂▆█▆▃▂▁▂▄▅▅▅▃▁ ▁▁
▂▂▂▂▃▂▃▃▃▃▄▄▃▃▃▃▃▃▃▃▃▄▅████████████████▆██▇██▇▆▄▄▄▃▂▃▂▂▂▂▂▂▁▂ ▄
759 μs Histogram: frequency by time 955 μs <
Memory estimate: 354.62 KiB, allocs estimate: 167.BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.PETScSNES()); abstol = 1e-8)BenchmarkTools.Trial: 1108 samples with 1 evaluation per sample.
Range (min … max): 4.076 ms … 44.063 ms ┊ GC (min … max): 0.00% … 20.57%
Time (median): 4.436 ms ┊ GC (median): 0.00%
Time (mean ± σ): 4.510 ms ± 1.212 ms ┊ GC (mean ± σ): 0.18% ± 0.62%
▅▅▂▆▆▇▇▅██▇▇▄█▃ ▃▂ ▂
▃▃▃▃▃▄▅▆▇███████████████████▇███▆▆▅▇▅▆▅▄▃▅▄▅▃▄▃▃▃▃▃▁▂▂▃▃▂▃ ▅
4.08 ms Histogram: frequency by time 5.09 ms <
Memory estimate: 351.95 KiB, allocs estimate: 276.Sparse Jacobian
BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.NewtonRaphson()); abstol = 1e-8)BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 319.602 μs … 56.321 ms ┊ GC (min … max): 0.00% … 56.91%
Time (median): 359.046 μs ┊ GC (median): 0.00%
Time (mean ± σ): 421.117 μs ± 1.015 ms ┊ GC (mean ± σ): 10.37% ± 5.49%
▂▅▅▆█▇▇▇▆▅▄▄▃▁▁
▂▂▂▂▃▃▃▄▄▄▅▆▇████████████████▇▆▅▅▄▄▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂ ▄
320 μs Histogram: frequency by time 433 μs <
Memory estimate: 523.31 KiB, allocs estimate: 2672.BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.PETScSNES()); abstol = 1e-8)BenchmarkTools.Trial: 9663 samples with 1 evaluation per sample.
Range (min … max): 338.458 μs … 423.226 ms ┊ GC (min … max): 0.00% … 25.59%
Time (median): 418.965 μs ┊ GC (median): 0.00%
Time (mean ± σ): 513.219 μs ± 4.983 ms ┊ GC (mean ± σ): 3.36% ± 0.35%
▃▆▇█▇▄▂▁
▁▁▃▅█████████▇▆▅▅▅▅▆▅▅▅▅▄▄▄▅▄▅▄▅▅▅▅▅▅▅▅▄▄▄▃▃▄▄▄▄▄▅▄▄▄▄▄▄▃▃▂▂▂ ▄
338 μs Histogram: frequency by time 601 μs <
Memory estimate: 180.09 KiB, allocs estimate: 2224.