PETSc SNES Example 2
This implements src/snes/examples/tutorials/ex2.c from PETSc and examples/SNES_ex2.jl from PETSc.jl using automatic sparsity detection and automatic differentiation using NonlinearSolve.jl.
This solves the equations sequentially. Newton method to solve u'' + u^{2} = f, sequentially.
import NonlinearSolve as NLS
import PETSc
import LinearAlgebra
import SparseConnectivityTracer
import BenchmarkTools
u0 = fill(0.5, 128)
function form_residual!(resid, x, _)
n = length(x)
xp = LinRange(0.0, 1.0, n)
F = 6xp .+ (xp .+ 1e-12) .^ 6
dx = 1 / (n - 1)
resid[1] = x[1]
for i in 2:(n - 1)
resid[i] = (x[i - 1] - 2x[i] + x[i + 1]) / dx^2 + x[i] * x[i] - F[i]
end
resid[n] = x[n] - 1
return
endform_residual! (generic function with 1 method)To use automatic sparsity detection, we need to specify sparsity keyword argument to NonlinearFunction. See Automatic Sparsity Detection for more details.
nlfunc_dense = NLS.NonlinearFunction(form_residual!)
nlfunc_sparse = NLS.NonlinearFunction(
form_residual!; sparsity = SparseConnectivityTracer.TracerSparsityDetector())
nlprob_dense = NLS.NonlinearProblem(nlfunc_dense, u0)
nlprob_sparse = NLS.NonlinearProblem(nlfunc_sparse, u0)NonlinearProblem with uType Vector{Float64}. In-place: true
u0: 128-element Vector{Float64}:
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
⋮
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5
0.5Now we can solve the problem using PETScSNES or with one of the native NonlinearSolve.jl solvers.
sol_dense_nr = NLS.solve(nlprob_dense, NLS.NewtonRaphson(); abstol = 1e-8)
sol_dense_snes = NLS.solve(nlprob_dense, NLS.PETScSNES(); abstol = 1e-8)
sol_dense_nr .- sol_dense_snes128-element Vector{Float64}:
-3.3683470657097324e-18
-2.4763537608571504e-17
-4.615821342767584e-17
-6.755426567531947e-17
-8.89520119888576e-17
-1.1034467610471221e-16
-1.3174411648414486e-16
-1.5311645180926536e-16
-1.7450233966154194e-16
-1.9591533256813065e-16
⋮
-6.661338147750939e-16
-6.661338147750939e-16
-5.551115123125783e-16
-4.440892098500626e-16
-3.3306690738754696e-16
-2.220446049250313e-16
-1.1102230246251565e-16
-1.1102230246251565e-16
0.0sol_sparse_nr = NLS.solve(nlprob_sparse, NLS.NewtonRaphson(); abstol = 1e-8)
sol_sparse_snes = NLS.solve(nlprob_sparse, NLS.PETScSNES(); abstol = 1e-8)
sol_sparse_nr .- sol_sparse_snes128-element Vector{Float64}:
-2.0178697886277366e-42
-1.8943680107105614e-17
-3.7887783730684854e-17
-5.683252262897454e-17
-7.577895559315873e-17
-9.471861229376488e-17
-1.1366504525794907e-16
-1.325979256949772e-16
-1.5157146371347352e-16
-1.7049079162334557e-16
⋮
-9.992007221626409e-16
-7.771561172376096e-16
-6.661338147750939e-16
-5.551115123125783e-16
-4.440892098500626e-16
-3.3306690738754696e-16
-2.220446049250313e-16
-1.1102230246251565e-16
0.0As expected the solutions are the same (upto floating point error). Now let's compare the runtimes.
Runtimes
Dense Jacobian
BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.NewtonRaphson()); abstol = 1e-8)BenchmarkTools.Trial: 5059 samples with 1 evaluation per sample.
Range (min … max): 893.697 μs … 16.744 ms ┊ GC (min … max): 0.00% … 94.03%
Time (median): 923.803 μs ┊ GC (median): 0.00%
Time (mean ± σ): 985.624 μs ± 417.325 μs ┊ GC (mean ± σ): 3.58% ± 7.54%
▃▆█▇▄▂▁ ▂▄▃▁ ▁
▇████████▇▅▆▆▆▅▄▄▃▄▄▅▄▁▃▁▅▁▃▃▃▄▁▁▅████▆▅▅▅▅▆▆▆▃▅▃▁▁▃▁▅▁▁▁▁▁▃▃ █
894 μs Histogram: log(frequency) by time 1.37 ms <
Memory estimate: 741.07 KiB, allocs estimate: 180.BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.PETScSNES()); abstol = 1e-8)BenchmarkTools.Trial: 1192 samples with 1 evaluation per sample.
Range (min … max): 4.013 ms … 5.138 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 4.186 ms ┊ GC (median): 0.00%
Time (mean ± σ): 4.190 ms ± 76.569 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▃▃▂▂▂▃▃▅█▄▆▄▃▁▂▅▁
▂▁▃▃▃▃▄▁▃▄▃▄▆▇▆██████████████████▆▆▅▃▅▃▃▃▃▂▃▃▃▂▃▂▁▂▁▃▁▂▁▁▂ ▄
4.01 ms Histogram: frequency by time 4.44 ms <
Memory estimate: 220.95 KiB, allocs estimate: 251.Sparse Jacobian
BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.NewtonRaphson()); abstol = 1e-8)BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 317.202 μs … 52.645 ms ┊ GC (min … max): 0.00% … 58.93%
Time (median): 342.028 μs ┊ GC (median): 0.00%
Time (mean ± σ): 395.591 μs ± 874.954 μs ┊ GC (mean ± σ): 8.50% ± 5.03%
▁▇██▆▃▁▁▁▂▂▂
▂▂▂▃▄▆█████████████▇▅▄▄▄▄▄▄▃▃▃▃▃▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂ ▄
317 μs Histogram: frequency by time 440 μs <
Memory estimate: 532.73 KiB, allocs estimate: 2684.BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.PETScSNES()); abstol = 1e-8)BenchmarkTools.Trial: 9167 samples with 1 evaluation per sample.
Range (min … max): 311.330 μs … 238.807 ms ┊ GC (min … max): 0.00% … 22.31%
Time (median): 519.669 μs ┊ GC (median): 0.00%
Time (mean ± σ): 540.810 μs ± 2.492 ms ┊ GC (mean ± σ): 1.07% ± 0.23%
▁▁▄▄▄▄▂▃▃▃▄▆▄▇█▇▅▂ ▁▂ ▁▃▆▇▆▄▁▂▁▁ ▁▁
▅██████████████████▇▅▃▂▅▄▆▇█████████████████▇▇▇█▆▆▆▅██▇█▇▆▅▃▂ ▆
311 μs Histogram: frequency by time 761 μs <
Memory estimate: 176.98 KiB, allocs estimate: 2199.