PETSc SNES Example 2

This implements src/snes/examples/tutorials/ex2.c from PETSc and examples/SNES_ex2.jl from PETSc.jl using automatic sparsity detection and automatic differentiation using NonlinearSolve.jl.

This solves the equations sequentially. Newton method to solve u'' + u^{2} = f, sequentially.

import NonlinearSolve as NLS
import PETSc
import LinearAlgebra
import SparseConnectivityTracer
import BenchmarkTools

u0 = fill(0.5, 128)

function form_residual!(resid, x, _)
    n = length(x)
    xp = LinRange(0.0, 1.0, n)
    F = 6xp .+ (xp .+ 1e-12) .^ 6

    dx = 1 / (n - 1)
    resid[1] = x[1]
    for i in 2:(n - 1)
        resid[i] = (x[i - 1] - 2x[i] + x[i + 1]) / dx^2 + x[i] * x[i] - F[i]
    end
    resid[n] = x[n] - 1

    return
end
form_residual! (generic function with 1 method)

To use automatic sparsity detection, we need to specify sparsity keyword argument to NonlinearFunction. See Automatic Sparsity Detection for more details.

nlfunc_dense = NLS.NonlinearFunction(form_residual!)
nlfunc_sparse = NLS.NonlinearFunction(
    form_residual!; sparsity = SparseConnectivityTracer.TracerSparsityDetector())

nlprob_dense = NLS.NonlinearProblem(nlfunc_dense, u0)
nlprob_sparse = NLS.NonlinearProblem(nlfunc_sparse, u0)
NonlinearProblem with uType Vector{Float64}. In-place: true
u0: 128-element Vector{Float64}:
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 ⋮
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5

Now we can solve the problem using PETScSNES or with one of the native NonlinearSolve.jl solvers.

sol_dense_nr = NLS.solve(nlprob_dense, NLS.NewtonRaphson(); abstol = 1e-8)
sol_dense_snes = NLS.solve(nlprob_dense, NLS.PETScSNES(); abstol = 1e-8)
sol_dense_nr .- sol_dense_snes
128-element Vector{Float64}:
 -3.3683470657097324e-18
 -2.4763537608571504e-17
 -4.615821342767584e-17
 -6.755426567531947e-17
 -8.89520119888576e-17
 -1.1034467610471221e-16
 -1.3174411648414486e-16
 -1.5311645180926536e-16
 -1.7450233966154194e-16
 -1.9591533256813065e-16
  ⋮
 -6.661338147750939e-16
 -6.661338147750939e-16
 -5.551115123125783e-16
 -4.440892098500626e-16
 -3.3306690738754696e-16
 -2.220446049250313e-16
 -1.1102230246251565e-16
 -1.1102230246251565e-16
  0.0
sol_sparse_nr = NLS.solve(nlprob_sparse, NLS.NewtonRaphson(); abstol = 1e-8)
sol_sparse_snes = NLS.solve(nlprob_sparse, NLS.PETScSNES(); abstol = 1e-8)
sol_sparse_nr .- sol_sparse_snes
128-element Vector{Float64}:
 -2.0178697886277366e-42
 -1.8943680107105614e-17
 -3.7887783730684854e-17
 -5.683252262897454e-17
 -7.577895559315873e-17
 -9.471861229376488e-17
 -1.1366504525794907e-16
 -1.325979256949772e-16
 -1.5157146371347352e-16
 -1.7049079162334557e-16
  ⋮
 -9.992007221626409e-16
 -7.771561172376096e-16
 -6.661338147750939e-16
 -5.551115123125783e-16
 -4.440892098500626e-16
 -3.3306690738754696e-16
 -2.220446049250313e-16
 -1.1102230246251565e-16
  0.0

As expected the solutions are the same (upto floating point error). Now let's compare the runtimes.

Runtimes

Dense Jacobian

BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.NewtonRaphson()); abstol = 1e-8)
BenchmarkTools.Trial: 5057 samples with 1 evaluation per sample.
 Range (minmax):  884.401 μs 17.611 ms   GC (min … max): 0.00% … 94.29%
 Time  (median):     916.510 μs                GC (median):    0.00%
 Time  (mean ± σ):   985.862 μs ± 473.808 μs   GC (mean ± σ):  4.22% ±  7.95%

     ▆█                                                         
  ▂▃▇██▆▄▃▃▂▂▂▂▂▂▁▂▁▂▂▂▂▁▂▁▁▁▂▂▂▁▁▂▁▁▂▃▄▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▂▁▁▂▁▂ ▃
  884 μs           Histogram: frequency by time         1.31 ms <

 Memory estimate: 741.07 KiB, allocs estimate: 180.
BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.PETScSNES()); abstol = 1e-8)
BenchmarkTools.Trial: 1237 samples with 1 evaluation per sample.
 Range (minmax):  3.867 ms 4.615 ms   GC (min … max): 0.00% … 0.00%
 Time  (median):     4.034 ms               GC (median):    0.00%
 Time  (mean ± σ):   4.039 ms ± 77.441 μs   GC (mean ± σ):  0.00% ± 0.00%

             ▂▃▆▇▆▅▂▃▅▄█▅▅▁▁▃▄▄▄▃                            
  ▂▂▃▄▃▄▅▅▅▅▇██████████████████████▇▄▄▄▃▂▂▃▃▃▂▂▂▂▂▂▁▂▁▂▁▁▂▂ ▅
  3.87 ms        Histogram: frequency by time        4.31 ms <

 Memory estimate: 220.95 KiB, allocs estimate: 251.

Sparse Jacobian

BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.NewtonRaphson()); abstol = 1e-8)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (minmax):  316.971 μs 52.688 ms   GC (min … max): 0.00% … 60.01%
 Time  (median):     346.567 μs                GC (median):    0.00%
 Time  (mean ± σ):   402.478 μs ± 909.340 μs   GC (mean ± σ):  8.93% ±  5.14%

          ▂▅▇█▇▅▃▂▂▃▂▁                                          
  ▁▁▁▂▂▃▄▆█████████████▆▆▆▅▅▄▃▃▂▃▂▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  317 μs           Histogram: frequency by time          434 μs <

 Memory estimate: 532.73 KiB, allocs estimate: 2684.
BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.PETScSNES()); abstol = 1e-8)
BenchmarkTools.Trial: 9106 samples with 1 evaluation per sample.
 Range (minmax):  309.707 μs219.716 ms   GC (min … max): 0.00% … 23.97%
 Time  (median):     521.364 μs                GC (median):    0.00%
 Time  (mean ± σ):   544.468 μs ±   2.301 ms   GC (mean ± σ):  1.06% ±  0.25%

   ▂ ▂▃▁▃▂▁▃▃▅▅█▇▄▁       ▃▁▂▁▁▃▄▆▄▂▁ ▁ ▁▁                     
  ▅████████████████▅▃▂▂▄▇███████████████████▇▆▄▅▆▅▅▄▃▄▅▆▇▆▅▅▄ ▅
  310 μs           Histogram: frequency by time          785 μs <

 Memory estimate: 176.98 KiB, allocs estimate: 2199.