PETSc SNES Example 2

This implements src/snes/examples/tutorials/ex2.c from PETSc and examples/SNES_ex2.jl from PETSc.jl using automatic sparsity detection and automatic differentiation using NonlinearSolve.jl.

This solves the equations sequentially. Newton method to solve u'' + u^{2} = f, sequentially.

import NonlinearSolve as NLS
import PETSc
import LinearAlgebra
import SparseConnectivityTracer
import BenchmarkTools

u0 = fill(0.5, 128)

function form_residual!(resid, x, _)
    n = length(x)
    xp = LinRange(0.0, 1.0, n)
    F = 6xp .+ (xp .+ 1e-12) .^ 6

    dx = 1 / (n - 1)
    resid[1] = x[1]
    for i in 2:(n - 1)
        resid[i] = (x[i - 1] - 2x[i] + x[i + 1]) / dx^2 + x[i] * x[i] - F[i]
    end
    resid[n] = x[n] - 1

    return
end
form_residual! (generic function with 1 method)

To use automatic sparsity detection, we need to specify sparsity keyword argument to NonlinearFunction. See Automatic Sparsity Detection for more details.

nlfunc_dense = NLS.NonlinearFunction(form_residual!)
nlfunc_sparse = NLS.NonlinearFunction(
    form_residual!; sparsity = SparseConnectivityTracer.TracerSparsityDetector())

nlprob_dense = NLS.NonlinearProblem(nlfunc_dense, u0)
nlprob_sparse = NLS.NonlinearProblem(nlfunc_sparse, u0)
NonlinearProblem with uType Vector{Float64}. In-place: true
u0: 128-element Vector{Float64}:
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 ⋮
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5

Now we can solve the problem using PETScSNES or with one of the native NonlinearSolve.jl solvers.

sol_dense_nr = NLS.solve(nlprob_dense, NLS.NewtonRaphson(); abstol = 1e-8)
sol_dense_snes = NLS.solve(nlprob_dense, NLS.PETScSNES(); abstol = 1e-8)
sol_dense_nr .- sol_dense_snes
128-element Vector{Float64}:
  4.1386421557519005e-19
 -2.1053427420479262e-17
 -4.252105395216588e-17
 -6.398825696737886e-17
 -8.545545998259185e-17
 -1.0692266299780484e-16
 -1.2839664227659586e-16
 -1.4986384529180885e-16
 -1.713039432527097e-16
 -1.927711462679227e-16
  ⋮
 -6.661338147750939e-16
 -6.661338147750939e-16
 -5.551115123125783e-16
 -4.440892098500626e-16
 -3.3306690738754696e-16
 -2.220446049250313e-16
 -1.1102230246251565e-16
 -1.1102230246251565e-16
  0.0
sol_sparse_nr = NLS.solve(nlprob_sparse, NLS.NewtonRaphson(); abstol = 1e-8)
sol_sparse_snes = NLS.solve(nlprob_sparse, NLS.PETScSNES(); abstol = 1e-8)
sol_sparse_nr .- sol_sparse_snes
128-element Vector{Float64}:
 -1.9281866869109483e-42
 -6.886907256769496e-18
 -1.3773602755302178e-17
 -2.0660827649426894e-17
 -2.755228770828788e-17
 -3.4436971503570835e-17
 -4.132165529885379e-17
 -4.819278656698067e-17
 -5.51316804708879e-17
 -6.196215415754658e-17
  ⋮
 -5.551115123125783e-16
 -4.440892098500626e-16
 -4.440892098500626e-16
 -3.3306690738754696e-16
 -2.220446049250313e-16
 -2.220446049250313e-16
 -1.1102230246251565e-16
  0.0
  0.0

As expected the solutions are the same (upto floating point error). Now let's compare the runtimes.

Runtimes

Dense Jacobian

BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.NewtonRaphson()); abstol = 1e-8)
BenchmarkTools.Trial: 5180 samples with 1 evaluation per sample.
 Range (minmax):  866.912 μs  2.718 ms   GC (min … max): 0.00% … 56.56%
 Time  (median):     908.490 μs                GC (median):    0.00%
 Time  (mean ± σ):   961.776 μs ± 145.369 μs   GC (mean ± σ):  0.94% ±  4.57%

      ▂▄▇█                                                      
  ▁▂▄▇████▇▄▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▃▄▆▇▆▆▄▃▂▂▂▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁ ▂
  867 μs           Histogram: frequency by time         1.17 ms <

 Memory estimate: 739.89 KiB, allocs estimate: 209.
BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.PETScSNES()); abstol = 1e-8)
BenchmarkTools.Trial: 1058 samples with 1 evaluation per sample.
 Range (minmax):  4.545 ms 30.635 ms   GC (min … max): 0.00% … 7.94%
 Time  (median):     4.667 ms                GC (median):    0.00%
 Time  (mean ± σ):   4.725 ms ± 819.867 μs   GC (mean ± σ):  0.05% ± 0.24%

    ▂▄▄▅▆▆▃█▃▁        ▃▃▆▁                                     
  ▃▅███████████▅▆▇█▇█████▇▆█▆▅▄▃▂▃▃▃▂▂▂▃▃▂▂▂▃▂▃▁▂▃▁▂▂▁▂▁▁▁▂ ▄
  4.55 ms         Histogram: frequency by time         5.1 ms <

 Memory estimate: 219.96 KiB, allocs estimate: 280.

Sparse Jacobian

BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.NewtonRaphson()); abstol = 1e-8)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (minmax):  409.471 μs  8.487 ms   GC (min … max): 0.00% … 42.57%
 Time  (median):     429.859 μs                GC (median):    0.00%
 Time  (mean ± σ):   462.912 μs ± 463.776 μs   GC (mean ± σ):  2.95% ±  2.77%

       ▂▅██▅▂   ▃▄▅▄                                             
  ▂▂▂▃▅█████████████▇▄▄▃▃▃▃▃▂▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▂▂▂▂▂ ▄
  409 μs           Histogram: frequency by time          512 μs <

 Memory estimate: 540.27 KiB, allocs estimate: 1963.
BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.PETScSNES()); abstol = 1e-8)
BenchmarkTools.Trial: 9992 samples with 1 evaluation per sample.
 Range (minmax):  392.600 μs54.128 ms   GC (min … max): 0.00% … 16.30%
 Time  (median):     426.047 μs               GC (median):    0.00%
 Time  (mean ± σ):   496.275 μs ±  1.796 ms   GC (mean ± σ):  2.19% ±  0.60%

          ▄▆█▇█▇▆▆▅▃▄▃▂                                        
  ▂▂▃▃▄▄▅███████████████▆▅▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  393 μs          Histogram: frequency by time          526 μs <

 Memory estimate: 179.79 KiB, allocs estimate: 1496.