PETSc SNES Example 2

This implements src/snes/examples/tutorials/ex2.c from PETSc and examples/SNES_ex2.jl from PETSc.jl using automatic sparsity detection and automatic differentiation using NonlinearSolve.jl.

This solves the equations sequentially. Newton method to solve u'' + u^{2} = f, sequentially.

import NonlinearSolve as NLS
import PETSc
import LinearAlgebra
import SparseConnectivityTracer
import BenchmarkTools

u0 = fill(0.5, 128)

function form_residual!(resid, x, _)
    n = length(x)
    xp = LinRange(0.0, 1.0, n)
    F = 6xp .+ (xp .+ 1e-12) .^ 6

    dx = 1 / (n - 1)
    resid[1] = x[1]
    for i in 2:(n - 1)
        resid[i] = (x[i - 1] - 2x[i] + x[i + 1]) / dx^2 + x[i] * x[i] - F[i]
    end
    resid[n] = x[n] - 1

    return
end
form_residual! (generic function with 1 method)

To use automatic sparsity detection, we need to specify sparsity keyword argument to NonlinearFunction. See Automatic Sparsity Detection for more details.

nlfunc_dense = NLS.NonlinearFunction(form_residual!)
nlfunc_sparse = NLS.NonlinearFunction(
    form_residual!; sparsity = SparseConnectivityTracer.TracerSparsityDetector())

nlprob_dense = NLS.NonlinearProblem(nlfunc_dense, u0)
nlprob_sparse = NLS.NonlinearProblem(nlfunc_sparse, u0)
NonlinearProblem with uType Vector{Float64}. In-place: true
u0: 128-element Vector{Float64}:
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 ⋮
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5

Now we can solve the problem using PETScSNES or with one of the native NonlinearSolve.jl solvers.

sol_dense_nr = NLS.solve(nlprob_dense, NLS.NewtonRaphson(); abstol = 1e-8)
sol_dense_snes = NLS.solve(nlprob_dense, NLS.PETScSNES(); abstol = 1e-8)
sol_dense_nr .- sol_dense_snes
128-element Vector{Float64}:
 -5.522508045665029e-19
  5.589962171669233e-17
  1.123521441897049e-16
  1.68803501992415e-16
  2.252565538610196e-16
  2.817096057296242e-16
  3.3816265759822883e-16
  3.946224857304115e-16
  4.510823138625941e-16
  5.075150369404646e-16
  ⋮
  9.992007221626409e-16
  8.881784197001252e-16
  7.771561172376096e-16
  6.661338147750939e-16
  5.551115123125783e-16
  3.3306690738754696e-16
  2.220446049250313e-16
  1.1102230246251565e-16
  0.0
sol_sparse_nr = NLS.solve(nlprob_sparse, NLS.NewtonRaphson(); abstol = 1e-8)
sol_sparse_snes = NLS.solve(nlprob_sparse, NLS.PETScSNES(); abstol = 1e-8)
sol_sparse_nr .- sol_sparse_snes
128-element Vector{Float64}:
 -1.1350517561031018e-43
 -1.5183700854243774e-17
 -3.0367825224961176e-17
 -4.5550043771547255e-17
 -6.073565044992235e-17
 -7.591448086471941e-17
 -9.110008754309451e-17
 -1.0627891795789157e-16
 -1.214306433183765e-16
 -1.3660947373317356e-16
  ⋮
 -3.3306690738754696e-16
 -2.220446049250313e-16
 -2.220446049250313e-16
 -2.220446049250313e-16
 -2.220446049250313e-16
 -1.1102230246251565e-16
 -1.1102230246251565e-16
 -1.1102230246251565e-16
  0.0

As expected the solutions are the same (upto floating point error). Now let's compare the runtimes.

Runtimes

Dense Jacobian

BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.NewtonRaphson()); abstol = 1e-8)
BenchmarkTools.Trial: 5592 samples with 1 evaluation per sample.
 Range (minmax):  759.413 μs 12.050 ms   GC (min … max): 0.00% … 91.93%
 Time  (median):     864.042 μs                GC (median):    0.00%
 Time  (mean ± σ):   884.992 μs ± 348.348 μs   GC (mean ± σ):  2.26% ±  5.16%

                          ▂▆█▆▃▂▁▂▄▅▅▃▁     ▁▁                  
  ▂▂▂▂▃▂▃▃▃▃▄▄▃▃▃▃▃▃▃▃▃▄▅█████████████████▇██▇▆▄▄▄▃▂▃▂▂▂▂▂▂▁▂ ▄
  759 μs           Histogram: frequency by time          955 μs <

 Memory estimate: 354.62 KiB, allocs estimate: 167.
BenchmarkTools.@benchmark NLS.solve($(nlprob_dense), $(NLS.PETScSNES()); abstol = 1e-8)
BenchmarkTools.Trial: 1108 samples with 1 evaluation per sample.
 Range (minmax):  4.076 ms44.063 ms   GC (min … max): 0.00% … 20.57%
 Time  (median):     4.436 ms               GC (median):    0.00%
 Time  (mean ± σ):   4.510 ms ±  1.212 ms   GC (mean ± σ):  0.18% ±  0.62%

            ▅▅▂▆▆▇▇▅██▇▄█▃ ▃▂  ▂                             
  ▃▃▃▃▃▄▅▆▇█████████████████▇███▆▆▅▇▅▆▅▄▃▅▄▅▃▄▃▃▃▃▃▁▂▂▃▃▂▃ ▅
  4.08 ms        Histogram: frequency by time        5.09 ms <

 Memory estimate: 351.95 KiB, allocs estimate: 276.

Sparse Jacobian

BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.NewtonRaphson()); abstol = 1e-8)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (minmax):  319.602 μs56.321 ms   GC (min … max):  0.00% … 56.91%
 Time  (median):     359.046 μs               GC (median):     0.00%
 Time  (mean ± σ):   421.117 μs ±  1.015 ms   GC (mean ± σ):  10.37% ±  5.49%

                ▂▅▅▆█▇▇▆▅▄▄▃▁▁                                 
  ▂▂▂▂▃▃▃▄▄▄▅▆▇███████████████▇▆▅▅▄▄▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂ ▄
  320 μs          Histogram: frequency by time          433 μs <

 Memory estimate: 523.31 KiB, allocs estimate: 2672.
BenchmarkTools.@benchmark NLS.solve($(nlprob_sparse), $(NLS.PETScSNES()); abstol = 1e-8)
BenchmarkTools.Trial: 9663 samples with 1 evaluation per sample.
 Range (minmax):  338.458 μs423.226 ms   GC (min … max): 0.00% … 25.59%
 Time  (median):     418.965 μs                GC (median):    0.00%
 Time  (mean ± σ):   513.219 μs ±   4.983 ms   GC (mean ± σ):  3.36% ±  0.35%

       ▃▆▇█▇▄▂▁                                                  
  ▁▁▃▅█████████▇▆▅▅▅▅▅▅▅▅▄▄▄▅▄▅▄▅▅▅▅▅▅▅▅▄▄▃▃▄▄▄▄▄▅▄▄▄▄▄▄▃▃▂▂▂ ▄
  338 μs           Histogram: frequency by time          601 μs <

 Memory estimate: 180.09 KiB, allocs estimate: 2224.