Linear System Solvers
LS.solve(prob::LS.LinearProblem, alg; kwargs...)

Solves for $Au = b$ in the problem defined by `prob` using the algorithm `alg`. If no algorithm is given, a default algorithm will be chosen.
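For example, a minimal end-to-end solve (a sketch; `LS` is assumed to be an alias for the LinearSolve module, e.g. `import LinearSolve as LS`):

```julia
import LinearSolve as LS

A = rand(4, 4)
b = rand(4)
prob = LS.LinearProblem(A, b)

sol = LS.solve(prob)                        # default algorithm
sol = LS.solve(prob, LS.LUFactorization())  # explicit algorithm choice
sol.u                                       # the solution vector
```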
Recommended Methods
Dense Matrices
The default algorithm `nothing` is good for picking an algorithm that will work, but one may need to change this to receive more performance or precision. If more precision is necessary, `LS.QRFactorization()` and `LS.SVDFactorization()` are the best choices, with SVD being the slowest but most precise.

For efficiency, `RFLUFactorization` is the fastest for dense LU-factorizations until around 150x150 matrices, though this can be dependent on the exact details of the hardware. After this point, `MKLLUFactorization` is usually faster on most hardware. Note that on Mac computers, `AppleAccelerateLUFactorization` is generally always the fastest. `LUFactorization` will use your base system BLAS, which can be fast or slow depending on the hardware configuration. `SimpleLUFactorization` will be fast only on very small matrices but can cut down on compile times.
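A sketch of picking among these (assuming RecursiveFactorization.jl is installed for `RFLUFactorization`):

```julia
using LinearSolve
using RecursiveFactorization  # needed for RFLUFactorization

A = rand(100, 100)
b = rand(100)
prob = LinearProblem(A, b)

sol = solve(prob, RFLUFactorization())   # typically fastest below ~150x150
sol = solve(prob, MKLLUFactorization())  # typically faster above that size
sol = solve(prob, SVDFactorization())    # slowest, but most precise
```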
For very large dense factorizations, offloading to the GPU can be preferred. Metal.jl can be used on Mac hardware to offload, and has a cutoff point of being faster at around 20,000 x 20,000 matrices (and only supports Float32). `CudaOffloadFactorization` can be more efficient at a much smaller cutoff, possibly around 1,000 x 1,000 matrices, though this is highly dependent on the chosen GPU hardware. `CudaOffloadFactorization` requires a CUDA-compatible NVIDIA GPU. CUDA offload supports Float64, but most consumer GPU hardware will be much faster on Float32 (many are >32x faster for Float32 operations than Float64 operations), and thus for most hardware this is only recommended for Float32 matrices.
Performance details for dense LU-factorizations can be highly dependent on the hardware configuration. For details see this issue. If one is looking to best optimize their system, we suggest running the performance tuning benchmark.
Sparse Matrices
For sparse LU-factorizations, use `KLUFactorization` if there is less structure to the sparsity pattern and `UMFPACKFactorization` if there is more structure. Pardiso.jl's methods are also known to be very efficient sparse linear solvers.
While these sparse factorizations are based on implementations in other languages, and therefore constrained to standard number types (`Float64`, `Float32`, and their complex counterparts), `SparspakFactorization` is able to handle general number types, e.g. those defined by ForwardDiff.jl, MultiFloats.jl, or IntervalArithmetic.jl.
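For example, a sketch of a sparse direct solve:

```julia
using LinearSolve, SparseArrays, LinearAlgebra

A = sprand(1000, 1000, 0.005) + 10I   # a random sparse, diagonally dominant matrix
b = rand(1000)
prob = LinearProblem(A, b)

sol = solve(prob, KLUFactorization())      # good for less structured sparsity
sol = solve(prob, UMFPACKFactorization())  # good for more structured sparsity
```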
As sparse matrices get larger, iterative solvers tend to become more efficient than factorization methods if the solution is only required to a loose tolerance.

Krylov.jl generally outperforms IterativeSolvers.jl and KrylovKit.jl, and is compatible with CPUs and GPUs, and thus is the generally preferred form for Krylov methods. The choice of Krylov method should be the one most constrained by the properties of the operator one has: for example, if the operator is positive definite, use `KrylovJL_CG()`, but if it has no good properties, use `KrylovJL_GMRES()`.
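A sketch of an iterative solve (tolerances are passed through the common `solve` keyword arguments):

```julia
using LinearSolve, SparseArrays, LinearAlgebra

A = sprandn(10_000, 10_000, 0.001) + 20I   # nonsymmetric, diagonally dominant
b = randn(10_000)
prob = LinearProblem(A, b)

sol = solve(prob, KrylovJL_GMRES(); reltol = 1e-8)
# If A were symmetric positive definite, KrylovJL_CG() would be preferred.
```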
Finally, a user can pass a custom function for handling the linear solve using `LS.LinearSolveFunction()` if existing solvers are not optimally suited for their application. The interface is detailed here.
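A minimal sketch of that interface (the custom function receives the cache fields and returns the solution):

```julia
using LinearSolve

# Custom solve: any callable with this signature can be wrapped.
function my_linsolve(A, b, u, p, newA, Pl, Pr, solverdata; verbose = true, kwargs...)
    u = A \ b   # replace with any custom solving strategy
    return u
end

prob = LinearProblem(rand(4, 4), rand(4))
sol = solve(prob, LinearSolveFunction(my_linsolve))
```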
Lazy SciMLOperators
If the linear operator is given as a lazy non-concrete operator, such as a `FunctionOperator`, then using a Krylov method is preferred in order to not concretize the matrix. As noted above, Krylov.jl is the generally preferred choice, and the method should be the one most constrained by the operator's properties: `KrylovJL_CG()` if positive definite, `KrylovJL_GMRES()` if no good properties hold.
If your materialized operator is a uniform block diagonal matrix, then you can use `SimpleGMRES(; blocksize = <known block size>)` to further improve performance. This often shows up in neural networks, where the Jacobian with respect to the inputs is (almost always) a uniform block diagonal matrix with block size equal to the size of the input divided by the batch size.
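A sketch with a small block-diagonal system (the block size is an assumption you must know about your operator; misstating it gives incorrect results):

```julia
using LinearSolve, LinearAlgebra

bs = 4
blocks = [rand(bs, bs) + bs * I for _ in 1:8]
A = cat(blocks...; dims = (1, 2))   # 32x32 uniform block-diagonal matrix
b = rand(size(A, 1))

sol = solve(LinearProblem(A, b), SimpleGMRES(; blocksize = bs))
```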
Full List of Methods
Polyalgorithms
LinearSolve.DefaultLinearSolver — Type

DefaultLinearSolver(; safetyfallback = true)

The default linear solver. This is the algorithm chosen when `solve(prob)` is called. It's a polyalgorithm that detects the optimal method for a given `A`, `b` and hardware (Intel, AMD, GPU, etc.).
Keyword Arguments

- safetyfallback: determines whether to fall back to a column-pivoted QR factorization when an LU factorization fails. This can be required if `A` is rank-deficient. Defaults to `true`.
RecursiveFactorization.jl

Using this solver requires adding the package RecursiveFactorization.jl, i.e. `using RecursiveFactorization`.

LinearSolve.RFLUFactorization — Type

RFLUFactorization()

A fast pure Julia LU-factorization implementation using RecursiveFactorization.jl. This is by far the fastest LU-factorization implementation, usually outperforming OpenBLAS and MKL for smaller matrices (<500x500), but currently optimized only for Base `Array` with `Float32` or `Float64`. Additional optimization for complex matrices is in the works.
Base.LinearAlgebra

These overloads tend to work for many array types, such as `CuArray`s for GPU-accelerated solving, using the overloads provided by the respective packages. Given that this can be customized per-package, the details given below describe a subset of important arrays (`Matrix`, `SparseMatrixCSC`, `CuMatrix`, etc.).
LinearSolve.LUFactorization — Type

LUFactorization(pivot = LinearAlgebra.RowMaximum())

Julia's built-in `lu`. Equivalent to calling `lu!(A)`.

- On dense matrices, this uses the current BLAS implementation of the user's computer, which by default is OpenBLAS but will use MKL if the user does `using MKL` in their system.
- On sparse matrices, this will use UMFPACK from SparseArrays. Note that this will not cache the symbolic factorization.
- On CuMatrix, it will use a CUDA-accelerated LU from CuSolver.
- On BandedMatrix and BlockBandedMatrix, it will use a banded LU.

Positional Arguments

- pivot: The choice of pivoting. Defaults to `LinearAlgebra.RowMaximum()`. The other choice is `LinearAlgebra.NoPivot()`.
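For instance, pivoting can be disabled when the matrix is known not to need row exchanges (a sketch; `NoPivot` is unsafe for general matrices):

```julia
using LinearSolve, LinearAlgebra

A = rand(4, 4) + 4I   # diagonally dominant, so no pivoting is needed
b = rand(4)

sol = solve(LinearProblem(A, b), LUFactorization(LinearAlgebra.NoPivot()))
```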
LinearSolve.GenericLUFactorization — Type

GenericLUFactorization(pivot = LinearAlgebra.RowMaximum())

Julia's built-in generic LU factorization. Equivalent to calling `LinearAlgebra.generic_lufact!`. Supports arbitrary number types but does not achieve as good scaling as BLAS-based LU implementations. Has low overhead and is good for small matrices.

Positional Arguments

- pivot: The choice of pivoting. Defaults to `LinearAlgebra.RowMaximum()`. The other choice is `LinearAlgebra.NoPivot()`.
LinearSolve.QRFactorization — Type

QRFactorization(pivot = LinearAlgebra.NoPivot(), blocksize = 16)

Julia's built-in `qr`. Equivalent to calling `qr!(A)`.

- On dense matrices, this uses the current BLAS implementation of the user's computer, which by default is OpenBLAS but will use MKL if the user does `using MKL` in their system.
- On sparse matrices, this will use SPQR from SparseArrays.
- On CuMatrix, it will use a CUDA-accelerated QR from CuSolver.
- On BandedMatrix and BlockBandedMatrix, it will use a banded QR.
LinearSolve.SVDFactorization — Type

SVDFactorization(full = false, alg = LinearAlgebra.DivideAndConquer())

Julia's built-in `svd`. Equivalent to `svd!(A)`.

- On dense matrices, this uses the current BLAS implementation of the user's computer, which by default is OpenBLAS but will use MKL if the user does `using MKL` in their system.
LinearSolve.CholeskyFactorization — Type

CholeskyFactorization(; pivot = nothing, tol = 0.0, shift = 0.0, perm = nothing)

Julia's built-in `cholesky`. Equivalent to calling `cholesky!(A)`.

Keyword Arguments

- pivot: defaults to NoPivot, can also be RowMaximum.
- tol: the tol argument in CHOLMOD. Only used for sparse matrices.
- shift: the shift argument in CHOLMOD. Only used for sparse matrices.
- perm: the perm argument in CHOLMOD. Only used for sparse matrices.
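A sketch for a symmetric positive definite system:

```julia
using LinearSolve, LinearAlgebra

B = rand(50, 50)
A = Symmetric(B * B' + 50I)   # symmetric positive definite by construction
b = rand(50)

sol = solve(LinearProblem(A, b), CholeskyFactorization())
```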
LinearSolve.BunchKaufmanFactorization — Type

BunchKaufmanFactorization(; rook = false)

Julia's built-in `bunchkaufman`. Equivalent to calling `bunchkaufman(A)`. Only for Symmetric matrices.

Keyword Arguments

- rook: whether to perform rook pivoting. Defaults to false.
LinearSolve.CHOLMODFactorization — Type

CHOLMODFactorization(; shift = 0.0, perm = nothing)

A wrapper of CHOLMOD's polyalgorithm, mixing Cholesky factorization and ldlt. Tries cholesky for performance and retries ldlt if conditioning causes Cholesky to fail.

Only supports sparse matrices.

Keyword Arguments

- shift: the shift argument in CHOLMOD.
- perm: the perm argument in CHOLMOD.
LinearSolve.NormalCholeskyFactorization — Type

NormalCholeskyFactorization(pivot = RowMaximum())

A fast factorization which uses a Cholesky factorization on A * A'. Can be much faster than LU factorization, but is not as numerically stable and thus should only be applied to well-conditioned matrices.

`NormalCholeskyFactorization` should only be applied to well-conditioned matrices. As a method, it is not able to easily identify possible numerical issues. As a check, it is recommended that the user verifies that `A*u - b` is approximately zero, as this may be untrue even if `sol.retcode === ReturnCode.Success` due to numerical stability issues.

Positional Arguments

- pivot: Defaults to RowMaximum(), but can be NoPivot().
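A sketch of that recommended residual check:

```julia
using LinearSolve, LinearAlgebra

A = rand(100, 100) + 100I   # well-conditioned
b = rand(100)
sol = solve(LinearProblem(A, b), NormalCholeskyFactorization())

# Verify the residual explicitly; success codes alone are not a guarantee here.
@assert norm(A * sol.u - b) <= 1e-8 * norm(b)
```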
LinearSolve.NormalBunchKaufmanFactorization — Type

NormalBunchKaufmanFactorization(rook = false)

A fast factorization which uses a BunchKaufman factorization on A * A'. Can be much faster than LU factorization, but is not as numerically stable and thus should only be applied to well-conditioned matrices.

Positional Arguments

- rook: whether to perform rook pivoting. Defaults to false.
LinearSolve.jl
LinearSolve.jl contains some linear solvers built in for specialized cases.
LinearSolve.SimpleLUFactorization — Type

SimpleLUFactorization(pivot::Bool = true)

A simple LU-factorization implementation without BLAS. Fast for small matrices.

Positional Arguments

- pivot::Bool: whether to perform pivoting. Defaults to `true`.

LinearSolve.DiagonalFactorization — Type

DiagonalFactorization()

A special implementation only for solving `Diagonal` matrices fast.
LinearSolve.SimpleGMRES — Type

SimpleGMRES(; restart::Bool = true, blocksize::Int = 0, warm_start::Bool = false,
            memory::Int = 20)

A simple GMRES implementation for square non-Hermitian linear systems.

This implementation handles Block Diagonal Matrices with Uniformly Sized Square Blocks with specialized dispatches.

Arguments

- restart::Bool: If `true`, then the solver will restart after `memory` iterations.
- memory::Int = 20: The number of iterations before restarting. If restart is false, this value is used to allocate memory and later expanded if more memory is required.
- blocksize::Int = 0: If blocksize is `> 0`, the solver assumes that the matrix has a uniformly sized block diagonal structure with square blocks of size `blocksize`. Misusing this option will lead to incorrect results. If this is set `≤ 0` and during runtime we get a Block Diagonal Matrix, then we will check if the specialized dispatch can be used.

We can automatically detect if the matrix is a Block Diagonal Matrix with Uniformly Sized Square Blocks. If this is the case, then we can use a specialized dispatch. However, on most modern systems, performing a single matrix-vector multiplication is faster than performing multiple smaller matrix-vector multiplications (as in the case of a Block Diagonal Matrix). We recommend making the matrix dense (if size permits) and specifying the `blocksize` argument.
FastLapackInterface.jl
FastLapackInterface.jl is a package that allows for a lower-level interface to the LAPACK calls to allow for preallocating workspaces to decrease the overhead of the wrappers. LinearSolve.jl provides a wrapper to these routines in a way where an initialized solver has a non-allocating LU factorization. In theory, this post-initialized solve should always be faster than the Base.LinearAlgebra version. In practice, with the way we wrap the solvers, we do not see a performance benefit and in fact benchmarks tend to show this inhibits performance.
Using this solver requires adding the package FastLapackInterface.jl, i.e. `using FastLapackInterface`.

LinearSolve.FastLUFactorization — Type

FastLUFactorization()

The FastLapackInterface.jl version of the LU factorization. Notably, this version does not allow for choice of pivoting method.

LinearSolve.FastQRFactorization — Type

FastQRFactorization()

The FastLapackInterface.jl version of the QR factorization.
SuiteSparse.jl
LinearSolve.KLUFactorization — Type

KLUFactorization(; reuse_symbolic = true, check_pattern = true)

A fast sparse LU-factorization which specializes on sparsity patterns with “less structure”.

By default, the SparseArrays.jl methods are implemented for efficiency by caching the symbolic factorization. I.e., if `set_A` is used, it is expected that the new `A` has the same sparsity pattern as the previous `A`. If this algorithm is to be used in a context where that assumption does not hold, set `reuse_symbolic = false`.
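A sketch of reuse with the caching interface (assuming the `init`/`solve!` API with property updates; the second matrix keeps the first one's sparsity pattern):

```julia
using LinearSolve, SparseArrays, LinearAlgebra

A1 = sprand(100, 100, 0.05) + I
b = rand(100)
cache = init(LinearProblem(A1, b), KLUFactorization())
sol1 = solve!(cache)

A2 = copy(A1)
A2.nzval .*= 2.0   # same sparsity pattern, new values
cache.A = A2       # symbolic factorization can be reused
sol2 = solve!(cache)
```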
LinearSolve.UMFPACKFactorization — Type

UMFPACKFactorization(; reuse_symbolic = true, check_pattern = true)

A fast sparse multithreaded LU-factorization which specializes on sparsity patterns with “more structure”.

By default, the SparseArrays.jl methods are implemented for efficiency by caching the symbolic factorization. I.e., if `set_A` is used, it is expected that the new `A` has the same sparsity pattern as the previous `A`. If this algorithm is to be used in a context where that assumption does not hold, set `reuse_symbolic = false`.
Sparspak.jl
LinearSolve.SparspakFactorization — Type

SparspakFactorization(reuse_symbolic = true)
This is the translation of the well-known sparse matrix software Sparspak (Waterloo Sparse Matrix Package), solving large sparse systems of linear algebraic equations. Sparspak is composed of the subroutines from the book "Computer Solution of Large Sparse Positive Definite Systems" by Alan George and Joseph Liu. Originally written in Fortran 77, later rewritten in Fortran 90. Here is the software translated into Julia.
The Julia rewrite is released under the MIT license with an express permission from the authors of the Fortran package. The package uses multiple dispatch to route around standard BLAS routines in the case e.g. of arbitrary-precision floating point numbers or ForwardDiff.Dual. This e.g. allows for Automatic Differentiation (AD) of a sparse-matrix solve.
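A sketch with a generic number type that BLAS-based factorizations cannot handle (assuming your LinearSolve version has Sparspak available; add `using Sparspak` if it is behind a package extension):

```julia
using LinearSolve, SparseArrays

# BigFloat is an arbitrary-precision type outside the standard BLAS types.
A = sparse(BigFloat[4 1 0; 1 4 1; 0 1 4])
b = BigFloat[1, 2, 3]

sol = solve(LinearProblem(A, b), SparspakFactorization())
```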
Krylov.jl
LinearSolve.KrylovJL_CG — Function

KrylovJL_CG(args...; kwargs...)

A generic CG implementation for Hermitian and positive definite linear systems.

LinearSolve.KrylovJL_MINRES — Function

KrylovJL_MINRES(args...; kwargs...)

A generic MINRES implementation for Hermitian linear systems.

LinearSolve.KrylovJL_GMRES — Function

KrylovJL_GMRES(args...; gmres_restart = 0, window = 0, kwargs...)

A generic GMRES implementation for square non-Hermitian linear systems.

LinearSolve.KrylovJL_BICGSTAB — Function

KrylovJL_BICGSTAB(args...; kwargs...)

A generic BICGSTAB implementation for square non-Hermitian linear systems.

LinearSolve.KrylovJL_LSMR — Function

KrylovJL_LSMR(args...; kwargs...)

A generic LSMR implementation for least-squares problems.

LinearSolve.KrylovJL_CRAIGMR — Function

KrylovJL_CRAIGMR(args...; kwargs...)

A generic CRAIGMR implementation for least-norm problems.

LinearSolve.KrylovJL — Type

KrylovJL(args...; KrylovAlg = Krylov.gmres!,
         Pl = nothing, Pr = nothing,
         gmres_restart = 0, window = 0,
         kwargs...)

A generic wrapper over the Krylov.jl Krylov-subspace iterative solvers.
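A sketch of using the Krylov wrappers with a left preconditioner (preconditioners are passed through the common `Pl`/`Pr` solve keywords; the Jacobi preconditioner here is just an illustrative choice):

```julia
using LinearSolve, LinearAlgebra, SparseArrays

n = 1000
A = sprandn(n, n, 0.005) + 20I
b = randn(n)
prob = LinearProblem(A, b)

Pl = Diagonal(Vector(diag(A)))   # Jacobi (diagonal) left preconditioner
sol = solve(prob, KrylovJL_GMRES(); Pl = Pl, reltol = 1e-8)
```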
MKL.jl
LinearSolve.MKLLUFactorization — Type

MKLLUFactorization()

A wrapper over Intel's Math Kernel Library (MKL). Direct calls to MKL in a way that pre-allocates workspace to avoid allocations and does not require libblastrampoline.
AppleAccelerate.jl
Using this solver requires a Mac with Apple Accelerate. This should come standard in most "modern" Mac computers.
LinearSolve.AppleAccelerateLUFactorization — Type

AppleAccelerateLUFactorization()

A wrapper over Apple's Accelerate Library. Direct calls to Accelerate in a way that pre-allocates workspace to avoid allocations and does not require libblastrampoline.
Metal.jl
Using this solver requires adding the package Metal.jl, i.e. `using Metal`. This package is only compatible with Mac M-Series computers with a Metal-compatible GPU.

LinearSolve.MetalLUFactorization — Type

MetalLUFactorization()

A wrapper over Apple's Metal GPU library. Direct calls to Metal in a way that pre-allocates workspace to avoid allocations and automatically offloads to the GPU.
Pardiso.jl
LinearSolve.MKLPardisoFactorize — Function

MKLPardisoFactorize(; nprocs::Union{Int, Nothing} = nothing,
                    matrix_type = nothing,
                    cache_analysis = false,
                    iparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing,
                    dparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing)

A sparse factorization method using MKL Pardiso.

Keyword Arguments

Setting `cache_analysis = true` disables Pardiso's scaling and matching defaults and caches the result of the initial analysis phase for all further computations with this solver.

For the definition of the other keyword arguments, see the Pardiso.jl documentation. All values default to `nothing`, and the solver internally determines the values given the input types; these keyword arguments are only for overriding the default handling process. This should not be required by most users.
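A sketch of usage (assuming Pardiso.jl is installed and MKL Pardiso is usable on your system):

```julia
using LinearSolve, SparseArrays, LinearAlgebra
using Pardiso   # loads the Pardiso-based solvers

A = sprand(100, 100, 0.05) + I
b = rand(100)

sol = solve(LinearProblem(A, b), MKLPardisoFactorize())
```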
LinearSolve.MKLPardisoIterate — Function

MKLPardisoIterate(; nprocs::Union{Int, Nothing} = nothing,
                  matrix_type = nothing,
                  cache_analysis = false,
                  iparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing,
                  dparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing)

A mixed factorization+iterative method using MKL Pardiso.

Keyword Arguments

Setting `cache_analysis = true` disables Pardiso's scaling and matching defaults and caches the result of the initial analysis phase for all further computations with this solver.

For the definition of the other keyword arguments, see the Pardiso.jl documentation. All values default to `nothing`, and the solver internally determines the values given the input types; these keyword arguments are only for overriding the default handling process. This should not be required by most users.
LinearSolve.PardisoJL — Type

PardisoJL(; nprocs::Union{Int, Nothing} = nothing,
          solver_type = nothing,
          matrix_type = nothing,
          iparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing,
          dparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing,
          vendor::Union{Symbol, Nothing} = nothing)

A generic method using Pardiso. Specifying `solver_type` is required.

Keyword Arguments

The `vendor` keyword allows choosing between Panua Pardiso (formerly pardiso-project.org; `vendor = :Panua`) and MKL Pardiso (`vendor = :MKL`). If `vendor == nothing`, Panua Pardiso is preferred over MKL Pardiso.

For the definition of the other keyword arguments, see the Pardiso.jl documentation. All values default to `nothing`, and the solver internally determines the values given the input types; these keyword arguments are only for overriding the default handling process. This should not be required by most users.
CUDA.jl
Note that `CuArray`s are supported by `GenericFactorization` in the “normal” way. The following are non-standard GPU factorization routines.

LinearSolve.CudaOffloadFactorization — Type

CudaOffloadFactorization()

An offloading technique used to GPU-accelerate CPU-based computations. Requires a sufficiently large `A` to overcome the data transfer costs.
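A sketch of usage (assuming a CUDA-compatible NVIDIA GPU and CUDA.jl installed; Float32 is used since most consumer GPUs are far faster in single precision):

```julia
using LinearSolve
using CUDA   # loads CudaOffloadFactorization

A = rand(Float32, 2000, 2000)
b = rand(Float32, 2000)

sol = solve(LinearProblem(A, b), CudaOffloadFactorization())
```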
IterativeSolvers.jl
Using these solvers requires adding the package IterativeSolvers.jl, i.e. `using IterativeSolvers`.

LinearSolve.IterativeSolversJL_CG — Function

IterativeSolversJL_CG(args...; Pl = nothing, Pr = nothing, kwargs...)

A wrapper over the IterativeSolvers.jl CG.

LinearSolve.IterativeSolversJL_GMRES — Function

IterativeSolversJL_GMRES(args...; Pl = nothing, Pr = nothing, gmres_restart = 0, kwargs...)

A wrapper over the IterativeSolvers.jl GMRES.

LinearSolve.IterativeSolversJL_BICGSTAB — Function

IterativeSolversJL_BICGSTAB(args...; Pl = nothing, Pr = nothing, kwargs...)

A wrapper over the IterativeSolvers.jl BICGSTAB.

LinearSolve.IterativeSolversJL_MINRES — Function

IterativeSolversJL_MINRES(args...; Pl = nothing, Pr = nothing, kwargs...)

A wrapper over the IterativeSolvers.jl MINRES.

LinearSolve.IterativeSolversJL_IDRS — Function

IterativeSolversJL_IDRS(args...; Pl = nothing, kwargs...)

A wrapper over the IterativeSolvers.jl IDR(S).

LinearSolve.IterativeSolversJL — Type

IterativeSolversJL(args...;
                   generate_iterator = IterativeSolvers.gmres_iterable!,
                   Pl = nothing, Pr = nothing,
                   gmres_restart = 0, kwargs...)

A generic wrapper over the IterativeSolvers.jl solvers.
KrylovKit.jl
LinearSolve.KrylovKitJL_CG — Function

KrylovKitJL_CG(args...; Pl = nothing, Pr = nothing, kwargs...)

A generic CG implementation for Hermitian and positive definite linear systems.

LinearSolve.KrylovKitJL_GMRES — Function

KrylovKitJL_GMRES(args...; Pl = nothing, Pr = nothing, gmres_restart = 0, kwargs...)

A generic GMRES implementation.

LinearSolve.KrylovKitJL — Type

KrylovKitJL(args...; KrylovAlg = Krylov.gmres!, kwargs...)

A generic iterative solver implementation allowing the choice of KrylovKit.jl solvers.
HYPRE.jl
Using HYPRE solvers requires Julia version 1.9 or higher, and that the package HYPRE.jl is installed.
LinearSolve.HYPREAlgorithm — Type

HYPREAlgorithm(solver; Pl = nothing)

HYPRE.jl is an interface to hypre and provides iterative solvers and preconditioners for sparse linear systems. It is mainly developed for large multi-process distributed problems (using MPI), but can also be used for single-process problems with Julia's standard sparse matrices.

If you need more fine-grained control over the solver/preconditioner options, you can alternatively pass an already created solver to `HYPREAlgorithm` (and to the `Pl` keyword argument). See the HYPRE.jl docs for how to set up solvers with specific options.
Using HYPRE solvers requires Julia version 1.9 or higher, and that the package HYPRE.jl is installed.
Positional Arguments

The single positional argument `solver` has the following choices:

- HYPRE.BiCGSTAB
- HYPRE.BoomerAMG
- HYPRE.FlexGMRES
- HYPRE.GMRES
- HYPRE.Hybrid
- HYPRE.ILU
- HYPRE.ParaSails (as preconditioner only)
- HYPRE.PCG

Keyword Arguments

- Pl: A choice of left preconditioner.
Example

For example, to use `HYPRE.PCG` as the solver, with `HYPRE.BoomerAMG` as the preconditioner, the algorithm should be defined as follows:

```julia
A, b = setup_system(...)
prob = LinearProblem(A, b)
alg = HYPREAlgorithm(HYPRE.PCG)
prec = HYPRE.BoomerAMG
sol = solve(prob, alg; Pl = prec)
```