Linear System Solvers

LS.solve(prob::LS.LinearProblem, alg; kwargs...)

Solves for $u$ in the linear system $Au = b$ defined by prob using the algorithm alg. If no algorithm is given, a default algorithm will be chosen.
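
For example, a small dense system can be set up and solved with the default method as follows (a minimal sketch; using LinearSolve exports LinearProblem and solve):

using LinearSolve

A = rand(4, 4)
b = rand(4)
prob = LinearProblem(A, b)

# No algorithm given: the default polyalgorithm picks one
sol = solve(prob)
sol.u  # solution vector u with A * u ≈ b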

Dense Matrices

The default algorithm, nothing, is good for picking an algorithm that will work, but one may need to change it to obtain more performance or precision. If more precision is necessary, LS.QRFactorization() and LS.SVDFactorization() are the best choices, with SVD being the slowest but most precise.

For efficiency, RFLUFactorization is the fastest for dense LU-factorizations until around 150x150 matrices, though this can be dependent on the exact details of the hardware. After this point, MKLLUFactorization is usually faster on most hardware. Note that on Mac computers, AppleAccelerateLUFactorization is generally the fastest. OpenBLASLUFactorization provides direct OpenBLAS calls without going through libblastrampoline and can be faster than LUFactorization in some configurations. LUFactorization will use your base system BLAS, which can be fast or slow depending on the hardware configuration. SimpleLUFactorization will be fast only on very small matrices but can cut down on compile times.
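
For instance, switching between these dense factorizations is just a change of the algorithm argument (a sketch; RFLUFactorization assumes RecursiveFactorization.jl is loaded and MKLLUFactorization assumes MKL is available through MKL_jll):

using LinearSolve, RecursiveFactorization

A = rand(100, 100)
b = rand(100)
prob = LinearProblem(A, b)

sol = solve(prob, RFLUFactorization())    # typically fastest below ~150x150
sol = solve(prob, MKLLUFactorization())   # typically faster for larger dense matrices
sol = solve(prob, QRFactorization())      # higher precision than LU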

For very large dense factorizations, offloading to the GPU can be preferable. Metal.jl can be used to offload on Mac hardware, with a cutoff point of around 20,000 x 20,000 matrices beyond which it is faster (and it only supports Float32). CudaOffloadLUFactorization and CudaOffloadQRFactorization can be more efficient at a much smaller cutoff, possibly around 1,000 x 1,000 matrices, though this is highly dependent on the chosen GPU hardware. These algorithms require a CUDA-compatible NVIDIA GPU. CUDA offload supports Float64, but most consumer GPU hardware is much faster on Float32 (many are >32x faster for Float32 operations than for Float64 operations), so for most hardware this is only recommended for Float32 matrices. Choose CudaOffloadLUFactorization for better performance on well-conditioned problems, or CudaOffloadQRFactorization for better numerical stability on ill-conditioned problems.
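
As a sketch, offloading a large dense Float32 factorization to an NVIDIA GPU looks like the following (assumes a CUDA-compatible GPU and that CUDA.jl is loaded):

using LinearSolve, CUDA

A = rand(Float32, 4096, 4096)
b = rand(Float32, 4096)
prob = LinearProblem(A, b)

# LU offload: better performance on well-conditioned problems
sol = solve(prob, CudaOffloadLUFactorization())

# QR offload: better numerical stability on ill-conditioned problems
sol = solve(prob, CudaOffloadQRFactorization())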

Mixed Precision Methods

For large well-conditioned problems where memory bandwidth is the bottleneck, mixed precision methods can provide significant speedups (up to 2x) by performing the factorization in Float32 while maintaining Float64 interfaces. These methods are particularly effective for:

  • Large dense matrices (> 1000x1000)
  • Well-conditioned problems (condition number < 10^4)
  • Hardware with good Float32 performance

Available mixed precision solvers:

  • MKL32MixedLUFactorization - CPUs with MKL
  • AppleAccelerate32MixedLUFactorization - Apple CPUs with Accelerate
  • CUDAOffload32MixedLUFactorization - NVIDIA GPUs with CUDA
  • MetalOffload32MixedLUFactorization - Apple GPUs with Metal

These methods automatically handle the precision conversion, making them easy drop-in replacements when reduced precision is acceptable for the factorization step.
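
For example, on a CPU with MKL available, switching to mixed precision is a one-algorithm change (a sketch; pick the solver from the list above that matches your hardware):

using LinearSolve

A = rand(2000, 2000)  # assumed large and well-conditioned
b = rand(2000)
prob = LinearProblem(A, b)

# Factorization runs in Float32; inputs and outputs stay Float64
sol = solve(prob, MKL32MixedLUFactorization())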

Note

Performance details for dense LU-factorizations can be highly dependent on the hardware configuration. For details see this issue. If one is looking to best optimize their system, we suggest running the performance tuning benchmark.

Sparse Matrices

For sparse LU-factorizations, use KLUFactorization if there is less structure to the sparsity pattern and UMFPACKFactorization if there is more structure. Pardiso.jl's methods are also known to be very efficient sparse linear solvers.
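
For example (a sketch; using SparseArrays provides the sparse matrix type, and the identity shift is only there to keep the random matrix nonsingular):

using LinearSolve, LinearAlgebra, SparseArrays

A = sprand(1000, 1000, 0.01) + I
b = rand(1000)
prob = LinearProblem(A, b)

sol = solve(prob, KLUFactorization())      # suited to less structured sparsity patterns
sol = solve(prob, UMFPACKFactorization())  # suited to more structured sparsity patterns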

For GPU-accelerated sparse LU-factorizations, there are two high-performance options. When using CuSparseMatrixCSR arrays with CUDSS.jl loaded, LUFactorization() will automatically use NVIDIA's cuDSS library. Alternatively, CUSOLVERRFFactorization provides access to NVIDIA's cusolverRF library. Both offer significant performance improvements for sparse systems on CUDA-capable GPUs and are particularly effective for large sparse matrices that can benefit from GPU parallelization. cuDSS is primarily geared toward Float32, while CUSOLVERRFFactorization targets Float64.

While these sparse factorizations are based on implementations in other languages, and are therefore constrained to standard number types (Float64, Float32, and their complex counterparts), SparspakFactorization is able to handle general number types, e.g. those defined by ForwardDiff.jl, MultiFloats.jl, or IntervalArithmetics.jl.

As sparse matrices get larger, iterative solvers tend to become more efficient than factorization methods if only a lower solution accuracy (i.e. a looser tolerance) is required.

Krylov.jl generally outperforms IterativeSolvers.jl and KrylovKit.jl, and is compatible with CPUs and GPUs, and thus is the generally preferred choice for Krylov methods. The choice of Krylov method should be matched to the properties of the operator: for example, use KrylovJL_CG() if the operator is positive definite, but use KrylovJL_GMRES() if it has no special properties.
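
For example (a sketch; the diagonally dominant tridiagonal matrices below are used only as convenient, well-conditioned test operators):

using LinearSolve, SparseArrays

n = 1_000
b = rand(n)

# Symmetric positive definite operator: CG
Aspd = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(4.0, n), 1 => fill(-1.0, n - 1))
sol = solve(LinearProblem(Aspd, b), KrylovJL_CG())

# No special structure: GMRES
Agen = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(4.0, n), 1 => fill(-0.5, n - 1))
sol = solve(LinearProblem(Agen, b), KrylovJL_GMRES())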

Finally, a user can pass a custom function for handling the linear solve using LS.LinearSolveFunction() if existing solvers are not optimally suited for their application. The interface is detailed here.

Lazy SciMLOperators

If the linear operator is given as a lazy non-concrete operator, such as a FunctionOperator, then using a Krylov method is preferred in order to not concretize the matrix. Krylov.jl generally outperforms IterativeSolvers.jl and KrylovKit.jl, and is compatible with CPUs and GPUs, and thus is the generally preferred choice for Krylov methods. The choice of Krylov method should be matched to the properties of the operator: for example, use KrylovJL_CG() if the operator is positive definite, but use KrylovJL_GMRES() if it has no special properties.

Tip

If your materialized operator is a uniform block diagonal matrix, then you can use SimpleGMRES(; blocksize = <known block size>) to further improve performance. This often shows up in neural networks, where the Jacobian with respect to the inputs is (almost always) a uniform block diagonal matrix with block size equal to the input size divided by the batch size.
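
A sketch of the block-diagonal case (the 4x4 blocks below stand in for, e.g., per-sample Jacobian blocks; the diagonal shift just keeps each block well-conditioned):

using LinearSolve, LinearAlgebra

# A dense matrix that is uniform block diagonal with 4x4 blocks
blocks = [rand(4, 4) + 4I for _ in 1:8]
A = cat(blocks...; dims = (1, 2))  # 32x32 block diagonal
b = rand(32)
prob = LinearProblem(A, b)

# Tell SimpleGMRES the known block size
sol = solve(prob, SimpleGMRES(; blocksize = 4))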

Full List of Methods

Polyalgorithms

LinearSolve.DefaultLinearSolverType
DefaultLinearSolver(;safetyfallback=true)

The default linear solver. This is the algorithm chosen when solve(prob) is called. It's a polyalgorithm that detects the optimal method for a given A, b and hardware (Intel, AMD, GPU, etc.).

Keyword Arguments

  • safetyfallback: determines whether to fall back to a column-pivoted QR factorization when an LU factorization fails. This can be required if A is rank-deficient. Defaults to true.
source

RecursiveFactorization.jl

Note

Using this solver requires adding the package RecursiveFactorization.jl, i.e. using RecursiveFactorization

LinearSolve.RFLUFactorizationType
RFLUFactorization{P, T}(; pivot = Val(true), thread = Val(true))

A fast pure Julia LU-factorization implementation using RecursiveFactorization.jl. This is by far the fastest LU-factorization implementation, usually outperforming OpenBLAS and MKL for smaller matrices (<500x500), but currently optimized only for Base Array with Float32 or Float64. Additional optimization for complex matrices is in the works.

Type Parameters

  • P: Pivoting strategy as Val{Bool}. Val{true} enables partial pivoting for stability.
  • T: Threading strategy as Val{Bool}. Val{true} enables multi-threading for performance.

Constructor Arguments

  • pivot = Val(true): Enable partial pivoting. Set to Val{false} to disable for speed at the cost of numerical stability.
  • thread = Val(true): Enable multi-threading. Set to Val{false} for single-threaded execution.
  • throwerror = true: Whether to throw an error if RecursiveFactorization.jl is not loaded.

Performance Notes

  • Fastest for dense matrices with dimensions roughly < 500×500
  • Optimized specifically for Float32 and Float64 element types
  • Recursive blocking strategy provides excellent cache performance
  • Multi-threading can provide significant speedups on multi-core systems

Requirements

Using this solver requires that RecursiveFactorization.jl is loaded: using RecursiveFactorization

Example

using RecursiveFactorization
# Fast, stable (with pivoting)
alg1 = RFLUFactorization()
# Fastest (no pivoting), less stable
alg2 = RFLUFactorization(pivot=Val(false))  
source

Base.LinearAlgebra

These overloads tend to work for many array types, such as CuArrays for GPU-accelerated solving, using the overloads provided by the respective packages. Since this can be customized per package, the details below describe a subset of important array types (Matrix, SparseMatrixCSC, CuMatrix, etc.).

LinearSolve.LUFactorizationType

LUFactorization(pivot=LinearAlgebra.RowMaximum())

Julia's built-in lu. Equivalent to calling lu!(A).

  • On dense matrices, this uses the current BLAS implementation of the user's computer, which by default is OpenBLAS but will use MKL if the user does using MKL in their system.
  • On sparse matrices, this will use UMFPACK from SparseArrays. Note that this will not cache the symbolic factorization.
  • On CuMatrix, it will use a CUDA-accelerated LU from CuSolver.
  • On BandedMatrix and BlockBandedMatrix, it will use a banded LU.

Positional Arguments

  • pivot: The choice of pivoting. Defaults to LinearAlgebra.RowMaximum(). The other choice is LinearAlgebra.NoPivot().
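
For instance (a usage sketch on a dense system):

using LinearSolve, LinearAlgebra

A = rand(100, 100)
b = rand(100)
prob = LinearProblem(A, b)

sol = solve(prob, LUFactorization())                         # row-maximum pivoting (default)
sol = solve(prob, LUFactorization(LinearAlgebra.NoPivot()))  # no pivoting: faster, less stable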
source
LinearSolve.GenericLUFactorizationType

GenericLUFactorization(pivot=LinearAlgebra.RowMaximum())

Julia's built-in generic LU factorization. Equivalent to calling LinearAlgebra.generic_lufact!. Supports arbitrary number types but does not achieve as good scaling as BLAS-based LU implementations. Has low overhead and is good for small matrices.

Positional Arguments

  • pivot: The choice of pivoting. Defaults to LinearAlgebra.RowMaximum(). The other choice is LinearAlgebra.NoPivot().
source
LinearSolve.QRFactorizationType

QRFactorization(pivot=LinearAlgebra.NoPivot(), blocksize=16)

Julia's built-in qr. Equivalent to calling qr!(A).

  • On dense matrices, this uses the current BLAS implementation of the user's computer which by default is OpenBLAS but will use MKL if the user does using MKL in their system.
  • On sparse matrices, this will use SPQR from SparseArrays
  • On CuMatrix, it will use a CUDA-accelerated QR from CuSolver.
  • On BandedMatrix and BlockBandedMatrix, it will use a banded QR.
source
LinearSolve.SVDFactorizationType

SVDFactorization(full=false, alg=LinearAlgebra.DivideAndConquer())

Julia's built-in svd. Equivalent to calling svd!(A).

  • On dense matrices, this uses the current BLAS implementation of the user's computer which by default is OpenBLAS but will use MKL if the user does using MKL in their system.
source
LinearSolve.CholeskyFactorizationType

CholeskyFactorization(; pivot = nothing, tol = 0.0, shift = 0.0, perm = nothing)

Julia's built-in cholesky. Equivalent to calling cholesky!(A).

Keyword Arguments

  • pivot: defaults to NoPivot, can also be RowMaximum.
  • tol: the tol argument in CHOLMOD. Only used for sparse matrices.
  • shift: the shift argument in CHOLMOD. Only used for sparse matrices.
  • perm: the perm argument in CHOLMOD. Only used for sparse matrices.
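
For example, on a dense symmetric positive definite system (a sketch; the Symmetric wrapper marks the structure explicitly, and the diagonal shift guarantees positive definiteness):

using LinearSolve, LinearAlgebra

X = rand(100, 100)
A = Symmetric(X * X' + 100I)  # symmetric positive definite
b = rand(100)
prob = LinearProblem(A, b)

sol = solve(prob, CholeskyFactorization())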
source
LinearSolve.BunchKaufmanFactorizationType

BunchKaufmanFactorization(; rook = false)

Julia's built-in bunchkaufman. Equivalent to calling bunchkaufman(A). Only for Symmetric matrices.

Keyword Arguments

  • rook: whether to perform rook pivoting. Defaults to false.
source
LinearSolve.CHOLMODFactorizationType

CHOLMODFactorization(; shift = 0.0, perm = nothing)

A wrapper of CHOLMOD's polyalgorithm, mixing Cholesky factorization and ldlt. Tries cholesky for performance and retries ldlt if conditioning causes Cholesky to fail.

Only supports sparse matrices.

Keyword Arguments

  • shift: the shift argument in CHOLMOD.
  • perm: the perm argument in CHOLMOD
source
LinearSolve.NormalCholeskyFactorizationType

NormalCholeskyFactorization(pivot = RowMaximum())

A fast factorization which uses a Cholesky factorization on A * A'. Can be much faster than LU factorization, but is not as numerically stable and thus should only be applied to well-conditioned matrices.

Warning

NormalCholeskyFactorization should only be applied to well-conditioned matrices. As a method, it is not able to easily identify possible numerical issues. It is therefore recommended that the user check that A*u - b is approximately zero, as this may be untrue even if sol.retcode === ReturnCode.Success due to numerical stability issues.

Positional Arguments

  • pivot: Defaults to RowMaximum(), but can be NoPivot()
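
A sketch of the residual check recommended in the warning above (the diagonal shift keeps the random test matrix well-conditioned):

using LinearSolve, LinearAlgebra

A = rand(200, 200) + 200I  # assumed well-conditioned
b = rand(200)
prob = LinearProblem(A, b)

sol = solve(prob, NormalCholeskyFactorization())

# Success codes alone may hide numerical issues, so verify the residual
@assert norm(A * sol.u - b) / norm(b) < 1e-8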
source
LinearSolve.NormalBunchKaufmanFactorizationType

NormalBunchKaufmanFactorization(rook = false)

A fast factorization which uses a BunchKaufman factorization on A * A'. Can be much faster than LU factorization, but is not as numerically stable and thus should only be applied to well-conditioned matrices.

Positional Arguments

  • rook: whether to perform rook pivoting. Defaults to false.
source

LinearSolve.jl

LinearSolve.jl contains some linear solvers built in for specialized cases.


LinearSolve.SimpleGMRESType
SimpleGMRES(; restart::Bool = true, blocksize::Int = 0, warm_start::Bool = false,
    memory::Int = 20)

A simple GMRES implementation for square non-Hermitian linear systems.

This implementation handles Block Diagonal Matrices with Uniformly Sized Square Blocks with specialized dispatches.

Arguments

  • restart::Bool: If true, then the solver will restart after memory iterations.

  • memory::Int = 20: The number of iterations before restarting. If restart is false, this value is used to allocate memory and later expanded if more memory is required.

  • blocksize::Int = 0: If blocksize is > 0, the solver assumes that the matrix has a uniformly sized block diagonal structure with square blocks of size blocksize. Misusing this option will lead to incorrect results.

    • If this is set ≤ 0 and during runtime we get a Block Diagonal Matrix, then we will check if the specialized dispatch can be used.
Warning

Most users should be using the KrylovJL_GMRES solver instead of this implementation.

Tip

We can automatically detect if the matrix is a Block Diagonal Matrix with Uniformly Sized Square Blocks. If this is the case, then we can use a specialized dispatch. However, on most modern systems performing a single matrix-vector multiplication is faster than performing multiple smaller matrix-vector multiplications (as in the case of Block Diagonal Matrix). We recommend making the matrix dense (if size permits) and specifying the blocksize argument.

source

FastLapackInterface.jl

FastLapackInterface.jl is a package that allows for a lower-level interface to the LAPACK calls to allow for preallocating workspaces to decrease the overhead of the wrappers. LinearSolve.jl provides a wrapper to these routines in a way where an initialized solver has a non-allocating LU factorization. In theory, this post-initialized solve should always be faster than the Base.LinearAlgebra version. In practice, with the way we wrap the solvers, we do not see a performance benefit and in fact benchmarks tend to show this inhibits performance.

Note

Using this solver requires adding the package FastLapackInterface.jl, i.e. using FastLapackInterface

LinearSolve.FastLUFactorizationType
FastLUFactorization()

A high-performance LU factorization using the FastLapackInterface.jl package. This provides an optimized interface to LAPACK routines with reduced overhead compared to the standard LinearAlgebra LAPACK wrappers.

Features

  • Reduced function call overhead compared to standard LAPACK wrappers
  • Optimized for performance-critical applications
  • Uses partial pivoting (no choice of pivoting method available)
  • Suitable for dense matrices where maximum performance is required

Limitations

  • Does not allow customization of pivoting strategy (always uses partial pivoting)
  • Requires FastLapackInterface.jl to be loaded
  • Limited to dense matrix types supported by LAPACK

Requirements

Using this solver requires that FastLapackInterface.jl is loaded: using FastLapackInterface

Performance Notes

This factorization is optimized for cases where the overhead of standard LAPACK function calls becomes significant, typically for moderate-sized dense matrices or when performing many factorizations.

Example

using FastLapackInterface
alg = FastLUFactorization()
sol = solve(prob, alg)
source
LinearSolve.FastQRFactorizationType
FastQRFactorization{P}(; pivot = ColumnNorm(), blocksize = 36)

A high-performance QR factorization using the FastLapackInterface.jl package. This provides an optimized interface to LAPACK QR routines with reduced overhead compared to the standard LinearAlgebra LAPACK wrappers.

Type Parameters

  • P: The type of pivoting strategy used

Fields

  • pivot::P: Pivoting strategy (e.g., ColumnNorm() for column pivoting, nothing for no pivoting)
  • blocksize::Int: Block size for the blocked QR algorithm (default: 36)

Features

  • Reduced function call overhead compared to standard LAPACK wrappers
  • Supports various pivoting strategies for numerical stability
  • Configurable block size for optimal performance
  • Suitable for dense matrices, especially overdetermined systems

Performance Notes

The block size can be tuned for optimal performance depending on matrix size and architecture. The default value of 36 is generally good for most cases, but experimentation may be beneficial for specific applications.

Requirements

Using this solver requires that FastLapackInterface.jl is loaded: using FastLapackInterface

Example

using FastLapackInterface
# QR with column pivoting
alg1 = FastQRFactorization()  
# QR without pivoting for speed
alg2 = FastQRFactorization(pivot=nothing)
# Custom block size
alg3 = FastQRFactorization(blocksize=64)
source

SuiteSparse.jl

Note

Using this solver requires adding the package SparseArrays.jl, i.e. using SparseArrays

LinearSolve.KLUFactorizationType

KLUFactorization(;reuse_symbolic=true, check_pattern=true)

A fast sparse LU-factorization which specializes on sparsity patterns with “less structure”.

Note

By default, the SparseArrays.jl wrappers cache the symbolic factorization for efficiency. If the sparsity pattern of A may change between solves, set reuse_symbolic=false. If the pattern is assumed or known to be constant, set reuse_symbolic=true to avoid unnecessary recomputation. To further reduce computational overhead, you can disable pattern checks entirely by setting check_pattern = false. Note that this may error if the sparsity pattern does change unexpectedly.
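
For example, when the values of A change between solves but the sparsity pattern does not, the cached symbolic factorization is reused through the standard init/solve! caching interface (a sketch; the identity shift merely keeps the random matrix nonsingular):

using LinearSolve, LinearAlgebra, SparseArrays

A = sprand(100, 100, 0.05) + I
b = rand(100)
cache = init(LinearProblem(A, b), KLUFactorization(; reuse_symbolic = true))
sol1 = solve!(cache)

# Same sparsity pattern, new values: the symbolic analysis is reused
cache.A = 2A
sol2 = solve!(cache)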

source
LinearSolve.UMFPACKFactorizationType

UMFPACKFactorization(;reuse_symbolic=true, check_pattern=true)

A fast sparse multithreaded LU-factorization which specializes on sparsity patterns with “more structure”.

Note

By default, the SparseArrays.jl wrappers cache the symbolic factorization for efficiency. If the sparsity pattern of A may change between solves, set reuse_symbolic=false. If the pattern is assumed or known to be constant, set reuse_symbolic=true to avoid unnecessary recomputation. To further reduce computational overhead, you can disable pattern checks entirely by setting check_pattern = false. Note that this may error if the sparsity pattern does change unexpectedly.

source

Sparspak.jl

Note

Using this solver requires adding the package Sparspak.jl, i.e. using Sparspak

LinearSolve.SparspakFactorizationType

SparspakFactorization(reuse_symbolic = true)

This is a translation of the well-known sparse matrix software Sparspak (Waterloo Sparse Matrix Package) for solving large sparse systems of linear algebraic equations. Sparspak is composed of the subroutines from the book "Computer Solution of Large Sparse Positive Definite Systems" by Alan George and Joseph Liu. Originally written in Fortran 77 and later rewritten in Fortran 90, it has been translated into Julia here.

The Julia rewrite is released under the MIT license with express permission from the authors of the Fortran package. The package uses multiple dispatch to route around standard BLAS routines for number types such as arbitrary-precision floating point numbers or ForwardDiff.Dual, which, for example, allows for automatic differentiation (AD) of a sparse-matrix solve.

source

CliqueTrees.jl

Note

Using this solver requires adding the package CliqueTrees.jl, i.e. using CliqueTrees

LinearSolve.CliqueTreesFactorizationType
CliqueTreesFactorization(
    alg = nothing,
    snd = nothing,
    reuse_symbolic = true,
)

The sparse Cholesky factorization algorithm implemented in CliqueTrees.jl. The implementation is pure-Julia and accepts arbitrary numeric types. It is somewhat slower than CHOLMOD.

source

Krylov.jl

LinearSolve.KrylovJL_CGFunction
KrylovJL_CG(args...; kwargs...)

A generic CG implementation for Hermitian and positive definite linear systems.

source
LinearSolve.KrylovJL_GMRESFunction
KrylovJL_GMRES(args...; gmres_restart = 0, window = 0, kwargs...)

A generic GMRES implementation for square non-Hermitian linear systems.

source
LinearSolve.KrylovJLType
KrylovJL(args...; KrylovAlg = Krylov.gmres!,
    Pl = nothing, Pr = nothing,
    gmres_restart = 0, window = 0,
    kwargs...)

A generic wrapper over the Krylov.jl Krylov-subspace iterative solvers.

source

MKL.jl

LinearSolve.MKLLUFactorizationType
MKLLUFactorization()

A wrapper over Intel's Math Kernel Library (MKL). Direct calls to MKL in a way that pre-allocates workspace to avoid allocations and does not require libblastrampoline.

source
LinearSolve.MKL32MixedLUFactorizationType
MKL32MixedLUFactorization()

A mixed precision LU factorization using Intel MKL that performs factorization in Float32 precision while maintaining Float64 interface. This can provide significant speedups for large matrices when reduced precision is acceptable.

Performance Notes

  • Converts Float64 matrices to Float32 for factorization
  • Uses optimized MKL routines for the factorization
  • Can be 2x faster than full precision for memory-bandwidth limited problems
  • May have reduced accuracy compared to full Float64 precision

Requirements

This solver requires MKL to be available through MKL_jll.

Example

alg = MKL32MixedLUFactorization()
sol = solve(prob, alg)
source

OpenBLAS

LinearSolve.OpenBLASLUFactorizationType
OpenBLASLUFactorization()

A direct wrapper over OpenBLAS's LU factorization (getrf! and getrs!). This solver makes direct calls to OpenBLAS_jll without going through Julia's libblastrampoline, which can provide performance benefits in certain configurations.

Performance Characteristics

  • Pre-allocates workspace to avoid allocations during solving
  • Makes direct ccalls to OpenBLAS routines
  • Can be faster than LUFactorization when OpenBLAS is well-optimized for the hardware
  • Supports Float32, Float64, ComplexF32, and ComplexF64 element types

When to Use

  • When you want to ensure OpenBLAS is used regardless of the system BLAS configuration
  • When benchmarking shows better performance than LUFactorization on your specific hardware
  • When you need consistent behavior across different systems (always uses OpenBLAS)

Example

using LinearSolve, LinearAlgebra

A = rand(100, 100)
b = rand(100)
prob = LinearProblem(A, b)
sol = solve(prob, OpenBLASLUFactorization())
source

AppleAccelerate.jl

Note

Using this solver requires a Mac with Apple Accelerate, which comes standard on most modern Mac computers.

LinearSolve.AppleAccelerateLUFactorizationType
AppleAccelerateLUFactorization()

A wrapper over Apple's Accelerate Library. Direct calls to Accelerate in a way that pre-allocates workspace to avoid allocations and does not require libblastrampoline.

source
LinearSolve.AppleAccelerate32MixedLUFactorizationType
AppleAccelerate32MixedLUFactorization()

A mixed precision LU factorization using Apple's Accelerate framework that performs factorization in Float32 precision while maintaining Float64 interface. This can provide significant speedups on Apple hardware when reduced precision is acceptable.

Performance Notes

  • Converts Float64 matrices to Float32 for factorization
  • Uses optimized Accelerate routines for the factorization
  • Particularly effective on Apple Silicon with unified memory
  • May have reduced accuracy compared to full Float64 precision

Requirements

This solver is only available on Apple platforms and requires the Accelerate framework.

Example

alg = AppleAccelerate32MixedLUFactorization()
sol = solve(prob, alg)
source

Metal.jl

Note

Using this solver requires adding the package Metal.jl, i.e. using Metal. This package is only compatible with Mac M-Series computers with a Metal-compatible GPU.

LinearSolve.MetalLUFactorizationType
MetalLUFactorization()

A wrapper over Apple's Metal GPU library for LU factorization. Direct calls to Metal in a way that pre-allocates workspace to avoid allocations and automatically offloads to the GPU. This solver is optimized for Metal-capable Apple Silicon Macs.

Requirements

Using this solver requires that Metal.jl is loaded: using Metal

Performance Notes

  • Most efficient for large dense matrices where GPU acceleration benefits outweigh transfer costs
  • Automatically manages GPU memory and transfers
  • Particularly effective on Apple Silicon Macs with unified memory

Example

using Metal
alg = MetalLUFactorization()
sol = solve(prob, alg)
source
LinearSolve.MetalOffload32MixedLUFactorizationType
MetalOffload32MixedLUFactorization()

A mixed precision Metal GPU-accelerated LU factorization that converts matrices to Float32 before offloading to Metal GPU for factorization, then converts back for the solve. This can provide speedups on Apple Silicon when reduced precision is acceptable.

Performance Notes

  • Converts Float64 matrices to Float32 for GPU factorization
  • Can be significantly faster for large matrices where memory bandwidth is limiting
  • Particularly effective on Apple Silicon Macs with unified memory architecture
  • May have reduced accuracy compared to full precision methods

Requirements

Using this solver requires that Metal.jl is loaded: using Metal

Example

using Metal
alg = MetalOffload32MixedLUFactorization()
sol = solve(prob, alg)
source

Pardiso.jl

Note

Using this solver requires adding the package Pardiso.jl, i.e. using Pardiso

LinearSolve.MKLPardisoFactorizeFunction
MKLPardisoFactorize(; nprocs::Union{Int, Nothing} = nothing,
    matrix_type = nothing,
    cache_analysis = false,
    iparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing,
    dparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing)

A sparse factorization method using MKL Pardiso.

Note

Using this solver requires adding the package Pardiso.jl, i.e. using Pardiso

Keyword Arguments

Setting cache_analysis = true disables Pardiso's scaling and matching defaults and caches the result of the initial analysis phase for all further computations with this solver.

For the definition of the other keyword arguments, see the Pardiso.jl documentation. All values default to nothing and the solver internally determines the values given the input types, and these keyword arguments are only for overriding the default handling process. This should not be required by most users.

source
LinearSolve.MKLPardisoIterateFunction
MKLPardisoIterate(; nprocs::Union{Int, Nothing} = nothing,
    matrix_type = nothing,
    cache_analysis = false,
    iparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing,
    dparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing)

A mixed factorization+iterative method using MKL Pardiso.

Note

Using this solver requires adding the package Pardiso.jl, i.e. using Pardiso

Keyword Arguments

Setting cache_analysis = true disables Pardiso's scaling and matching defaults and caches the result of the initial analysis phase for all further computations with this solver.

For the definition of the other keyword arguments, see the Pardiso.jl documentation. All values default to nothing and the solver internally determines the values given the input types, and these keyword arguments are only for overriding the default handling process. This should not be required by most users.

source
LinearSolve.PardisoJLType
PardisoJL(; nprocs::Union{Int, Nothing} = nothing,
    solver_type = nothing,
    matrix_type = nothing,
    iparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing,
    dparm::Union{Vector{Tuple{Int, Int}}, Nothing} = nothing,
    vendor::Union{Symbol, Nothing} = nothing
)

A generic method using Pardiso. Specifying solver_type is required.

Note

Using this solver requires adding the package Pardiso.jl, i.e. using Pardiso

Keyword Arguments

The vendor keyword allows choosing between Panua Pardiso (formerly pardiso-project.org; vendor = :Panua) and MKL Pardiso (vendor = :MKL). If vendor == nothing, Panua Pardiso is preferred over MKL Pardiso.

For the definition of the other keyword arguments, see the Pardiso.jl documentation. All values default to nothing and the solver internally determines the values given the input types, and these keyword arguments are only for overriding the default handling process. This should not be required by most users.

source

CUDA.jl

Note that CuArrays are supported by GenericFactorization in the "normal" way. The following are non-standard GPU factorization routines.

Note

Using these solvers requires adding the package CUDA.jl, i.e. using CUDA

LinearSolve.CudaOffloadLUFactorizationType

CudaOffloadLUFactorization()

An offloading technique used to GPU-accelerate CPU-based computations using LU factorization. Requires a sufficiently large A to overcome the data transfer costs.

Note

Using this solver requires adding the package CUDA.jl, i.e. using CUDA

source
LinearSolve.CudaOffloadQRFactorizationType

CudaOffloadQRFactorization()

An offloading technique used to GPU-accelerate CPU-based computations using QR factorization. Requires a sufficiently large A to overcome the data transfer costs.

Note

Using this solver requires adding the package CUDA.jl, i.e. using CUDA

source
LinearSolve.CUDAOffload32MixedLUFactorizationType

CUDAOffload32MixedLUFactorization()

A mixed precision GPU-accelerated LU factorization that converts matrices to Float32 before offloading to CUDA GPU for factorization, then converts back for the solve. This can provide speedups when the reduced precision is acceptable and memory bandwidth is a bottleneck.

Performance Notes

  • Converts Float64 matrices to Float32 for GPU factorization
  • Can be significantly faster for large matrices where memory bandwidth is limiting
  • May have reduced accuracy compared to full precision methods
  • Most beneficial when the condition number of the matrix is moderate
Note

Using this solver requires adding the package CUDA.jl, i.e. using CUDA

source

AMDGPU.jl

The following are GPU factorization routines for AMD GPUs using the ROCm stack.

Note

Using these solvers requires adding the package AMDGPU.jl, i.e. using AMDGPU

LinearSolve.AMDGPUOffloadLUFactorizationType

AMDGPUOffloadLUFactorization()

An offloading technique using LU factorization to GPU-accelerate CPU-based computations on AMD GPUs. Requires a sufficiently large A to overcome the data transfer costs.

Note

Using this solver requires adding the package AMDGPU.jl, i.e. using AMDGPU

source
LinearSolve.AMDGPUOffloadQRFactorizationType

AMDGPUOffloadQRFactorization()

An offloading technique using QR factorization to GPU-accelerate CPU-based computations on AMD GPUs. Requires a sufficiently large A to overcome the data transfer costs.

Note

Using this solver requires adding the package AMDGPU.jl, i.e. using AMDGPU

source

CUSOLVERRF.jl

Note

Using this solver requires adding the package CUSOLVERRF.jl, i.e. using CUSOLVERRF

LinearSolve.CUSOLVERRFFactorizationType

CUSOLVERRFFactorization(; symbolic = :RF, reuse_symbolic = true)

A GPU-accelerated sparse LU factorization using NVIDIA's cusolverRF library. This solver is specifically designed for sparse matrices on CUDA GPUs and provides high-performance factorization and solve capabilities.

Keyword Arguments

  • symbolic: The symbolic factorization method to use. Options are:
    • :RF (default): Use cusolverRF's built-in symbolic analysis
    • :KLU: Use KLU for symbolic analysis
  • reuse_symbolic: Whether to reuse the symbolic factorization when the sparsity pattern doesn't change (default: true)
Note

This solver requires CUSOLVERRF.jl to be loaded and only supports Float64 element types with Int32 indices.

source

IterativeSolvers.jl

Note

Using these solvers requires adding the package IterativeSolvers.jl, i.e. using IterativeSolvers

LinearSolve.IterativeSolversJL_CGFunction
IterativeSolversJL_CG(args...; Pl = nothing, Pr = nothing, kwargs...)

A wrapper over the IterativeSolvers.jl CG.

Note

Using this solver requires adding the package IterativeSolvers.jl, i.e. using IterativeSolvers

source
LinearSolve.IterativeSolversJL_GMRESFunction
IterativeSolversJL_GMRES(args...; Pl = nothing, Pr = nothing, gmres_restart = 0, kwargs...)

A wrapper over the IterativeSolvers.jl GMRES.

Note

Using this solver requires adding the package IterativeSolvers.jl, i.e. using IterativeSolvers

source
LinearSolve.IterativeSolversJL_BICGSTABFunction
IterativeSolversJL_BICGSTAB(args...; Pl = nothing, Pr = nothing, kwargs...)

A wrapper over the IterativeSolvers.jl BICGSTAB.

Note

Using this solver requires adding the package IterativeSolvers.jl, i.e. using IterativeSolvers

source
LinearSolve.IterativeSolversJL_MINRESFunction
IterativeSolversJL_MINRES(args...; Pl = nothing, Pr = nothing, kwargs...)

A wrapper over the IterativeSolvers.jl MINRES.

Note

Using this solver requires adding the package IterativeSolvers.jl, i.e. using IterativeSolvers

source
LinearSolve.IterativeSolversJL_IDRSFunction
IterativeSolversJL_IDRS(args...; Pl = nothing, kwargs...)

A wrapper over the IterativeSolvers.jl IDR(S).

Note

Using this solver requires adding the package IterativeSolvers.jl, i.e. using IterativeSolvers

source
LinearSolve.IterativeSolversJLType
IterativeSolversJL(args...;
    generate_iterator = IterativeSolvers.gmres_iterable!,
    Pl = nothing, Pr = nothing,
    gmres_restart = 0, kwargs...)

A generic wrapper over the IterativeSolvers.jl solvers.

Note

Using this solver requires adding the package IterativeSolvers.jl, i.e. using IterativeSolvers

source

KrylovKit.jl

Note

Using these solvers requires adding the package KrylovKit.jl, i.e. using KrylovKit

LinearSolve.KrylovKitJL_CGFunction
KrylovKitJL_CG(args...; Pl = nothing, Pr = nothing, kwargs...)

A generic CG implementation for Hermitian and positive definite linear systems.

Note

Using this solver requires adding the package KrylovKit.jl, i.e. using KrylovKit

source
LinearSolve.KrylovKitJL_GMRESFunction
KrylovKitJL_GMRES(args...; Pl = nothing, Pr = nothing, gmres_restart = 0, kwargs...)

A generic GMRES implementation.

Note

Using this solver requires adding the package KrylovKit.jl, i.e. using KrylovKit

source
LinearSolve.KrylovKitJLType
KrylovKitJL(args...; KrylovAlg = Krylov.gmres!, kwargs...)

A generic iterative solver implementation allowing the choice of KrylovKit.jl solvers.

Note

Using this solver requires adding the package KrylovKit.jl, i.e. using KrylovKit

source

HYPRE.jl

Note

Using HYPRE solvers requires Julia version 1.9 or higher, and that the package HYPRE.jl is installed.

LinearSolve.HYPREAlgorithmType

HYPREAlgorithm(solver; Pl = nothing)

HYPRE.jl is an interface to hypre, providing iterative solvers and preconditioners for sparse linear systems. It is mainly developed for large multi-process distributed problems (using MPI), but can also be used for single-process problems with Julia's standard sparse matrices.

If you need more fine-grained control over the solver/preconditioner options you can alternatively pass an already created solver to HYPREAlgorithm (and to the Pl keyword argument). See HYPRE.jl docs for how to set up solvers with specific options.

Note

Using HYPRE solvers requires Julia version 1.9 or higher, and that the package HYPRE.jl is installed.

Positional Arguments

The single positional argument solver has the following choices:

  • HYPRE.BiCGSTAB
  • HYPRE.BoomerAMG
  • HYPRE.FlexGMRES
  • HYPRE.GMRES
  • HYPRE.Hybrid
  • HYPRE.ILU
  • HYPRE.ParaSails (as preconditioner only)
  • HYPRE.PCG

Keyword Arguments

  • Pl: A choice of left preconditioner.

Example

For example, to use HYPRE.PCG as the solver, with HYPRE.BoomerAMG as the preconditioner, the algorithm should be defined as follows:

A, b = setup_system(...)
prob = LinearProblem(A, b)
alg = HYPREAlgorithm(HYPRE.PCG)
prec = HYPRE.BoomerAMG
sol = solve(prob, alg; Pl = prec)
source