Optim.jl
Optim.jl is a Julia package implementing various algorithms to perform univariate and multivariate optimization.
Installation: OptimizationOptimJL.jl
To use this package, install the OptimizationOptimJL package:
import Pkg
Pkg.add("OptimizationOptimJL")
Methods
Optim.jl algorithms can be one of the following:
- Optim.NelderMead()
- Optim.SimulatedAnnealing()
- Optim.ParticleSwarm()
- Optim.ConjugateGradient()
- Optim.GradientDescent()
- Optim.BFGS()
- Optim.LBFGS()
- Optim.NGMRES()
- Optim.OACCEL()
- Optim.NewtonTrustRegion()
- Optim.Newton()
- Optim.KrylovTrustRegion()
- Optim.SAMIN()
Each optimizer also takes special arguments which are outlined in the sections below.
The following special keyword arguments which are not covered by the common solve arguments can be used with Optim.jl optimizers:
- x_tol: Absolute tolerance in changes of the input vector x, in infinity norm. Defaults to 0.0.
- g_tol: Absolute tolerance in the gradient, in infinity norm. Defaults to 1e-8. For gradient-free methods, this will control the main convergence tolerance, which is solver-specific.
- f_calls_limit: A soft upper limit on the number of objective calls. Defaults to 0 (unlimited).
- g_calls_limit: A soft upper limit on the number of gradient calls. Defaults to 0 (unlimited).
- h_calls_limit: A soft upper limit on the number of Hessian calls. Defaults to 0 (unlimited).
- allow_f_increases: Allow steps that increase the objective value. Defaults to false. Note that, when setting this to true, the last iterate will be returned as the minimizer even if the objective increased.
- store_trace: Should a trace of the optimization algorithm's state be stored? Defaults to false.
- show_trace: Should a trace of the optimization algorithm's state be shown on stdout? Defaults to false.
- extended_trace: Save additional information. Solver dependent. Defaults to false.
- trace_simplex: Include the full simplex in the trace for NelderMead. Defaults to false.
- show_every: Trace output is printed every show_every-th iteration.
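These keyword arguments are passed to solve together with the chosen optimizer. A minimal, hedged sketch (the tolerance and trace settings below are illustrative, not recommended values):

using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
optf = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optf, zeros(2), [1.0, 100.0])
# Tighten the input tolerance and print the solver state every 10 iterations
sol = solve(prob, Optim.BFGS(); x_tol = 1e-10, show_trace = true, show_every = 10)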
For more extensive documentation of all the algorithms and options, please consult the Optim.jl documentation.
Local Optimizer
Local Constraint
Optim.jl implements the following local constraint algorithms:
- Optim.IPNewton()
  - μ0 specifies the initial barrier penalty coefficient as either a number or :auto
  - show_linesearch is an option to turn on linesearch verbosity
  - Defaults:
    - linesearch::Function = Optim.backtrack_constrained_grad
    - μ0::Union{Symbol,Number} = :auto
    - show_linesearch::Bool = false
The Rosenbrock function with constraints can be optimized using Optim.IPNewton() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
cons = (res, x, p) -> res .= [x[1]^2 + x[2]^2]
x0 = zeros(2)
p = [1.0, 100.0]
optf = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff(); cons = cons)
prob = Optimization.OptimizationProblem(optf, x0, p, lcons = [-5.0], ucons = [10.0])
sol = solve(prob, IPNewton())
retcode: Success
u: 2-element Vector{Float64}:
0.9999999992669327
0.9999999985109471
See also the Nonlinear constrained optimization example using IPNewton in the Optim.jl documentation.
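The IPNewton options above are set through its constructor. A hedged sketch, reusing prob from the example above (the μ0 value is illustrative):

# Start with an explicit barrier penalty coefficient instead of :auto and
# print line search information
sol = solve(prob, IPNewton(μ0 = 1e-4, show_linesearch = true))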
Derivative-Free
Derivative-free optimizers can be used even when no derivatives or automatic differentiation backend are specified. While they tend to be less efficient than derivative-based optimizers, they are easy to apply to problems where defining derivatives is difficult. Note that while these methods do not support general constraints, they all support bounds constraints via lb and ub in the Optimization.OptimizationProblem.
Optim.jl implements the following derivative-free algorithms:
- Optim.NelderMead(): Nelder-Mead optimizer
  - solve(problem, NelderMead(parameters, initial_simplex))
  - parameters = AdaptiveParameters() or parameters = FixedParameters()
  - initial_simplex = AffineSimplexer()
  - Defaults:
    - parameters = AdaptiveParameters()
    - initial_simplex = AffineSimplexer()
- Optim.SimulatedAnnealing(): Simulated Annealing
  - solve(problem, SimulatedAnnealing(neighbor, T, p))
  - neighbor is a mutating function of the current and proposed x
  - T is a function of the current iteration that returns a temperature
  - p is a function of the current temperature
  - Defaults:
    - neighbor = default_neighbor!
    - T = default_temperature
    - p = kirkpatrick
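SimulatedAnnealing can likewise be applied directly to an OptimizationProblem without a differentiation backend. A minimal, hedged sketch using the default neighbor, temperature, and acceptance functions (the maxiters value is arbitrary, and results vary between runs since the method is stochastic):

using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
prob = Optimization.OptimizationProblem(rosenbrock, zeros(2), [1.0, 100.0])
# Stochastic search; increase maxiters for a more accurate result
sol = solve(prob, Optim.SimulatedAnnealing(), maxiters = 100_000)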
The Rosenbrock function can be optimized using Optim.NelderMead() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
prob = Optimization.OptimizationProblem(rosenbrock, x0, p)
sol = solve(prob, Optim.NelderMead())
retcode: Success
u: 2-element Vector{Float64}:
0.9999634355313174
0.9999315506115275
Gradient-Based
Gradient-based optimizers utilize gradient information, obtained either from user-defined derivatives or from automatic differentiation.
Optim.jl implements the following gradient-based algorithms:
- Optim.ConjugateGradient(): Conjugate Gradient Descent
  - solve(problem, ConjugateGradient(alphaguess, linesearch, eta, P, precondprep))
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - eta determines the next step direction
  - P is an optional preconditioner (for more information, see this source)
  - precondprep is used to update P as the state variable x changes
  - Defaults:
    - alphaguess = LineSearches.InitialHagerZhang()
    - linesearch = LineSearches.HagerZhang()
    - eta = 0.4
    - P = nothing
    - precondprep = (P, x) -> nothing
- Optim.GradientDescent(): Gradient Descent
  - solve(problem, GradientDescent(alphaguess, linesearch, P, precondprep))
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - P is an optional preconditioner (for more information, see this source)
  - precondprep is used to update P as the state variable x changes
  - Defaults:
    - alphaguess = LineSearches.InitialPrevious()
    - linesearch = LineSearches.HagerZhang()
    - P = nothing
    - precondprep = (P, x) -> nothing
- Optim.BFGS(): Broyden-Fletcher-Goldfarb-Shanno algorithm
  - solve(problem, BFGS(alphaguess, linesearch, initial_invH, initial_stepnorm, manifold))
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example; a usage sketch follows this list)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - initial_invH specifies an optional initial matrix for the inverse Hessian approximation
  - initial_stepnorm: if set, initial_invH is an identity matrix scaled by the value of initial_stepnorm multiplied by the sup-norm of the gradient at the initial point
  - manifold specifies a (Riemannian) manifold on which the function is to be minimized (for more information, consult this source)
    - available manifolds:
      - Flat
      - Sphere
      - Stiefel
    - meta-manifolds:
      - PowerManifold
      - ProductManifold
    - custom manifolds
  - Defaults:
    - alphaguess = LineSearches.InitialStatic()
    - linesearch = LineSearches.HagerZhang()
    - initial_invH = nothing
    - initial_stepnorm = nothing
    - manifold = Flat()
- Optim.LBFGS(): Limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm
  - m is the number of history points
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - P is an optional preconditioner (for more information, see this source)
  - precondprep is used to update P as the state variable x changes
  - manifold specifies a (Riemannian) manifold on which the function is to be minimized (for more information, consult this source)
    - available manifolds:
      - Flat
      - Sphere
      - Stiefel
    - meta-manifolds:
      - PowerManifold
      - ProductManifold
    - custom manifolds
  - scaleinvH0: whether to scale the initial Hessian approximation
  - Defaults:
    - m = 10
    - alphaguess = LineSearches.InitialStatic()
    - linesearch = LineSearches.HagerZhang()
    - P = nothing
    - precondprep = (P, x) -> nothing
    - manifold = Flat()
    - scaleinvH0::Bool = true && (P isa Nothing)
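The constructor keywords shared by these gradient-based methods (alphaguess, linesearch, and so on) can be set explicitly. A minimal, hedged sketch with BFGS, assuming LineSearches.jl is installed alongside Optim.jl (the option values are illustrative):

using Optimization, OptimizationOptimJL, LineSearches
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
optf = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optf, zeros(2), [1.0, 100.0])
# Backtracking line search instead of the default HagerZhang, and an initial
# inverse Hessian approximation scaled relative to the initial gradient
sol = solve(prob, Optim.BFGS(linesearch = LineSearches.BackTracking(),
                             initial_stepnorm = 0.01))

The same alphaguess and linesearch keywords apply to ConjugateGradient, GradientDescent, LBFGS, and Newton.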
The Rosenbrock function can be optimized using Optim.LBFGS() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
optprob = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optprob, x0, p, lb = [-1.0, -1.0], ub = [0.8, 0.8])
sol = solve(prob, Optim.LBFGS())
retcode: Success
u: 2-element Vector{Float64}:
0.799999998888889
0.6399999982096882
Hessian-Based Second Order
Hessian-based optimization methods are second order optimization methods which use the direct computation of the Hessian. These can converge faster, but require fast and accurate methods for calculating the Hessian in order to be appropriate.
Optim.jl implements the following Hessian-based algorithms:
- Optim.NewtonTrustRegion(): Newton Trust Region method (a usage sketch follows this list)
  - initial_delta: The starting trust region radius
  - delta_hat: The largest allowable trust region radius
  - eta: When rho is at least eta, accept the step
  - rho_lower: When rho is less than rho_lower, shrink the trust region
  - rho_upper: When rho is greater than rho_upper, grow the trust region (though no greater than delta_hat)
  - Defaults:
    - initial_delta = 1.0
    - delta_hat = 100.0
    - eta = 0.1
    - rho_lower = 0.25
    - rho_upper = 0.75
- Optim.Newton(): Newton's method with line search
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - Defaults:
    - alphaguess = LineSearches.InitialStatic()
    - linesearch = LineSearches.HagerZhang()
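A hedged sketch of NewtonTrustRegion through the same interface; like Newton below, it requires a Hessian, which Optimization.AutoForwardDiff() can provide (the initial_delta value is illustrative):

using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
optf = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optf, zeros(2), [1.0, 100.0])
# Start from a smaller trust region than the default initial_delta = 1.0
sol = solve(prob, Optim.NewtonTrustRegion(initial_delta = 0.1))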
The Rosenbrock function can be optimized using Optim.Newton() as follows:
using Optimization, OptimizationOptimJL, ModelingToolkit
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
f = OptimizationFunction(rosenbrock, Optimization.AutoModelingToolkit())
prob = Optimization.OptimizationProblem(f, x0, p)
sol = solve(prob, Optim.Newton())
retcode: Success
u: 2-element Vector{Float64}:
0.9999999999999994
0.9999999999999989
Hessian-Free Second Order
Hessian-free methods perform second order optimization by direct computation of Hessian-vector products (Hv) without requiring construction of the full Hessian. As such, these methods can perform well for large second order optimization problems, but can require special care when considering the conditioning of the Hessian.
Optim.jl implements the following Hessian-free algorithms:
- Optim.KrylovTrustRegion(): A Newton-Krylov method with trust regions
  - initial_delta: The starting trust region radius
  - delta_hat: The largest allowable trust region radius
  - eta: When rho is at least eta, accept the step
  - rho_lower: When rho is less than rho_lower, shrink the trust region
  - rho_upper: When rho is greater than rho_upper, grow the trust region (though no greater than delta_hat)
  - Defaults:
    - initial_delta = 1.0
    - delta_hat = 100.0
    - eta = 0.1
    - rho_lower = 0.25
    - rho_upper = 0.75
The Rosenbrock function can be optimized using Optim.KrylovTrustRegion() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
optprob = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optprob, x0, p)
sol = solve(prob, Optim.KrylovTrustRegion())
retcode: Success
u: 2-element Vector{Float64}:
0.999999999999108
0.9999999999981819
Global Optimizer
Without Constraint Equations
The following method in Optim performs global optimization on problems with or without box constraints. It works both with and without lower and upper bounds set by lb and ub in the Optimization.OptimizationProblem.
- Optim.ParticleSwarm(): Particle Swarm Optimization
  - solve(problem, ParticleSwarm(lower, upper, n_particles))
  - lower/upper are vectors of lower/upper bounds respectively
  - n_particles is the number of particles in the swarm
  - Defaults:
    - lower = []
    - upper = []
    - n_particles = 0
The Rosenbrock function can be optimized using Optim.ParticleSwarm() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
f = OptimizationFunction(rosenbrock)
prob = Optimization.OptimizationProblem(f, x0, p, lb = [-1.0, -1.0], ub = [1.0, 1.0])
sol = solve(prob, Optim.ParticleSwarm(lower = prob.lb, upper = prob.ub, n_particles = 100))
retcode: Failure
u: 2-element Vector{Float64}:
1.0
1.0
With Constraint Equations
The following method in Optim performs global optimization on problems with box constraints.
- Optim.SAMIN(): Simulated Annealing with bounds
  - solve(problem, SAMIN(nt, ns, rt, neps, f_tol, x_tol, coverage_ok, verbosity))
  - Defaults:
    - nt = 5
    - ns = 5
    - rt = 0.9
    - neps = 5
    - f_tol = 1e-12
    - x_tol = 1e-6
    - coverage_ok = false
    - verbosity = 0
The Rosenbrock function can be optimized using Optim.SAMIN() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
f = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(f, x0, p, lb = [-1.0, -1.0], ub = [1.0, 1.0])
sol = solve(prob, Optim.SAMIN())
retcode: Failure
u: 2-element Vector{Float64}:
0.9543334041440239
0.9051990486563493
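The Failure return code above likely reflects the default iteration budget running out before the tolerances were met. A hedged sketch: raising the common maxiters argument (and optionally slowing the cooling rate rt) typically gets SAMIN much closer to the optimum; the values below are illustrative:

# Reusing `prob` from the example above
sol = solve(prob, Optim.SAMIN(rt = 0.95), maxiters = 100_000)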