Solve your first optimization problem
Numerical optimization is the process of finding the numerical values of some variables that minimize a given objective function.
- How much fuel should you put in an airplane so that it carries the minimum weight needed to reach its destination?
- What parameters should I choose for my simulation so that the difference between its predictions and my experimental data is minimized?
All of these are examples of problems solved by numerical optimization. Let's solve our first optimization problem!
Required Dependencies
The following parts of the SciML Ecosystem will be used in this tutorial:
| Module | Description |
| --- | --- |
| Optimization.jl | The numerical optimization package |
| OptimizationNLopt.jl | The NLopt optimizers we will use |
| ForwardDiff.jl | The automatic differentiation library for gradients |
Problem Setup
First, what are we solving? Let's take a look at the Rosenbrock equation:
\[L(u, p) = (p_1 - u_1)^2 + p_2 (u_2 - u_1^2)^2\]
What we want to do is find the values of $u_1$ and $u_2$ such that $L$ achieves its minimum possible value. We will do this within a bounded domain, i.e. $u_i \in [-1, 1]$, and with the parameter values $p_1 = 1.0$ and $p_2 = 100.0$. What should $u = [u_1, u_2]$ be to achieve this goal? Let's dive in!
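Before we do, it is worth noting why a clear answer exists: both terms of $L$ are squares, so $L(u, p) \ge 0$ everywhere, and the loss can only reach zero when both squared terms vanish at the same time:

\[L(u, p) = 0 \iff u_1 = p_1 \ \text{and} \ u_2 = u_1^2\]

With $p_1 = 1.0$, that point is $u = (1, 1)$, which lies inside the bounded domain, so we should expect the optimizer to return values very close to $(1, 1)$ with a loss near zero.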
The lower and upper bounds are optional for the solver! If your problem does not need such bounds, just leave off the parts with `lb` and `ub`!
Copy-Pastable Code
```julia
# Import the packages
using Optimization, OptimizationNLopt, ForwardDiff

# Define the problem to optimize
L(u, p) = (p[1] - u[1])^2 + p[2] * (u[2] - u[1]^2)^2
u0 = zeros(2)
p = [1.0, 100.0]
optfun = OptimizationFunction(L, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optfun, u0, p, lb = [-1.0, -1.0], ub = [1.0, 1.0])

# Solve the optimization problem
sol = solve(prob, NLopt.LD_LBFGS())

# Analyze the solution
@show sol.u, L(sol.u, p)
```

```
([1.0, 1.0], 0.0)
```
Step-by-Step Solution
Step 1: Import the packages
To do this tutorial, we will need a few components:
- Optimization.jl, the optimization interface.
- OptimizationNLopt.jl, the optimizers we will use.
- ForwardDiff.jl, the automatic differentiation library for gradients.
Note that Optimization.jl is an interface for optimizers, and thus we always have to choose which optimizer we want to use. Here we choose to demonstrate `OptimizationNLopt` because of its efficiency and versatility, but there are many other possible choices. Check out the solver compatibility chart for a quick overview of what the various optimizer packages offer.
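Because the problem definition is independent of the solver, switching optimizers only means changing the algorithm passed to `solve`. As an illustration (a sketch only, using the `prob` object constructed in this tutorial), the derivative-free Nelder-Mead algorithm from NLopt could be swapped in without touching the problem definition:

```julia
# Illustrative sketch: hand the same problem to a different NLopt algorithm,
# here the gradient-free Nelder-Mead method.
sol_nm = solve(prob, NLopt.LN_NELDERMEAD())
```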
To start, let's add these packages as demonstrated in the installation tutorial:
```julia
using Pkg
Pkg.add(["Optimization", "OptimizationNLopt", "ForwardDiff"])
```
Now we're ready. Let's load in these packages:
```julia
using Optimization, OptimizationNLopt, ForwardDiff
```
Step 2: Define the Optimization Problem
Now let's define our problem to optimize. We start by defining our loss function. In Optimization.jl's `OptimizationProblem` interface, the states are given by an array `u`. Thus we can designate `u[1]` to be $u_1$ and `u[2]` to be $u_2$, and similarly for our parameters, and write out the loss function on a vector-defined state as follows:
```julia
# Define the problem to optimize
L(u, p) = (p[1] - u[1])^2 + p[2] * (u[2] - u[1]^2)^2
```

```
L (generic function with 1 method)
```
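A quick way to sanity-check the definition is to evaluate it at a couple of points by hand; the values below assume the parameters $p = [1.0, 100.0]$ used throughout this tutorial:

```julia
# Evaluate the loss at the initial guess and at the known optimum
L([0.0, 0.0], [1.0, 100.0])  # returns 1.0
L([1.0, 1.0], [1.0, 100.0])  # returns 0.0
```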
Next we need to create an `OptimizationFunction`, where we tell Optimization.jl to use the ForwardDiff.jl package for creating the gradient and other derivatives required by the optimizer.
```julia
# Create the OptimizationFunction
optfun = OptimizationFunction(L, Optimization.AutoForwardDiff())
```
(::SciMLBase.OptimizationFunction{true, ADTypes.AutoForwardDiff{nothing, Nothing}, typeof(Main.L), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED_NO_TIME), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}) (generic function with 1 method)
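If an analytical gradient is available, it could alternatively be supplied by hand instead of relying on automatic differentiation. The following is only a sketch: it assumes Optimization.jl's in-place gradient convention `grad(G, u, p)`, and `rosenbrock_grad!` is a name introduced here purely for illustration:

```julia
# Sketch: supply a hand-written gradient instead of AutoForwardDiff().
# Assumes the in-place gradient signature grad(G, u, p).
function rosenbrock_grad!(G, u, p)
    G[1] = -2 * (p[1] - u[1]) - 4 * p[2] * u[1] * (u[2] - u[1]^2)
    G[2] = 2 * p[2] * (u[2] - u[1]^2)
    return nothing
end

optfun_manual = OptimizationFunction(L; grad = rosenbrock_grad!)
```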
Now we need to define our `OptimizationProblem`. If you need help remembering how to define the `OptimizationProblem`, you can always refer to the Optimization.jl problem definition page. What we need to define is an initial condition `u0` and our parameter vector `p`. We will make our initial condition have both values as zero, which is done by the Julia shorthand `zeros(2)` that creates the vector `[0.0, 0.0]`. We manually define the parameter vector `p` to input our values. Then we set the lower and upper bounds for the optimization as follows:
```julia
u0 = zeros(2)
p = [1.0, 100.0]
prob = OptimizationProblem(optfun, u0, p, lb = [-1.0, -1.0], ub = [1.0, 1.0])
```

```
OptimizationProblem. In-place: true
u0: 2-element Vector{Float64}:
 0.0
 0.0
```
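As noted earlier, the bounds are optional. If the problem did not require them, a minimal sketch of the same setup without `lb` and `ub` would simply be:

```julia
# Sketch: the same problem without box constraints
prob_unbounded = OptimizationProblem(optfun, u0, p)
```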
Note about defining uniform bounds
Note that we can simplify the code a bit for the lower and upper bound definition by using the Julia Base command `ones`, which returns a vector where each value is a one. Thus, for example, `ones(2)` is equivalent to `[1.0, 1.0]`, and therefore `-1 * ones(2)` is equivalent to `[-1.0, -1.0]`, meaning we could have written our problem as follows:
```julia
prob = OptimizationProblem(optfun, u0, p, lb = -1 * ones(2), ub = ones(2))
```

```
OptimizationProblem. In-place: true
u0: 2-element Vector{Float64}:
 0.0
 0.0
```
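Another equivalent spelling from Julia Base is `fill`, which builds a vector by repeating a single value:

```julia
# fill(value, n) creates an n-element vector filled with `value`
prob = OptimizationProblem(optfun, u0, p, lb = fill(-1.0, 2), ub = fill(1.0, 2))
```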
Step 3: Solve the Optimization Problem
Now we solve the `OptimizationProblem` that we have defined. This is done by passing our `OptimizationProblem` along with a chosen solver to the `solve` command. At the beginning, we explained that we will use the `OptimizationNLopt` set of solvers, which are documented on the OptimizationNLopt page. From these, we choose `NLopt.LD_LBFGS()` for its mixture of robustness and performance. To perform this solve, we do the following:
```julia
# Solve the optimization problem
sol = solve(prob, NLopt.LD_LBFGS())
```

```
retcode: Success
u: 2-element Vector{Float64}:
 1.0
 1.0
```
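Common solver options such as iteration limits and tolerances can also be passed as keyword arguments to `solve`. Exactly which options are honored depends on the chosen optimizer, so the values below are only an illustrative sketch:

```julia
# Sketch: passing common solver options (support varies by optimizer)
sol = solve(prob, NLopt.LD_LBFGS(), maxiters = 10_000, abstol = 1e-8)
```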
Step 4: Analyze the Solution
Now let's check out the solution. First of all, what kind of object is `sol`? We can find out by asking for its type:
```julia
typeof(sol)
```
SciMLBase.OptimizationSolution{Float64, 1, Vector{Float64}, OptimizationBase.OptimizationCache{SciMLBase.OptimizationFunction{true, ADTypes.AutoForwardDiff{nothing, Nothing}, typeof(Main.L), OptimizationBase.var"#grad#16"{Vector{Float64}, SciMLBase.OptimizationFunction{true, ADTypes.AutoForwardDiff{nothing, Nothing}, typeof(Main.L), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED_NO_TIME), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, ADTypes.AutoForwardDiff{nothing, Nothing}}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED_NO_TIME), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, OptimizationBase.ReInitCache{Vector{Float64}, Vector{Float64}}, Vector{Float64}, Vector{Float64}, Nothing, Nothing, Nothing, NLopt.Algorithm, Bool, OptimizationNLopt.var"#2#4", Nothing}, NLopt.Algorithm, Float64, NLopt.Opt, SciMLBase.OptimizationStats}
From this, we can see that it is an `OptimizationSolution`. We can see the documentation for how to use the `OptimizationSolution` by checking the Optimization.jl solution type page. For example, the solution is stored as `.u`. What is the solution to our optimization, and what is the final loss value? We can check it as follows:
```julia
# Analyze the solution
@show sol.u, L(sol.u, p)
```

```
([1.0, 1.0], 0.0)
```
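Beyond `sol.u`, the solution object carries other useful fields, such as the return code shown earlier and the final objective value (field names here follow the Optimization.jl solution documentation; availability can vary by version):

```julia
sol.retcode    # did the optimizer report success?
sol.objective  # the final loss value, the same as L(sol.u, p) here
```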