Derivative free optimization
Revision as of 20:25, 10 December 2024
Author: Brian Holcomb (bh597), Alejandra C Ramirez (mac593), Brian Worts (blw87), Trenton Berardi (tnb38), Ifeloluwa Osilaja (io62) (ChemE 6800 Fall 2024)
Stewards: Nathan Preuss, Wei-Han Chen, Tianqi Xiao, Guoqing Hu
== 1. Introduction ==
Derivative-Free Optimization (DFO) is a collection of computational methods used to
find optimal values in situations where traditional optimization techniques that rely on
derivatives are impractical or impossible to apply. Instead of requiring derivatives of an
objective function, DFO methods iteratively search through a set of evaluated points to
approximate an optimal solution. This approach is especially useful for black-box models
where derivatives are unavailable, noisy, or computationally expensive to compute, such
as in simulations, machine learning applications, or physical system modeling.
The motivation behind DFO emerged from a need to solve complex optimization problems
in fields like engineering, finance, and logistics, where objective functions are often
opaque or inaccessible for differentiation. Since the mid-20th century, methods such as
pattern search and genetic algorithms have evolved, providing researchers with tools to
tackle optimization in high-dimensional, uncertain environments.
The primary goal of studying DFO is to enable efficient optimization when traditional
methods fall short, particularly for models that are too complex or costly for gradient-
based methods. DFO techniques have proven essential in areas like supply chain
planning, engineering design, and financial modeling, where optimization can significantly
impact operational efficiency and decision-making.
== 2. Algorithm Description ==
Multiple methods of Derivative-Free Optimization are available. While DFO algorithms do
not require calculating the derivative of the objective function, they narrow the examined
range of the objective function over multiple iterations by selecting points along it and
determining whether the local minimum lies between those points.
Two of these methods include Golden Section Search and Successive Parabolic
Interpolation. Both operate in a similar fashion by evaluating the relationship between
three points selected along the objective function and making decisions in order to
narrow the examination window along the objective function where the points are
selected for the next iteration. The main difference is how these three points are
selected and how they are evaluated.
=== Golden Section Search ===
Golden Section Search gets its name from the ratio used to place points along the
objective function, which corresponds closely to the Golden Ratio.
1. Identify a function, f(x), whose minimum you would like to find.
2. Select two points that bracket the function’s minimum: point a on the left side of
the minimum and point b on the right side. The section of the function f(x)
between points a and b forms the examination window.
3. Select two more points called 𝑥1 and 𝑥2 as per the equation below. These will be
spaced within the examination window from points a and b by the golden ratio as
defined via constant c.
a. 𝑥1 = 𝑐 ∗ 𝑎 + (1 ― 𝑐) ∗ 𝑏
b. 𝑥2 = (1 ― 𝑐) ∗ 𝑎 + 𝑐 ∗ 𝑏
c. 𝑐 = (―1 + √5)/2 ≈ 0.618
4. Evaluate and compare the objective function at 𝑥1 and 𝑥2 .
a. If 𝑓(𝑥1) <= 𝑓(𝑥2) Then
i. Restrict the range of 𝑥 ∈ [𝑎,𝑥2].
ii. 𝑏 = 𝑥2
iii. 𝑥2 = 𝑥1
iv. 𝑓2 = 𝑓1
v. 𝑥1 = 𝑐 ∗ 𝑎 + (1 ― 𝑐) ∗ 𝑏
vi. 𝑓1 = 𝑓(𝑥1)
b. If 𝑓(𝑥1) > 𝑓(𝑥2) Then
i. Restrict the range of 𝑥 ∈ [𝑥1,𝑏].
ii. 𝑎 = 𝑥1
iii. 𝑥1 = 𝑥2
iv. 𝑓1 = 𝑓2
v. 𝑥2 = (1 ― 𝑐) ∗ 𝑎 + 𝑐 ∗ 𝑏
vi. 𝑓2 = 𝑓(𝑥2)
5. After restricting the range as above, redefine the variables 𝑎, 𝑏, 𝑥1, 𝑥2 with the
same structure as before, but over the narrowed range.
6. Iterate for n loops or until the difference between variables a and b falls below a
small tolerance such as 1E-4.
Four images have been generated to describe the Golden Section Search Algorithm
through 3 iterations.
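The steps above can be sketched as a short Python routine (a minimal illustration; the function name, tolerance, and iteration cap are our choices, not part of the original description):

```python
import math

def golden_section_search(f, a, b, tol=1e-4, max_iter=100):
    """Narrow [a, b] around the minimum of a unimodal f without derivatives."""
    c = (math.sqrt(5) - 1) / 2          # golden ratio constant, ~0.618
    x1 = c * a + (1 - c) * b            # interior point nearer a
    x2 = (1 - c) * a + c * b            # interior point nearer b
    f1, f2 = f(x1), f(x2)
    for _ in range(max_iter):
        if b - a < tol:                 # stop once the window is small enough
            break
        if f1 <= f2:                    # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = c * a + (1 - c) * b
            f1 = f(x1)
        else:                           # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = (1 - c) * a + c * b
            f2 = f(x2)
    return (a + b) / 2
```

Note that each iteration reuses one of the two previous interior evaluations, so only one new function evaluation is needed per loop.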
=== Successive Parabolic Interpolation ===
This algorithm builds a parabola defined by three points selected on a function f(x).
The shape of the parabola is used to narrow the examined range of the function to
find where the function’s minimum is located. While the derivative of the objective
function is not used, the derivative of the fitted parabola is.
1. Identify a function, f(x), that you wish to examine to find a minimum value.
2. Identify three points on the function, 𝑥1, 𝑥2, 𝑥3 . Two of the points will act as the
outer bounds of the examination region and the third point will be in between
these two.
3. Form a parabola between the three points identified on function f(x). The
parabola can be defined using the Lagrange Interpolation Formula:
a. 𝑝(𝑥) = 𝑓(𝑥1)(𝑥 ― 𝑥2)(𝑥 ― 𝑥3) / [(𝑥1 ― 𝑥2)(𝑥1 ― 𝑥3)] + 𝑓(𝑥2)(𝑥 ― 𝑥1)(𝑥 ― 𝑥3) / [(𝑥2 ― 𝑥1)(𝑥2 ― 𝑥3)] + 𝑓(𝑥3)(𝑥 ― 𝑥1)(𝑥 ― 𝑥2) / [(𝑥3 ― 𝑥1)(𝑥3 ― 𝑥2)]
4. The minimum of the parabola occurs where the first derivative of the parabola
equals 0, and it will align towards where the minimum of the function, 𝑓(𝑥), is
located.
a. 𝑝′(𝑥0) = 0
b. 𝑥0 = (𝑥1 + 𝑥2)/2 ― [(𝑓(𝑥2) ― 𝑓(𝑥1))(𝑥3 ― 𝑥1)(𝑥3 ― 𝑥2)] / (2[(𝑥2 ― 𝑥1)(𝑓(𝑥3) ― 𝑓(𝑥2)) ― (𝑓(𝑥2) ― 𝑓(𝑥1))(𝑥3 ― 𝑥2)])
5. Compare the value of 𝑥0 to 𝑥2 and determine the next step for iteration.
a. If 𝑥0 <= 𝑥2 then shrink the examination range to be 𝑥 ∈ [𝑥1,𝑥2].
i. 𝑥1,new = 𝑥1,old
ii. 𝑥2,new = 𝑥0
iii. 𝑥3,new = 𝑥2,old
b. If 𝑥0 > 𝑥2 then shrink the examination range to be 𝑥 ∈ [𝑥2,𝑥3].
i. 𝑥1,new = 𝑥2,old
ii. 𝑥2,new = 𝑥0
iii. 𝑥3,new = 𝑥3,old
6. Iterate for n loops or until the difference between variables 𝑥1 and 𝑥3 falls below
a small tolerance such as 1E-4.
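A compact Python version of this procedure (a sketch; the guard against a zero denominator is our addition, needed when the three points become collinear or coincide):

```python
def successive_parabolic(f, x1, x2, x3, tol=1e-4, max_iter=100):
    """Narrow [x1, x3] around the minimum of f using parabola vertices."""
    for _ in range(max_iter):
        if x3 - x1 < tol:               # bracket is small enough
            break
        f1, f2, f3 = f(x1), f(x2), f(x3)
        denom = 2 * ((x2 - x1) * (f3 - f2) - (f2 - f1) * (x3 - x2))
        if denom == 0:                  # degenerate parabola; stop iterating
            break
        # Vertex of the parabola through (x1,f1), (x2,f2), (x3,f3)
        x0 = (x1 + x2) / 2 - (f2 - f1) * (x3 - x1) * (x3 - x2) / denom
        if x0 <= x2:                    # keep [x1, x2], with x0 inside
            x1, x2, x3 = x1, x0, x2
        else:                           # keep [x2, x3], with x0 inside
            x1, x2, x3 = x2, x0, x3
    return x2
```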
== 3. Numerical Examples ==
=== Example 1: Golden Section Search ===
Suppose we have a function 𝑓(𝑥) = (𝑥 ― 2)² + 1, and we aim to find its minimum within
the interval [𝑎,𝑏] = [0,5].
1. Initial Setup:
○ Set 𝑎 = 0 and 𝑏 = 5
○ Define the golden ratio constant 𝑐 ≈ 0.618
○ Calculate two internal points within [𝑎,𝑏]
■ 𝑥1 = 𝑐 ∗ 𝑎 + (1 ― 𝑐) ∗ 𝑏 = 0.618 ∗ 0 + (1 ― 0.618) ∗ 5 ≈ 1.91
■ 𝑥2 = (1 ― 𝑐) ∗ 𝑎 + 𝑐 ∗ 𝑏 = (1 ― 0.618) ∗ 0 + 0.618 ∗ 5 ≈ 3.09
2. Iterative Evaluation:
○ Evaluate 𝑓(𝑥1) = (1.91 ― 2)² + 1 ≈ 1.01 and 𝑓(𝑥2) = (3.09 ― 2)² + 1 ≈ 2.19
○ Since 𝑓(𝑥1) ≤ 𝑓(𝑥2), update the interval to [𝑎,𝑥2] = [0,3.09] and set 𝑏 = 𝑥2 = 3.09
○ Recalculate:
■ New 𝑥2 = 𝑥1 ≈ 1.91
■ New 𝑥1 = 𝑐 ∗ 𝑎 +(1 ― 𝑐) ∗ 𝑏 ≈ 1.18
3. Convergence:
○ Repeat this process, recalculating 𝑥1 and 𝑥2 and updating the interval until
the difference between 𝑎 and 𝑏 is less than 10⁻⁴.
○ After a few iterations, the interval narrows around 𝑥 = 2, the minimum of the
function, where 𝑓(𝑥) = 1.
Two images have been generated to describe this numerical example through first
iteration described above.
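The first iteration of this example can be reproduced in a few lines of Python (a check of the arithmetic above; the variable names mirror the text):

```python
c = (5 ** 0.5 - 1) / 2                  # golden ratio constant, ~0.618
f = lambda x: (x - 2) ** 2 + 1
a, b = 0, 5
x1 = c * a + (1 - c) * b                # ~1.91
x2 = (1 - c) * a + c * b                # ~3.09
# f(x1) ~ 1.01 <= f(x2) ~ 2.19, so the interval shrinks to [a, x2]
b, x2 = x2, x1                          # b = 3.09, new x2 = old x1
x1 = c * a + (1 - c) * b                # new x1 ~ 1.18
```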
=== Example 2: Successive Parabolic Interpolation ===
Consider the same function 𝑓(𝑥) = (𝑥 ― 2)² + 1 and aim to find its minimum within an
interval. Start with initial points 𝑥1 = 0, 𝑥2 = 2, and 𝑥3 = 4.
1. Initial Setup:
○ Select points: 𝑥1 = 0, 𝑥2 = 2, and 𝑥3 = 4
○ Evaluate 𝑓(𝑥1) = 5, 𝑓(𝑥2) = 1, and 𝑓(𝑥3) = 5
2. Parabolic Interpolation:
○ Construct a parabola 𝑝(𝑥) through points (𝑥1, 𝑓(𝑥1)), (𝑥2, 𝑓(𝑥2)), and (𝑥3,
𝑓(𝑥3)).
○ Using the formula for 𝑥0 where 𝑝′(𝑥0) = 0, calculate:
𝑥0 = (𝑥1 + 𝑥2)/2 ― [(𝑓(𝑥2) ― 𝑓(𝑥1))(𝑥3 ― 𝑥1)(𝑥3 ― 𝑥2)] / (2[(𝑥2 ― 𝑥1)(𝑓(𝑥3) ― 𝑓(𝑥2)) ― (𝑓(𝑥2) ― 𝑓(𝑥1))(𝑥3 ― 𝑥2)])
○ Substituting values gives 𝑥0 = 2, which coincides with the minimum, since a
parabola fitted through three points of a quadratic reproduces it exactly.
3. Updating the Interval:
○ Since 𝑥0 coincides with 𝑥2, the bracket can simply be tightened around it.
○ Set new points: 𝑥1 = 1, 𝑥2 = 𝑥0 = 2, 𝑥3 = 3.
4. Convergence:
○ Repeat the iteration, adjusting points and recalculating 𝑥0 until the
difference between 𝑥1 and 𝑥3 is less than 10⁻⁴.
○ The result converges around 𝑥 = 2, where the function minimum 𝑓(𝑥) = 1
is located.
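The interpolation step of this example can likewise be verified directly (our variable names; the commented intermediate values follow the substitution above):

```python
f = lambda x: (x - 2) ** 2 + 1
x1, x2, x3 = 0, 2, 4
f1, f2, f3 = f(x1), f(x2), f(x3)        # 5, 1, 5
numer = (f2 - f1) * (x3 - x1) * (x3 - x2)                    # (-4)(4)(2) = -32
denom = 2 * ((x2 - x1) * (f3 - f2) - (f2 - f1) * (x3 - x2))  # 2(8 + 8) = 32
x0 = (x1 + x2) / 2 - numer / denom      # 1 - (-1) = 2, the true minimum
```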
== 4. Application ==
Derivative Free Optimization has a number of real-world applications, including in the oil
industry, where cost functions and constraints often require fluid-mechanics simulations
from which gradient information cannot be obtained efficiently. This is an important
problem to solve: oil and natural gas supply the majority of the world’s current energy,
so maximizing efficiency has implications for the global economy and the environment.
Specifically, Derivative Free Optimization can be used, among other techniques, to
optimize the production of oil. This is done by creating discretized simulations of very
complex oil-water systems, using as many known variables as possible, such as fluid
mechanics and seismic effects (Ciaurri, 2011)4. The minimization problem and its
constraints thus become known, and a systematic derivative-free search becomes
possible. Two such derivative-free optimization methods are the generalized pattern
search and the Hooke-Jeeves Direct Search (Hooke, 1961)5. The generalized pattern
search is a search algorithm for “derivative-free unconstrained optimization” divided
into “search” and “poll” phases (Audet, 2000)1.
In the search phase, a finite number of points are evaluated to find a minimum on a
mesh. The poll phase is then used to poll the neighboring points on the mesh to find a
better minimum before the process is repeated using a more refined mesh or an iterated
search. The Hooke-Jeeves Direct Search method is a “compass-based pattern search
method” with an exploratory mode and a pattern mode, similar to the generalized
pattern search (Hooke, 1961)5. In the exploratory mode, the local area is searched to
refine the estimate of the optimum, while the pattern mode identifies patterns in the
exploratory moves, yielding a better guess at the direction of the optimum.
Using these search methods, Ciaurri et al. determined the optimal control scheme for
oil production. This was done by simulating with constraints on the water quantity,
water-to-oil ratio, and resulting oil output while searching for the optimal water pressure
that minimizes cost (Ciaurri, 2011)4. They varied the configurations and utilized
variations of the generalized pattern search and the Hooke-Jeeves Direct Search to
explore the design space and find the optimum. This work produced a feasible and
optimal configuration scheme, expected profits, an increase in oil production, and a
decrease in water consumption, showing how derivative-free computational
optimization techniques can be applied to improve industries.
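The exploratory/pattern structure described above can be illustrated with a minimal Hooke-Jeeves sketch (a simplified, unconstrained toy version of the idea, not the production oil-field code from the cited work):

```python
def hooke_jeeves(f, x, step=0.5, tol=1e-4, shrink=0.5):
    """Minimize f: R^n -> R with exploratory and pattern moves."""
    def explore(base, s):
        # Exploratory phase: probe +/- s along each coordinate axis,
        # keeping any probe that improves the objective.
        pt = list(base)
        for i in range(len(pt)):
            for delta in (s, -s):
                trial = pt[:i] + [pt[i] + delta] + pt[i + 1:]
                if f(trial) < f(pt):
                    pt = trial
                    break
        return pt

    x = list(x)
    while step > tol:
        new = explore(x, step)
        while f(new) < f(x):
            # Pattern phase: jump again along the improving direction,
            # then explore around the extrapolated point.
            pattern = [2 * n - o for n, o in zip(new, x)]
            x, new = new, explore(pattern, step)
        step *= shrink                  # no improvement: refine the mesh
    return x
```

Shrinking `step` when no probe improves mirrors the mesh-refinement step of generalized pattern search.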
Another real-world application of Derivative Free Optimization is minimizing the number
of dampers needed to reduce the damage that tall buildings cause to each other during
an earthquake in dense urban environments. Such events can put thousands of lives in
danger and have caused billions of dollars in damage. Dampers join adjacent buildings
together to reduce the chance that they collide and to increase overall structural
stability, but their upfront cost is extremely high, so it is desirable to minimize the
number of dampers required to bring a building’s risk of earthquake damage within
acceptable limits. Derivative Free Optimization, and specifically Model-Based Derivative
Free Optimization, is the optimal choice for solving
this kind of problem since the objective function is a “numerical integration” generated
through an earthquake simulator, so the optimizer does not have access to the
objective function’s gradients (Audet, 2017)2. Model-Based Derivative Free
Optimization consistently beats other methods, such as Genetic Algorithms and
Nelder-Mead, in the efficiency of earthquake-retrofit optimization since, in this
application, the objective function appears to be differentiable, which makes it an ideal
use case (Audet, 2017)2.
One application that relies heavily on derivative-free optimization is the medical
experimentation field, where it is often needed due to complex, noisy, or
non-differentiable objective functions, or the inability to acquire enough data to
construct a function at all. An example is cancer treatment, where chemotherapy
regimens involve 20+ independent variables with vast numbers of interdependent,
often non-linear, biological responses. Testing every combination of drugs, dosage
amounts, and interval timing between doses is not feasible; therefore a response
function cannot be built for gradient-based optimization. The Nelder-Mead (1965) and
Box (1965) methods are example derivative-free optimization algorithms that have
been used in this chemotherapy optimization problem. These are examples of Direct
Search Methods (DSM) that help narrow the field of variable combinations. A case
study can be found in Berenbaum MC, “Direct search methods in the optimisation of
cancer chemotherapy regimens (1990).”3 Nelder-Mead is a common optimizer option
in tools such as Excel, Python (SciPy) and Matlab.
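For instance, SciPy exposes Nelder-Mead through scipy.optimize.minimize; a minimal usage sketch on the quadratic from the numerical examples (the starting point is an arbitrary choice of ours):

```python
from scipy.optimize import minimize

# Minimize f(x) = (x - 2)^2 + 1 with Nelder-Mead: no gradients are required.
result = minimize(lambda x: (x[0] - 2) ** 2 + 1,
                  x0=[0.0],               # arbitrary starting guess
                  method="Nelder-Mead")
print(result.x[0])                        # converges near the minimum at x = 2
```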
== 5. Conclusion ==
Thanks to its robustness, Derivative-Free Optimization has emerged as a useful method
of solving complex optimization problems where traditional methods that require
derivatives to be available are not practical. This method of optimization is limited by its
high computational cost, sensitivity to noise that can produce imperfect results, and
lack of gradient information, which slows convergence. However, advancements in
computational power and algorithmic design are helping to mitigate these challenges,
making DFO methods more efficient and accessible.
There are numerous applications of Derivative-Free Optimization including in the oil,
architecture, and medical industries where solving optimization problems has
implications on global and local economies, the environment, and people’s health.
These examples demonstrate the usefulness of computational optimization to
humanity and its importance in our increasingly complex modern world. As the world
continues to grow, environments continue to change, and better data becomes
available, the methods of Derivative-Free Optimization will continue to evolve while
helping humanity solve many of its most important challenges.
== 6. References ==
[1] Audet, Charles & Dennis, J. (2000). Analysis of Generalized Pattern Searches.
SIAM Journal on Optimization, 13. doi:10.1137/S1052623400378742.
[2] Audet, C., & Hare, W. (2017). Derivative-Free and Blackbox Optimization. Springer.
[3] Berenbaum MC. Direct search methods in the optimisation of cancer chemotherapy
regimens. Br J Cancer. 1990 Jan;61(1):101-9. doi: 10.1038/bjc.1990.22. Erratum in: Br
J Cancer 1990 May;61(5):788. PMID: 2297481; PMCID: PMC1971319.
https://pmc.ncbi.nlm.nih.gov/articles/PMC1971319/
[4] Ciaurri, D.E., Mukerji, T., Durlofsky, L.J. (2011). Derivative-Free Optimization for Oil
Field Operations. In: Yang, XS., Koziel, S. (eds) Computational Optimization and
Applications in Engineering and Industry. Studies in Computational Intelligence, vol 359.
Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20986-4_2
[5] Hooke, R., Jeeves, T.A.: Direct search solution of numerical and statistical problems. Journal of the ACM 8(2), 212–229 (1961)