Derivative-free optimization

From Cornell University Computational Optimization Open Textbook - Optimization Wiki

Author: Brian Holcomb (bh597), Alejandra Chaparro Ramirez (mac593), Brian Worts (blw87), Trenton Berardi (tnb38), Ifeloluwa Osilaja (io62) (ChemE 6800 Fall 2024)

Stewards: Nathan Preuss, Wei-Han Chen, Tianqi Xiao, Guoqing Hu


Introduction

Derivative-Free Optimization (DFO) is a collection of computational methods used to find optimal values in situations where traditional, derivative-based optimization techniques are impractical or impossible to apply; it is a form of black-box optimization (Audet, 2017)[1]. Instead of requiring derivatives of an objective function, DFO methods iteratively search through a set of evaluated points to approximate an optimal solution. This approach is especially useful for black-box models where derivatives are unavailable, noisy, or computationally expensive to compute, as is common in simulations, machine learning applications, and physical system modeling (Audet, 2017)[1].

The motivation behind DFO emerged from a need to solve complex optimization problems in fields like engineering, finance, and logistics, where objective functions are often opaque or inaccessible for differentiation (Conn, 2009)[2]. Since the mid-20th century, methods like pattern search and genetic algorithms have evolved, providing researchers with tools to tackle optimization in high-dimensional, uncertain environments. In fact, until relatively recently, computing derivatives was the single most common source of user error in applying optimization software (Conn, 2009)[2]. The primary goal of studying DFO is to enable efficient optimization when traditional methods fall short, particularly for models that are too complex or costly for gradient-based methods. DFO techniques have proven essential in areas like supply chain planning, engineering design, and financial modeling, where optimization can significantly impact operational efficiency and decision-making.

Algorithm Description

There are multiple methods of Derivative-Free Optimization available. While DFO algorithms do not require calculating the derivative of the objective function, they narrow the examined range of the objective function over multiple iterations by selecting points along the function and determining whether the local minimum lies between them. Two such methods are Golden Section Search and Successive Parabolic Interpolation. Both operate in a similar fashion, evaluating the objective function at a small number of trial points within a bracketing interval and using the comparisons to narrow the examination window from which points are selected in the next iteration. The main difference is how these points are selected and how they are evaluated.

Golden Section Search

Figure: Example of the Golden Section Search algorithm through three iterations on a sample function.

Golden Section Search gets its name from the spacing of the points selected along the objective function, which is determined by the golden ratio.

Step 1: Identify a function, $f(x)$, with the objective to find the minimum value over a given range.

Step 2: Select two points on either side of the function's minimum. The first point, $x_1$, will be on the left side of the minimum and the second point, $x_2$, will be on the right side of the minimum. The section of the function $f(x)$ between points $x_1$ and $x_2$ forms the examination window.

Step 3: Select two more points, named $x_3$ and $x_4$, as per the equations below. These are spaced within the examination window between points $x_1$ and $x_2$ by the golden ratio, defined via the constant $\varphi$:

$x_3 = x_2 - \frac{x_2 - x_1}{\varphi}, \qquad x_4 = x_1 + \frac{x_2 - x_1}{\varphi}, \qquad \varphi = \frac{1 + \sqrt{5}}{2} \approx 1.618$

Step 4: Evaluate and compare the objective function at $x_3$ and $x_4$.

Step 4a: If $f(x_3) < f(x_4)$, the minimum lies in $[x_1, x_4]$, so restrict the range by setting $x_2 = x_4$.
Step 4b: If $f(x_3) > f(x_4)$, the minimum lies in $[x_3, x_2]$, so restrict the range by setting $x_1 = x_3$.

Step 5: Repeat step 3 and step 4 until one of the following criteria is met:

  1. The number of iterations surpasses the target number, $N$.
  2. The difference between variables $x_1$ and $x_2$ is smaller than a set tolerance, such as 1E-4.


Note: This algorithm will not produce the exact value of the function minimum. It predicts that the minimum lies between $x_1$ and $x_2$.
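
These steps translate directly into code. The following is a minimal Python sketch of the procedure; the test function $f(x) = (x-2)^2$ and the starting bracket $[0, 5]$ are illustrative assumptions, not part of the algorithm.

```python
def golden_section_search(f, x1, x2, tol=1e-4, max_iter=100):
    """Narrow a bracket [x1, x2] around the minimum of a unimodal f
    (steps 2-5 above)."""
    phi = (1 + 5 ** 0.5) / 2              # golden ratio constant
    for _ in range(max_iter):             # criterion 1: iteration cap N
        if abs(x2 - x1) < tol:            # criterion 2: bracket is small
            break
        x3 = x2 - (x2 - x1) / phi         # interior points (step 3)
        x4 = x1 + (x2 - x1) / phi
        if f(x3) < f(x4):                 # step 4a: minimum in [x1, x4]
            x2 = x4
        else:                             # step 4b: minimum in [x3, x2]
            x1 = x3
    return (x1 + x2) / 2                  # midpoint of the final bracket

# Illustrative test problem (an assumption, not from the text)
f = lambda x: (x - 2) ** 2
print(golden_section_search(f, 0.0, 5.0))  # ≈ 2.0
```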

Figure: Flow chart describing the Golden Section Search algorithm.

Successive Parabolic Interpolation

This algorithm builds a parabola defined by three points selected on a function $f(x)$. The shape of the parabola is used to narrow the examined range of the function and locate the function minimum. While the derivative of the objective function is not utilized, the derivative of the parabola is.

Step 1: Identify a function, $f(x)$, with the objective to find the minimum value.

Step 2: Identify three points on the function, $x_1$, $x_2$, $x_3$, with $x_1 < x_2 < x_3$. Two of the points act as the outer bounds of the examination region and the third lies between them.

Step 3: Form a parabola between the three points identified on function $f(x)$. The parabola can be defined using the Lagrange Interpolation Formula:

$P(x) = f(x_1)\,\frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} + f(x_2)\,\frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} + f(x_3)\,\frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)}$

Step 4: The minimum of the parabola occurs where its first derivative equals zero, i.e. at the point $x_0$ satisfying $P'(x_0) = 0$:

$x_0 = x_2 - \frac{1}{2} \cdot \frac{(x_2 - x_1)^2 \left[ f(x_2) - f(x_3) \right] - (x_2 - x_3)^2 \left[ f(x_2) - f(x_1) \right]}{(x_2 - x_1) \left[ f(x_2) - f(x_3) \right] - (x_2 - x_3) \left[ f(x_2) - f(x_1) \right]}$

This $x_0$ points toward where the minimum of the function $f(x)$ is located.

Step 5: Compare the value of $f(x_0)$ to $f(x_2)$ and determine the next step for iteration.

Step 5a: If $f(x_0) < f(x_2)$ (with $x_0 < x_2$), shrink the examination range to $[x_1, x_2]$, with $x_0$ as the new middle point.
Step 5b: If $f(x_0) > f(x_2)$ (with $x_0 < x_2$), shrink the examination range to $[x_0, x_3]$, with $x_2$ as the new middle point. The mirrored updates apply when $x_0 > x_2$.

Step 6: Repeat steps 3 through 5 until one of the following criteria is met:

  1. The number of iterations surpasses the target number, $N$.
  2. The difference between variables $x_1$ and $x_3$ is smaller than a set tolerance, such as 1E-4.


Note: This algorithm will not produce the exact value of the function minimum. It predicts that the minimum lies between $x_1$ and $x_3$.
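
The vertex formula and interval updates above can likewise be sketched in Python. The bracket-update cases mirror steps 5a and 5b (plus their mirror images for $x_0 > x_2$); the sample function and starting points are assumptions chosen for illustration.

```python
import math

def successive_parabolic(f, x1, x2, x3, tol=1e-4, max_iter=100):
    """Sketch of steps 3-5: refit a parabola through (x1, x2, x3),
    jump to its vertex x0, and shrink the bracket around the minimum."""
    for _ in range(max_iter):
        if abs(x3 - x1) < tol:                 # stopping criterion 2
            break
        f1, f2, f3 = f(x1), f(x2), f(x3)
        den = (x2 - x1) * (f2 - f3) - (x2 - x3) * (f2 - f1)
        if den == 0:                           # degenerate parabola; stop
            break
        num = (x2 - x1) ** 2 * (f2 - f3) - (x2 - x3) ** 2 * (f2 - f1)
        x0 = x2 - 0.5 * num / den              # vertex of the parabola (step 4)
        if abs(x0 - x2) < tol:                 # practical guard: tiny step
            return min(x0, x2, key=f)
        if x0 < x2:
            if f(x0) < f2:                     # step 5a: keep [x1, x2], middle x0
                x3, x2 = x2, x0
            else:                              # step 5b: keep [x0, x3], middle x2
                x1 = x0
        else:                                  # mirrored updates for x0 > x2
            if f(x0) < f2:
                x1, x2 = x2, x0
            else:
                x3 = x0
    return x2

# Illustrative test problem (an assumption, not from the text)
f = lambda x: math.exp(x) - 2 * x              # minimum at x = ln 2
print(successive_parabolic(f, 0.0, 1.0, 2.0))  # ≈ 0.6931
```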

Figure: Flow chart describing the Successive Parabolic Interpolation algorithm.

Numerical Examples

Example 1: Golden Section Search

Figure: Example of Golden Section Search on a sample function.

Suppose there is a function $f(x)$, and the aim is to find its minimum within the interval $[x_1, x_2]$.

Initial Setup:

Set $x_1$ and $x_2$ to the endpoints of the search interval
Define the golden ratio constant $\varphi = \frac{1 + \sqrt{5}}{2} \approx 1.618$
Calculate two internal points within $[x_1, x_2]$: $x_3 = x_2 - \frac{x_2 - x_1}{\varphi}$ and $x_4 = x_1 + \frac{x_2 - x_1}{\varphi}$

Iterative Evaluation:

Evaluate $f(x_3)$ and $f(x_4)$
If $f(x_3) < f(x_4)$, update the interval to $[x_1, x_4]$ and set $x_2 = x_4$
Recalculate:
New $x_3 = x_2 - \frac{x_2 - x_1}{\varphi}$
New $x_4 = x_1 + \frac{x_2 - x_1}{\varphi}$

Convergence:

Repeat this process, recalculating $x_3$ and $x_4$ and updating the interval, until the difference between $x_1$ and $x_2$ is less than 1E-4.
After a few iterations, the interval narrows around the minimizer $x^*$, where the function attains its minimum value $f(x^*)$.
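
To make one iteration concrete, the snippet below computes the first pair of interior points under an assumed starting bracket of $[0, 5]$ (the bracket is an illustrative assumption):

```python
phi = (1 + 5 ** 0.5) / 2       # golden ratio ≈ 1.618
x1, x2 = 0.0, 5.0              # assumed starting bracket
x3 = x2 - (x2 - x1) / phi      # ≈ 1.9098
x4 = x1 + (x2 - x1) / phi      # ≈ 3.0902
# If f(x3) < f(x4), the new bracket is [x1, x4]; each iteration
# multiplies the bracket width by 1/phi ≈ 0.618.
```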

Example 2: Successive Parabolic Interpolation

Consider the same function $f(x)$ and aim to find its minimum within an interval. Start with initial points $x_1$, $x_2$, and $x_3$, with $x_1 < x_2 < x_3$.

Initial Setup:

Select points: $x_1$, $x_2$, and $x_3$
Evaluate $f(x_1)$, $f(x_2)$, and $f(x_3)$

Parabolic Interpolation:

Construct a parabola through the points $(x_1, f(x_1))$, $(x_2, f(x_2))$, and $(x_3, f(x_3))$.
Using the formula for $x_0$ where $P'(x_0) = 0$, calculate:
$x_0 = x_2 - \frac{1}{2} \cdot \frac{(x_2 - x_1)^2 \left[ f(x_2) - f(x_3) \right] - (x_2 - x_3)^2 \left[ f(x_2) - f(x_1) \right]}{(x_2 - x_1) \left[ f(x_2) - f(x_3) \right] - (x_2 - x_3) \left[ f(x_2) - f(x_1) \right]}$
Substituting the values gives an estimate $x_0$ that is close to the minimum.

Updating the Interval:

Depending on how $x_0$ compares with $x_2$ and on the values $f(x_0)$ and $f(x_2)$, shrink the interval to either $[x_1, x_2]$ or $[x_2, x_3]$ as needed.
Set new points: the retained bounds together with $x_0$ as the new interior point.

Convergence:

Repeat the iteration, adjusting points and recalculating $x_0$, until the difference between $x_1$ and $x_3$ is less than 1E-4.
The result converges to the point where the function minimum is located.
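
For illustration, the snippet below computes a first vertex estimate from three assumed starting points on an assumed sample function $f(x) = e^x - 2x$, whose true minimizer is $\ln 2 \approx 0.6931$:

```python
import math

f = lambda x: math.exp(x) - 2 * x              # assumed sample function
x1, x2, x3 = 0.0, 1.0, 2.0                     # assumed starting points
f1, f2, f3 = f(x1), f(x2), f(x3)

# Vertex of the parabola through the three points (step 4 formula)
num = (x2 - x1) ** 2 * (f2 - f3) - (x2 - x3) ** 2 * (f2 - f1)
den = (x2 - x1) * (f2 - f3) - (x2 - x3) * (f2 - f1)
x0 = x2 - 0.5 * num / den
print(x0)                                      # ≈ 0.595, first estimate of ln 2
```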


Applications

Oil and Gas

Derivative-Free Optimization (DFO) has a number of real-world applications, including in the oil industry, where cost functions and constraints often require fluid-mechanics simulations from which gradient information cannot be obtained efficiently. This is an important problem to solve: oil and natural gas make up a significant share of the world's current energy supply, so maximizing efficiency has implications for the global economy and the environment (IEA, 2021)[3]. Specifically, DFO can be used, among other techniques, to optimize oil production. This is done by creating discretized simulations of very complex oil–water systems that incorporate as many known variables as possible, such as fluid mechanics and seismic effects (Ciaurri, 2011)[4]. The minimization problem and its constraints then become well defined, and a systematic derivative-free search of the design space becomes possible. Two such DFO methods are generalized pattern search and the Hooke–Jeeves direct search (Hooke, 1961)[5]. Generalized pattern search is an algorithm for "derivative-free unconstrained optimization" divided into "search" and "poll" phases (Audet, 2000)[6]. In the search phase, a finite number of points on a mesh are evaluated to find a minimum. In the poll phase, neighboring mesh points are polled to find a better minimum before the process is repeated on a more refined mesh or with an iterated search. The Hooke–Jeeves direct search is a "compass-based pattern search method" with an exploratory mode and a pattern mode, similar to generalized pattern search (Hooke, 1961)[5]. In the exploratory mode, the local area is searched to refine the estimate of the optimum, while the pattern mode identifies patterns in the exploratory moves, yielding a better guess at the direction of the optimum (Hooke, 1961)[5]. Using these search methods, Ciaurri et al. determined the optimal control scheme for oil production: they simulated with constraints on water quantity, water-to-oil ratio, and resulting oil output while searching for the optimal water pressure that minimizes cost (Ciaurri, 2011)[4]. They varied the configurations and used variants of generalized pattern search and Hooke–Jeeves direct search to explore the design space and locate optima. This work produced a feasible and optimal configuration scheme, expected profits, an increase in oil production, and a decrease in water consumption, showing how DFO computational techniques can be applied to improve industries.
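
As a rough sketch of the poll-and-refine idea behind these pattern search methods, the following Python code implements a basic compass search; the quadratic objective stands in for an expensive reservoir simulation and, like the step sizes, is purely an illustrative assumption.

```python
import numpy as np

def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=1000):
    """Minimal compass (pattern) search: poll +/- each coordinate,
    move to the first improving point, otherwise shrink the step."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:                       # mesh is fine enough; stop
            break
        improved = False
        for d in range(x.size):              # poll phase: 2n compass directions
            for sign in (+1.0, -1.0):
                trial = x.copy()
                trial[d] += sign * step
                ft = f(trial)
                if ft < fx:                  # accept first improving poll point
                    x, fx, improved = trial, ft, True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5                      # refine the mesh and poll again
    return x, fx

# Toy stand-in for an expensive reservoir simulation (illustrative only)
f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 0.5) ** 2
x_best, f_best = compass_search(f, x0=[0.0, 0.0])
print(x_best, f_best)                        # ≈ [1.0, -0.5], ≈ 0.0
```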

Engineering

Another real-world application of DFO is minimizing the number of seismic dampers needed to reduce the damage that tall buildings in dense urban environments inflict on one another during an earthquake. Such events can put thousands of lives in danger and have caused billions of dollars in damage (USGS, 2023)[7]. Shock-absorbing dampers, originally developed by NASA for the Apollo program, are used in building construction to join adjacent buildings together, which reduces the chance that the buildings collide while increasing overall structural stability (NASA, 2015)[8]. The upfront cost of seismic dampers is extremely high, so it is desirable to minimize the number of dampers required to bring a building's risk of earthquake damage within acceptable limits. Thus DFO, and specifically model-based DFO, is well suited to this kind of problem, since the objective function is a "numerical integration" generated by an earthquake simulator, leaving the optimizer without access to objective-function gradients (Audet, 2017)[1]. In this earthquake retrofitting setting, model-based DFO consistently beats other methods, such as genetic algorithms and Nelder–Mead, in efficiency, since the objective function appears to be differentiable, making it an ideal use case (Audet, 2017)[1]. Other optimization applications in this field, such as minimizing peak floor accelerations in multi-story buildings subjected to earthquakes, may also benefit from DFO, as it can resolve the issues of noisy or discontinuous objective functions that arise in these applications (Ortega, 2024)[9].

Medical

One field that relies heavily on DFO is medical experimentation, where it is often needed due to complex, noisy, or non-differentiable functions, or the inability to acquire enough data to construct a response function at all. An example is cancer treatment, where chemotherapy regimens involve more than 20 independent variables with vast numbers of interdependent biological responses, many of which are non-linear. Testing every combination of chemotherapy agent, dosage amount, and interval between doses is not feasible, so no response function can be built for gradient-based optimization. The methods of Nelder and Mead (1965) and Box (1964) are example DFO algorithms that have been used on this chemotherapy optimization problem. These are Direct Search Methods (DSMs) that help narrow the field of variable combinations. A case study can be found in Berenbaum, "Direct search methods in the optimisation of cancer chemotherapy regimens" (1990)[10]. Nelder–Mead is a common optimizer option in widely used tools such as Excel, Python (SciPy)[11], and MATLAB.
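
As a minimal illustration of how Nelder–Mead is invoked in such tools, the SciPy call below minimizes a toy stand-in for a treatment-response model; the function, variables, and starting point are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for a noisy, non-differentiable treatment-response model;
# the variables (dose, interval) and the function itself are assumptions.
def response(x):
    dose, interval = x
    return (dose - 2.0) ** 2 + abs(interval - 5.0)  # non-smooth term

result = minimize(response, x0=np.array([1.0, 1.0]), method="Nelder-Mead")
print(result.x, result.fun)                          # ≈ [2.0, 5.0], ≈ 0.0
```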

Conclusion

Thanks to its robustness, Derivative-Free Optimization (DFO) has emerged as a useful approach for solving complex optimization problems where traditional, derivative-dependent methods are impractical. DFO is limited by high computational cost, sensitivity to noise that leads to imperfect results, and the lack of gradient information, which leads to slower convergence. However, advancements in computational power and algorithmic design are helping to mitigate these challenges, making DFO methods more efficient and accessible. There are numerous applications of DFO, including in the oil, architecture, and medical industries, where solving optimization problems has implications for global and local economies, the environment, and people's health.

These examples demonstrate the value of computational optimization to humanity and its importance in our increasingly complex modern world. As the world continues to grow, environments continue to change, and better data becomes available, the methods of DFO will continue to evolve while helping humanity solve many of its most important challenges.

References

  1. Audet, C., & Hare, W. (2017). Derivative-Free and Blackbox Optimization. Springer.
  2. Conn, A. R., Scheinberg, K., & Vicente, L. N. (2009). Introduction to Derivative-Free Optimization (pp. 1–12). Society for Industrial and Applied Mathematics. https://doi.org/10.1137/1.9780898718768.ch1
  3. IEA (2021). World Energy Balances: Overview. IEA, Paris. https://www.iea.org/reports/world-energy-balances-overview. Licence: CC BY 4.0.
  4. Ciaurri, D. E., Mukerji, T., & Durlofsky, L. J. (2011). Derivative-Free Optimization for Oil Field Operations. In: Yang, X.-S., & Koziel, S. (eds.), Computational Optimization and Applications in Engineering and Industry. Studies in Computational Intelligence, vol. 359. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20986-4_2
  5. Hooke, R., & Jeeves, T. A. (1961). Direct search solution of numerical and statistical problems. Journal of the ACM, 8(2), 212–229.
  6. Audet, C., & Dennis, J. (2000). Analysis of Generalized Pattern Searches. SIAM Journal on Optimization, 13. https://doi.org/10.1137/S1052623400378742
  7. U.S. Geological Survey (USGS). (2023, April 18). New USGS–FEMA study highlights economic earthquake risk in the United States. https://www.usgs.gov/news/national-news-release/new-usgs-fema-study-highlights-economic-earthquake-risk-united-states
  8. NASA Spinoff. (2015). Shock absorbers save structures and lives during earthquakes. https://spinoff.nasa.gov/Spinoff2015/ps_2.html
  9. Ortega, M., Lopez-Garcia, D., & Fermandois, G. (2024). Optimal properties of hysteretic dampers to minimize peak floor accelerations in multi-story buildings subjected to earthquakes. Structures, 70, 107862. https://doi.org/10.1016/j.istruc.2024.107862
  10. Berenbaum, M. C. (1990). Direct search methods in the optimisation of cancer chemotherapy regimens. British Journal of Cancer, 61(1), 101–109. https://doi.org/10.1038/bjc.1990.22. Erratum in: British Journal of Cancer, 61(5), 788. PMID: 2297481; PMCID: PMC1971319. https://pmc.ncbi.nlm.nih.gov/articles/PMC1971319/
  11. Virtanen, P., et al. (2020). SciPy: Open Source Scientific Tools for Python. https://docs.scipy.org/doc/scipy/reference/optimize.minimize-neldermead.html. Accessed 15 Dec. 2024.