Quasi-Newton methods

From Cornell University Computational Optimization Open Textbook - Optimization Wiki
Jump to navigation Jump to search

Author: Jianmin Su (ChemE 6800 Fall 2020)

Quasi-Newton Methods are a kind of methods used to solve nonlinear optimization problems. They are based on Newton's method yet can be an alternative to Newton's method when the objective function is not twice-differentiable, which means the Hessian matrix is unavailable, or it is too expensive to calculate the Hessian matrix and its inverse.

Introduction

The first quasi-Newton algorithm was developed by W.C. Davidon in the mid1950s and it turned out to be a milestone in nonlinear optimization problems. He was trying to solve a long optimization calculation but he failed to get the result with the original method due to the low performance of computers. Thus he managed to build the quasi-Newton method to solve it. Later, Fletcher and Powell proved that the new algorithm was more efficient and more reliable than the other existing methods.

During the following years, numerous variants were proposed, include Broyden's method (1965), the SR1 formula (Davidon 1959, Broyden 1967), the DFP method (Davidon, 1959; Fletcher and Powell, 1963), and the BFGS method (Broyden, 1969; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970)[1].

In optimization problems, Newton's method uses first and second derivatives, gradient and the Hessian in multivariate scenarios, to find the optimal point, it is applied to a twice-differentiable function Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f(x)} to find the roots of the first derivative (solutions to Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f'(x)=0} ), also known as the stationary points of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f(x)} [2].

The iteration of Newton's method is usually written as: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_{k+1}=x_k-H^{-1}\cdot\bigtriangledown f(x_k) } , where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k } is the iteration number, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle H} is the Hessian matrix and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle H=[\bigtriangledown ^2 f(x_k)]}

Iteraton would stop when it satisfies the convergence criteria like Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {df \over dx}=0, ||\bigtriangledown f(x)||<\epsilon \text{ or } |f(x_{k+1})-f(x_k)|<\epsilon }

Though we can solve an optimization problem quickly with Newton's method, it has two obvious disadvantages:

  1. The objective function must be twice-differentiable, and the Hessian matrix must be positive definite.
  2. The calculation is costly because it requires to compute the Jacobian matrix, Hessian matrix, and its inverse, which is time-consuming when dealing with a large-scale optimization problem.

However, we can use Quasi-Newton methods to avoid these two disadvantages.

Quasi-Newton methods are similar to Newton's method, but with one key idea that is different, they don't calculate the Hessian matrix. They introduce a matrix Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B} to estimate the Hessian matrix instead so that they can avoid the time-consuming calculations of the Hessian matrix and its inverse. And there are many variants of quasi-Newton methods that simply depend on the exact methods they use to estimate the Hessian matrix.

Theory and Algorithm

To illustrate the basic idea behind quasi-Newton methods, we start with building a quadratic model of the objective function at the current iterate Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_k} :

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle m_k(p)=f_k(p)+\bigtriangledown f_k^T(p)+\frac{1}{2}p^TB_kp} (1.1),

where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_k } is an Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n\times n } symmetric positive definite matrix that will be updated at every iteration.

The minimizer of this convex quadratic model is:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p_k=-B_k^{-1}\bigtriangledown f_k } (1.2),

which is also used as the search direction.

Then the new iterate could be written as: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_{k+1}=x_{k}+\alpha _kp_k} (1.3),

where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \alpha _k } is the step length that should satisfy the Wolfe conditions. The iteration is similar to Newton's method, but we use the approximate Hessian Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k}} instead of the true Hessian.

To maintain the curve information we got from the previous iteration in Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}} , we generate a new iterate Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_{k+1}} and new quadratic modelto in the form of:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle m_{k+1}(p)=f_{k+1}+\bigtriangledown f_{k+1}^Tp+\frac{1}{2}p^TB_{k+1}p} (1.4).

To construct the relationship between 1.1 and 1.4, we require that in 1.1 at Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p=0} the function value and gradient match Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f_k} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \bigtriangledown f_k} , and the gradient of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle m_{k+1}} should match the gradient of the objective function at the latest two iterates Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_k} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_{k+1}} , then we can get:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \bigtriangledown m_{k+1}(-\alpha _kp_k)=\bigtriangledown f_{k+1}-\alpha _kB_{k+1}p_k=\bigtriangledown f_k } (1.5)

and with some arrangements:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}\alpha _k p_k=\bigtriangledown f_{k+1}-\bigtriangledown f_k } (1.6)

Define:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s_k=x_{k+1}-x_k} , Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle y_k=\bigtriangledown f_{k+1}-\bigtriangledown f_k} (1.7)

So that (1.6) becomes: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}s_k=y_k } (1.8), which is the secant equation.

To make sure Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}} is still a symmetric positive definite matrix, we need Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s_k^Ts_k>0} (1.9).

To further preserve properties of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}} and determine Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}} uniquely, we assume that among all symmetric matrices satisfying secant equation, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}} is closest to the current matrix Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_k} , which leads to a minimization problem:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}=\underset{B}{min}||B-B_k|| } (1.10) s.t. Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B=B^T} , Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Bs_k=y_k} ,

where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s_k} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle y_k} satisfy (1.9) and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_k} is symmetric and positive definite.

Different matrix norms applied in (1.10) results in different quasi-Newton methods. The weighted Frobenius norm can help us get an easy solution to the minimization problem: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle ||A||_W=||W^\frac{1}{2}AW^\frac{1}{2}|| _F} (1.11).

The weighted matrix Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle W} can be any matrix that satisfies the relation Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Wy_k=s_k} .

We skip procedures of solving the minimization problem (1.10) and here is the unique solution of (1.10):

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}=(I-\rho y_ks_k^T)B_k(I-\rho s_ky_k^T)+\rho y_ky_k^T} (1.12)

where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \rho=\frac{1}{y_k^Ts_k}} (1.13)

Finally, we get the updated Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}} . However, according to (1.2) and (1.3), we also need the inverse of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}} in next iterate.

To get the inverse of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}} , we can apply the Sherman-Morrison formula to avoid complicated calculation of inverse.

Set Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle M_k=B_k^{-1} } , with Sherman-Morrison formula we can get:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle M_{k+1}=M_k+\frac{s_k s_k^T}{s_k^T y_k}-\frac{M_k y_k y_k^T M_k}{y_k^T M_k y_k} } (1.14)

With the derivation[3] above, we can now understand how do quasi-Newton methods get rid of calculating the Hessian matrix and its inverse. We can directly estimate the inverse of Hessian, and we can use (1.14) to update the approximation of the inverse of Hessian, which leads to the DFP method, or we can directly estimate the Hessian matrix, and this is the main idea in the BFGS method.


DFP method

The DFP method, which is also known as the Davidon–Fletcher–Powell formula, is named after W.C. Davidon, Roger Fletcher, and Michael J.D. Powell. It was proposed by Davidon in 1959 first and then improved by Fletched and Powell. DFP method uses an Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n\times n } symmetric positive definite matrix Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_k } to estimate the inverse of Hessian matrix and its algorithm is shown below[4].

DFP Algorithm

To avoid confusion, we use Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle D} to represent the approximation of the inverse of the Hessian matrix.

  1. Given the starting point Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_0} ; convergence tolerance Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \epsilon, \epsilon>0} ; the initial estimation of inverse Hessian matrix Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle D_0=I} ; Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k=0} .
  2. Compute the search direction Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d_k=-D_k\cdot \bigtriangledown f_k} .
  3. Compute the step length Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \lambda_k} with a line search procedure that satisfies Wolfe conditions. And then set
    Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s_k={\lambda}_k d_k} ,
    Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_{k+1}=x_k+s_k}
  4. If Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle ||\bigtriangledown f_{k+1}||<\epsilon} , then end of the iteration, otherwise continue step5.
  5. Computing Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle y_k=g_{k+1}-g_k} .
  6. Update the Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle D_{k+1}} with
    Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle D_{k+1}=D_k+\frac{s_k s_k^T}{s_k^T y_k}-\frac{D_k y_k y_k^T D_k}{y_k^T D_k y_k} }
  7. Update Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k} with Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k=k+1} and go back to step2.


BFGS method

BFGS method is named for its four discoverers Broyden, Fletcher, Goldfarb, and Shanno. It is considered the most effective quasi-Newton algorithm. Unlike the DFP method, the BFGS method uses an Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n\times n } symmetric positive definite matrix Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_k } to estimate the Hessian matrix[5].

BFGS Algorithm

  1. Given the starting point Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_0} ; convergence tolerance Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \epsilon, \epsilon>0} ; the initial estimation of Hessian matrix Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_0=I} ; Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k=0} .
  2. Compute the search direction Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d_k=-B_k^{-1}\cdot \bigtriangledown f_k} .
  3. Compute the step length Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \lambda_k} with a line search procedure that satisfies Wolfe conditions. And then set
    Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s_k={\lambda}_k d_k} ,
    Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_{k+1}=x_k+s_k}
  4. If Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle ||\bigtriangledown f_{k+1}||<\epsilon} , then end of the iteration, otherwise continue step5.
  5. Computing Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle y_k=\bigtriangledown f_{k+1}-\bigtriangledown f_k} .
  6. UpdateFailed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}} with Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}=B_k+\frac{y_k y_k^T}{y_k^T s_k}-\frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} }
    Since we need to update Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}^{-1}} , we can apply the Sherman-Morrison formula to avoid complicated calculation of inverse.
    With Sherman-Morrison formula, we can update Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}^{-1}} with
    Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_{k+1}^{-1}=(I-\rho s_ky_k^T)B_k^{-1}(I-\rho y_ks_k^T)+\rho s_ks_k^T} , Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \rho=\frac{1}{y_k^Ts_k}}
  7. Update Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k} with Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k=k+1} and go back to step2.

Numerical Example

The following is an example to show how to solve an unconstrained nonlinear optimization problem with the DFP method.

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \text{min }\begin{align} f(x_1, x_2) & = x_1^2 +\frac{1}{2}x_2^2+3\end{align}}

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_0=(1,2)^T }

Step 1:

Usually, we set the approximation of the inverse of the Hessian matrix as an identity matrix with the same dimension as the Hessian matrix. In this case, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_0} is a Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 2\times2} identity matrix.

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_0} : Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \bigtriangledown f_x} : Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{pmatrix} 2x_1 \\ x_2 \end{pmatrix}}

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \epsilon=10^{-5}}

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k=0}

For convenience, we can set .

Step 2:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d_0=-B_0^{-1}\bigtriangledown f_0} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle =-\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{pmatrix} 2 \\ 2 \end{pmatrix}} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle =\begin{pmatrix} -2 \\ -2 \end{pmatrix}}

Step 3:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s_0=d_0}

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_1=x_0+s_0} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle =\begin{pmatrix} 1 \\ 2 \end{pmatrix}} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle +\begin{pmatrix} -2 \\ -2 \end{pmatrix}} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle =\begin{pmatrix} -1 \\ 0 \end{pmatrix}}

Step 4:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \bigtriangledown f_0} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle =\begin{pmatrix} -2 \\ 0 \end{pmatrix}}

Since Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle |\bigtriangledown f_0|} is not less than Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \epsilon} , we need to continue.

Step 5:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle y_0=\bigtriangledown f_1-\bigtriangledown f_0} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle =\begin{pmatrix} -4 \\ -2 \end{pmatrix}}

Step 6: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_1=B_0+\frac{s_0 s_0^T}{s_0^T y_0}-\frac{D_0 y_0 y_0^T D_0}{y_0^T D_0 y_0} } Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle =\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle +\frac{1}{12}\begin{pmatrix} 4 & 4 \\ 4 & 4 \end{pmatrix}} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle -\frac{1}{20}\begin{pmatrix} 16 & 8 \\ 8 & 4 \end{pmatrix}} Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle =\begin{pmatrix} 0.53333 & -0.0667 \\ -0.0667 & 1.1333 \end{pmatrix}}

And then go back to Step 2 with the update Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_1} to start a new iterate until Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle |\bigtriangledown f_k|<\epsilon} .

We continue the rest of the steps in python and the results are listed below:

Iteration times: 0 Result:[-1. 0.]

Iteration times: 1 Result:[ 0.06666667 -0.13333333]

Iteration times: 2 Result:[0.00083175 0.01330805]

Iteration times: 3 Result:[-0.00018037 -0.00016196]

Iteration times: 4 Result:[ 3.74e-06 -5.60e-07]

After four times of iteration, we finally get the optimal solution, which can be assumed as Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_1=0, x_2=0} and the minimum of the objective function is 3.

As we can see from the calculation in Step 6, though the updated formula for Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_1} looks complicated, it's actually not. We can see results of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s_0^T y_0} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle y_0^T D_0 y_0} are constant numbers and results of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s_0 s_0^T} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle D_0 y_0 y_0^T D_0} are matrix that with the same dimension as Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle B_1} . Therefore, the calculation of quasi-Newton methods is faster and simpler since it's related to some basic matrix calculations like inner product and outer product.

Application

Quasi-newton methods are applied to various areas such as physics, biology, engineering, geophysics, chemistry, and industry to solve the nonlinear systems of equations because of their faster calculation. The ICUM (Inverse Column-Updating Method), one type of quasi-Newton methods, is not only efficient in solving large scale sparse nonlinear systems but also perfumes well in not necessarily large-scale systems in real applications. It is used to solve the Two-pint ray tracing problem in geophysics. A two-point ray tracing problem consists of constructing a ray that joins two given points in the domain and it can be formulated as a nonlinear system. ICUM can also be applied to estimate the transmission coefficients for AIDS and for Tuberculosis in Biology, and in Multiple target 3D location airborne ultrasonic system. [6]

Moreover, they can be applied and developed into the Deep Learning area as sampled quasi-Newton methods to help make use of more reliable information.[7] The methods they proposed sample points randomly around the current iterate at each iteration to create Hessian or inverse Hessian approximations, which is different from the classical variants of quasi-Newton methods. As a result, the approximations constructed make use of more reliable (recent and local) information and do not depend on past iterate information that could be significantly stale. In their work, numerical tests on a toy classification problem and on popular benchmarking neural network training tasks show that the methods outperform their classical variants.

Besides, to make quasi-Newton methods more available, they are integrated into programming languages so that people can use them to solve nonlinear optimization problems conveniently, for example, Mathematic (quasi-Newton solvers), MATLAB (Optimization Toolbox), R, SciPy extension to Python.

Conclusion

Quasi-Newton methods are a milestone in solving nonlinear optimization problems, they are more efficient than Newton's method in large-scale optimization problems because they don't need to compute second derivatives, which makes calculation less costly. Because of their efficiency, they can be applied to different areas and remain appealing.

References

  1. Hennig, Philipp, and Martin Kiefel. "Quasi-Newton method: A new direction." Journal of Machine Learning Research 14.Mar (2013): 843-865.
  2. Newton’s Method, 8.Dec (2020)‎. Retrieved from: https://en.wikipedia.org/wiki/Quasi-Newton_method
  3. Nocedal, Jorge, and Stephen Wright. Numerical optimization. Springer Science & Business Media, 2006.
  4. Davidon–Fletcher–Powell formula, 7.June (2020). Retrieved from: https://en.wikipedia.org/wiki/Davidon%E2%80%93Fletcher%E2%80%93Powell_formula
  5. Broyden–Fletcher–Goldfarb–Shanno algorithm, 12.Dec (2020). Retrieved from: https://en.wikipedia.org/wiki/Broyden%E2%80%93Fletcher%E2%80%93Goldfarb%E2%80%93Shanno_algorithm
  6. Pérez, Rosana, and Véra Lucia Rocha Lopes. "Recent applications and numerical implementation of quasi-Newton methods for solving nonlinear systems of equations." Numerical Algorithms 35.2-4 (2004): 261-285.
  7. Berahas, Albert S., Majid Jahani, and Martin Takáč. "Quasi-newton methods for deep learning: Forget the past, just sample." arXiv preprint arXiv:1901.09997 (2019).