# Interior-point method for NLP

Author: Apoorv Lal (CHEME 6800, Fall 2021)

## Introduction

Interior-point (IP) methods are used to solve all kinds of problems, from linear to nonlinear and from convex to nonconvex. The first known IP method is Frisch's (1955) logarithmic barrier method, which was later studied extensively by Fiacco and McCormick [1]. However, these methods mainly emerged in the late 1970s and 1980s, when they were developed as algorithms for solving linear programming (LP) problems with better worst-case complexity than the simplex method [2]. In 1984, Karmarkar used IP ideas to develop 'A New Polynomial-Time Algorithm for Linear Programming' [3]. The main advantage of his method was better practical performance than earlier algorithms such as Khachiyan's ellipsoid method [4]. After Karmarkar's publication, more intensive research in this field led to the development of a new category of IP methods known as 'primal-dual' methods. Currently, the most powerful primal-dual codes are Ipopt, KNITRO and LOQO [5]. In recent years the majority of IP research has targeted nonlinear optimisation problems, especially second-order cone optimisation (SOCO) and semidefinite optimisation (SDO). SDO has a wide range of applications in combinatorial optimisation, control, structural engineering and electrical engineering [6].

## Theory & Methodology

### Nonlinear Constrained Optimisation

The general form of a nonlinear constrained optimization problem is shown in Equation set 1:

${\displaystyle minf(x)}$

${\displaystyle s.t.c(x)=0}$

${\displaystyle x\geqslant 0}$

• The objective function and constraints can be linear, quadratic or general nonlinear functions.
• Typically applied to convex optimization problems.
• Convert maximisation problems to minimisation problems by negating the objective function.
• Convert general inequalities to the simple bound ${\displaystyle x\geqslant 0}$ using slack variables.

### Slack variables

Consider a general problem of the form shown in Equation set 2:

${\displaystyle minf(x)}$

${\displaystyle s.t.g(x)\geqslant b}$

${\displaystyle h(x)=0}$

${\displaystyle x\geqslant 0}$

Now we need to convert this form to the form shown in Equation set 1. The inequality '${\displaystyle g(x)\geqslant b}$' can be written as ${\displaystyle g(x)-b\geq 0}$, which, after introducing a slack variable, becomes:

${\displaystyle g(x)-b-s=0}$ , along with ${\displaystyle s\geq 0}$

Thus the equations '${\displaystyle g(x)-b-s=0}$' and '${\displaystyle h(x)=0}$' together form '${\displaystyle c(x)=0}$' as shown in Equation set 1.
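The conversion can be sketched in a few lines of Python (the constraint function and bound below are illustrative choices, not from the text):

```python
def to_equality(g, b):
    """Rewrite the inequality g(x) >= b as the equality
    c(x, s) = g(x) - b - s = 0 together with the bound s >= 0."""
    def c(x, s):
        return g(x) - b - s
    def slack(x):
        return g(x) - b   # the s that makes c zero; feasible iff s >= 0
    return c, slack

# Illustrative constraint: x1 * x2 >= 5
c, slack = to_equality(lambda x: x[0] * x[1], 5.0)
s = slack([2.0, 3.0])      # g = 6 >= 5, so s = 1 and the point is feasible
assert s >= 0 and c([2.0, 3.0], s) == 0.0
```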

### Barrier function

Once the given problem is converted to the standard form (as shown in Equation set 1), we now remove the inequality constraint ${\displaystyle x\geqslant 0}$ using a barrier term. We introduce a natural-log barrier term for the inequality constraints, as shown in Equation set 3:

${\displaystyle \min f(x)-\mu \textstyle \sum _{i}\displaystyle ln(x_{i})}$

${\displaystyle s.t.c(x)=0}$

The term ${\displaystyle \displaystyle ln(x_{i})}$ is not defined for ${\displaystyle x_{i}\leqslant 0}$. Our goal is to solve this barrier problem for a given ${\displaystyle \mu }$, then decrease ${\displaystyle \mu }$ and re-solve the problem. If ${\displaystyle \mu }$ is decreased at the right rate and the approximate solutions are good enough, this method converges to an optimal solution of the nonlinear optimization problem. The search is thereby confined to the interior of the feasible region.

#### Example of a barrier function

The following example shows how a barrier term can be incorporated into the objective function:

${\displaystyle min(x-4)^{2}-\mu \textstyle \displaystyle ln(x)}$

For different values of ${\displaystyle \mu }$, the objective function changes; as ${\displaystyle \mu }$ is decreased from 10 to 1 (as shown in Figure 1), the minimiser approaches the optimal solution ${\displaystyle x=4}$.

Figure 1: Variation of given objective function for different values of 'μ'
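The approach of the minimiser towards ${\displaystyle x=4}$ can be checked in closed form: setting the derivative of ${\displaystyle (x-4)^{2}-\mu \,ln(x)}$ to zero gives ${\displaystyle 2x^{2}-8x-\mu =0}$, whose positive root is ${\displaystyle x^{*}(\mu )=2+{\sqrt {4+\mu /2}}}$. A short Python check of this toy objective:

```python
import math

def barrier_min(mu):
    """Minimiser of (x - 4)**2 - mu*ln(x) over x > 0,
    i.e. the positive root of 2x^2 - 8x - mu = 0."""
    return 2 + math.sqrt(4 + mu / 2)

# As mu decreases from 10 to 1 the minimiser moves towards x = 4 (cf. Figure 1)
assert barrier_min(10) == 5.0               # 2 + sqrt(9)
assert abs(barrier_min(1) - 4.1213) < 1e-4  # 2 + sqrt(4.5)
assert abs(barrier_min(1e-8) - 4) < 1e-8    # recovers the true optimum as mu -> 0
```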

With these basics in place, the following methodology, adapted from [7], can be used to solve the nonlinear optimization problem.

### KKT Condition for Barrier Problem

The KKT conditions for the barrier problem

${\displaystyle \min f(x)-\mu \textstyle \sum _{i=1}^{n}\displaystyle ln(x_{i})}$

${\displaystyle s.t.c(x)=0}$

are given by:

${\displaystyle \nabla f(x)+\nabla c(x)\lambda -\mu X^{-1}e=0}$

${\displaystyle c(x)=0}$

where ${\displaystyle X=diag(x_{1},...,x_{n})}$ and ${\displaystyle e}$ is a vector of ones, so that the ${\displaystyle i}$-th component of ${\displaystyle \mu X^{-1}e}$ is ${\displaystyle {\tfrac {\mu }{x_{i}}}}$.

We can now define ${\displaystyle z_{i}={\tfrac {\mu }{x_{i}}}}$ and solve the modified version of the KKT conditions:

${\displaystyle \nabla f(x)+\nabla c(x)\lambda -z=0}$

${\displaystyle c(x)=0}$

${\displaystyle XZe-\mu e=0}$ .

The vector ${\displaystyle e}$ is simply a column vector of ones, i.e. ${\displaystyle e={\begin{bmatrix}1\\1\\1\\1\end{bmatrix}}}$, so that the matrix operations are dimensionally consistent.

### KKT solution with Newton-Raphson method

${\displaystyle \nabla f(x)+\nabla c(x)\lambda -z=0}$

${\displaystyle c(x)=0}$

${\displaystyle XZe-\mu e=0}$

${\displaystyle \Rightarrow }$ ${\displaystyle {\begin{bmatrix}W_{k}&\nabla c(x_{k})&-I\\\nabla c(x_{k})^{T}&0&0\\Z_{k}&0&X_{k}\end{bmatrix}}}$${\displaystyle {\begin{pmatrix}d_{k}^{x}\\d_{k}^{\lambda }\\d_{k}^{z}\end{pmatrix}}=}$${\displaystyle -{\begin{pmatrix}\nabla f(x_{k})+\nabla c(x_{k})\lambda _{k}-z_{k}\\c(x_{k})\\X_{k}Z_{k}e-\mu _{j}e\end{pmatrix}}}$

Here, ${\displaystyle d_{k}^{x}}$ , ${\displaystyle d_{k}^{\lambda }}$ and ${\displaystyle d_{k}^{z}}$ are the search directions for ${\displaystyle x,\lambda ,z}$ respectively, and ${\displaystyle W_{k}}$ is the Hessian of the Lagrangian.

Thus, ${\displaystyle W_{k}={\nabla _{xx}}^{2}L(x_{k},\lambda _{k},z_{k})={\nabla _{xx}}^{2}(f(x_{k})+c(x_{k})^{T}\lambda _{k}-z_{k})}$ and ${\displaystyle Z_{k}={\begin{bmatrix}z_{1}&0&0\\0&\centerdot \centerdot \centerdot &0\\0&0&z_{n}\end{bmatrix}}}$ , ${\displaystyle X_{k}={\begin{bmatrix}x_{1}&0&0\\0&\centerdot \centerdot \centerdot &0\\0&0&x_{n}\end{bmatrix}}}$

The above system can be rearranged into a symmetric linear system by eliminating ${\displaystyle d_{k}^{z}}$ from the third block row:

${\displaystyle {\begin{bmatrix}W_{k}+\Sigma _{k}&\nabla c(x_{k})\\\nabla c(x_{k})^{T}&0\end{bmatrix}}{\begin{pmatrix}d_{k}^{x}\\d_{k}^{\lambda }\end{pmatrix}}}$ ${\displaystyle =-{\begin{pmatrix}\nabla f(x_{k})+\nabla c(x_{k})\lambda _{k}-\mu _{j}X_{k}^{-1}e\\c(x_{k})\end{pmatrix}}}$ where ${\displaystyle \Sigma _{k}=X_{k}^{-1}Z_{k}}$ (Equation *)

We can then solve for ${\displaystyle d_{k}^{z}}$ after finding ${\displaystyle d_{k}^{x}}$ and ${\displaystyle d_{k}^{\lambda }}$ from the linear system, using the explicit expression: ${\displaystyle d_{k}^{z}=\mu _{j}X_{k}^{-1}e-z_{k}-\Sigma _{k}d_{k}^{x}}$ (Equation **)

### Step size ${\displaystyle (\alpha )}$

After the search directions have been found, we need an appropriate step size. The aim is to choose a step size that decreases the objective function while also reducing the constraint violations. The following two methods are generally used to determine ${\displaystyle \alpha }$:

• Decreasing the merit function: the merit function is defined as the sum of the objective function and the absolute values of the constraint violations multiplied by a scalar penalty, i.e. ${\displaystyle merit=f(x)+v\sum \left\vert c(x)\right\vert }$; a trial step is accepted only if it decreases this merit.
• Filter methods use a combination of objective function ${\displaystyle f(x)}$ and constraint violations ${\displaystyle \left\vert c(x)\right\vert }$ to determine whether to accept the step size ${\displaystyle (\alpha )}$ or not.

Once the step size has been determined, we can compute ${\displaystyle x_{k+1},z_{k+1},\lambda _{k+1}}$:

${\displaystyle x_{k+1}=x_{k}+\alpha d_{k}^{x}}$

${\displaystyle z_{k+1}=z_{k}+\alpha d_{k}^{z}}$

${\displaystyle \lambda _{k+1}=\lambda _{k}+\alpha d_{k}^{\lambda }}$
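The merit-function option can be sketched as a simple backtracking loop. The quadratic `merit` below is a toy stand-in for ${\displaystyle f(x)+v\sum \left\vert c(x)\right\vert }$, not the filter line search of [7]:

```python
def backtracking(merit, x, d, alpha=1.0, beta=0.5, max_halvings=30):
    """Shrink alpha geometrically until the merit function decreases."""
    m0 = merit(x)
    for _ in range(max_halvings):
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        if merit(trial) < m0:
            return alpha
        alpha *= beta
    return alpha

# Toy merit with minimum at (1, 2); the full step overshoots, so alpha is cut back
merit = lambda x: (x[0] - 1)**2 + (x[1] - 2)**2
alpha = backtracking(merit, x=[0.0, 0.0], d=[4.0, 8.0])
assert alpha == 0.25
```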

### Convergence Criteria

The algorithm is considered converged when the KKT conditions are satisfied within a tolerance:

${\displaystyle \max \left\vert \nabla f(x)+\nabla c(x)\lambda -z\right\vert \leqslant \epsilon _{tol}}$

${\displaystyle \max \left\vert c(x)\right\vert \leq \epsilon _{tol}}$

${\displaystyle \max \left\vert XZe-\mu e\right\vert \leq \epsilon _{tol}}$

### Pseudocode

The algorithm described above can be summarised in the following pseudocode, as illustrated by [8]:

• Choose a feasible initial guess ${\displaystyle x_{0}}$ and initialise ${\displaystyle \lambda _{0}}$ and ${\displaystyle z_{0}}$ by solving the equations described above.
• Check for convergence: if the KKT residuals are within the tolerance limit, the optimal solution has been reached; otherwise move to the next step.
• Compute the search directions using Equation (*) and Equation (**) above.
• Perform a backtracking line search by progressively decreasing ${\displaystyle \alpha }$ using the merit function or a filter method.
• Update the iterates, check for convergence again, and repeat the steps until the solution is within the tolerance limit.
Figure 2: Pseudocode for Interior point method for NLP
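A minimal end-to-end sketch of this pseudocode, applied to the toy problem ${\displaystyle \min (x-4)^{2}}$ s.t. ${\displaystyle x\geq 0}$ (one variable, no equality constraints, so ${\displaystyle \lambda }$ drops out and the reduced system is scalar). The ${\displaystyle \mu }$-reduction factor 0.2 and fraction-to-the-boundary factor 0.995 are illustrative choices; a merit-function line search is unnecessary for this convex toy, so only the positivity-preserving damping is kept:

```python
def ip_toy(x=1.0, mu=10.0):
    """Interior-point loop for min (x-4)**2 s.t. x >= 0: inner Newton
    iterations on the perturbed KKT system (f'(x) - z = 0, x*z = mu),
    outer loop decreasing the barrier parameter mu."""
    grad = lambda x: 2 * (x - 4)     # f'(x)
    hess = lambda x: 2.0             # f''(x)
    z = mu / x                       # start with x*z = mu
    while mu > 1e-9:
        for _ in range(100):         # inner Newton loop for fixed mu
            r1, r2 = grad(x) - z, x * z - mu       # KKT residuals
            if max(abs(r1), abs(r2)) <= 0.1 * mu:
                break
            sigma = z / x                          # Sigma = X^{-1} Z (scalar)
            dx = -(grad(x) - mu / x) / (hess(x) + sigma)   # reduced system
            dz = mu / x - z - sigma * dx                   # recovery of d^z
            # fraction-to-the-boundary rule keeps x and z strictly positive
            alpha = 1.0
            if dx < 0:
                alpha = min(alpha, -0.995 * x / dx)
            if dz < 0:
                alpha = min(alpha, -0.995 * z / dz)
            x, z = x + alpha * dx, z + alpha * dz
        mu *= 0.2                    # outer loop: shrink the barrier parameter
    return x, z

x, z = ip_toy()
assert abs(x - 4) < 1e-4 and abs(z) < 1e-4   # converges to the optimum x = 4
```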

### Barrier method vs Primal-dual method

In order to understand the concept of the central path and how the barrier method differs from the primal-dual method, the following methodology, adapted from [9], can be used:

Consider the simple case of an LP with the following form:

Primal problem:

${\displaystyle minc^{T}x}$

${\displaystyle s.t.Ax\leqslant b}$

Dual problem:

${\displaystyle max-b^{T}z}$

${\displaystyle s.t.A^{T}z+c=0}$

${\displaystyle z\geqslant 0}$

Assuming the primal and dual problems are strictly feasible and ${\displaystyle rank(A)=n}$:

Central path: the set of points ${\displaystyle \{x^{*}(t)\,|\,t>0\}}$ with

${\displaystyle x^{*}(t)=argmin(tc^{T}x+\phi (x))=argmin(tc^{T}x-\sum _{i=1}^{m}log(b_{i}-a_{i}^{T}x))}$

${\displaystyle x^{*}(t)}$ is unique for all ${\displaystyle t>0}$

The point ${\displaystyle x^{*}(t)}$ is strictly primal feasible and satisfies:

${\displaystyle c+\sum _{i=1}^{m}z_{i}^{*}(t)a_{i}=0}$ with ${\displaystyle z_{i}^{*}(t)=\displaystyle {\tfrac {1}{t(b_{i}-a_{i}^{T}x^{*}(t))}}}$

${\displaystyle z_{i}^{*}(t)}$ is also strictly dual feasible: ${\displaystyle A^{T}z^{*}(t)+c=0}$ and ${\displaystyle z^{*}(t)>0}$

Duality gap between ${\displaystyle z^{*}(t)}$ and ${\displaystyle x^{*}(t)}$ can be calculated as:

${\displaystyle c^{T}x+b^{T}z=(b-Ax)^{T}z={\tfrac {m}{t}}}$

The bound on the sub-optimality of ${\displaystyle x^{*}(t)}$ :

${\displaystyle c^{T}x^{*}(t)-p^{*}\leqslant {\tfrac {m}{t}}}$ , where ${\displaystyle p^{*}}$ is the optimal value of the LP.
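These central-path relations can be verified on a toy one-dimensional LP (an illustrative instance, not from the text): ${\displaystyle \min x}$ subject to ${\displaystyle 0\leq x\leq 1}$, i.e. ${\displaystyle c=1}$, ${\displaystyle A=(-1,1)^{T}}$, ${\displaystyle b=(0,1)^{T}}$, so ${\displaystyle n=1}$ and ${\displaystyle m=2}$:

```python
import math

# Toy LP: min x s.t. 0 <= x <= 1, written as Ax <= b with
# a_1 = -1, b_1 = 0 and a_2 = 1, b_2 = 1 (n = 1, m = 2, c = 1)
t = 10.0
# x*(t) minimises t*x - log(x) - log(1 - x); the first-order condition
# reduces to t*x**2 - (t + 2)*x + 1 = 0, with the root in (0, 1):
x = ((t + 2) - math.sqrt((t + 2)**2 - 4 * t)) / (2 * t)
z = [1 / (t * (0 - (-1) * x)),        # z_i*(t) = 1 / (t*(b_i - a_i*x))
     1 / (t * (1 - x))]

assert abs(-z[0] + z[1] + 1) < 1e-9   # dual feasibility: A^T z + c = 0
gap = x * z[0] + (1 - x) * z[1]       # (b - A x)^T z
assert abs(gap - 2 / t) < 1e-9        # duality gap equals m / t
assert x <= 2 / t                     # sub-optimality bound (p* = 0 here)
```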

The following overview of the barrier vs primal-dual methods is given by [10]:

• Both methods are motivated in terms of perturbed KKT conditions.
• Primal-dual interior-point methods take one Newton step and then move on, i.e. there are no separate inner and outer loops.
• Primal dual interior point iterates are not necessarily feasible.
• Primal-dual interior-point methods are often more efficient, as they can exhibit better than linear convergence.

## Numerical Example

Consider the following non-linear optimisation problem:

${\displaystyle \min x_{2}(5+x_{1})}$

${\displaystyle s.t.x_{1}x_{2}\geq 5}$

${\displaystyle x_{1}^{2}+{x_{2}}^{2}\leq 20}$

This problem needs to be converted into standard form first:

${\displaystyle \min x_{2}(5+x_{1})}$

${\displaystyle s.t.x_{1}x_{2}-5-x_{3}=0\Rightarrow c_{1}(x)}$

${\displaystyle 20-x_{1}^{2}-{x_{2}}^{2}-x_{4}=0\Rightarrow c_{2}(x)}$

${\displaystyle x_{3}\geq 0}$

${\displaystyle x_{4}\geq 0}$

The initial guess values can be selected as ${\displaystyle x^{o}=}$ ${\displaystyle {\begin{bmatrix}x_{1}\\x_{2}\\x_{3}\\x_{4}\end{bmatrix}}={\begin{bmatrix}0\\0\\0\\0\end{bmatrix}}}$

We will also choose the initial guess values of ${\displaystyle \lambda _{1},\lambda _{2},z_{1},z_{2}}$ as ${\displaystyle 0}$. (Strictly, interior-point iterates must remain strictly positive; the zeros here are used purely to keep the illustration simple.)

${\displaystyle c(x^{o})=}$ ${\displaystyle {\begin{bmatrix}-5\\20\end{bmatrix}}}$

${\displaystyle \nabla _{x}c={\begin{bmatrix}x_{2}&-2x_{1}\\x_{1}&-2x_{2}\\-1&0\\0&-1\end{bmatrix}}={\begin{bmatrix}0&0\\0&0\\-1&0\\0&-1\end{bmatrix}}}$, The first column represents ${\displaystyle \nabla _{x}c_{1}}$ and second column represents ${\displaystyle \nabla _{x}c_{2}}$ .

Now we will calculate the values of ${\displaystyle \nabla _{xx}c_{1}}$ and ${\displaystyle \nabla _{xx}c_{2}}$ at the initial guess values.

${\displaystyle \nabla _{xx}c_{1}={\begin{bmatrix}0&1&0&0\\1&0&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix}}}$ , ${\displaystyle \nabla _{xx}c_{2}={\begin{bmatrix}-2&0&0&0\\0&-2&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix}}}$

Now, ${\displaystyle \nabla f(x)={\begin{bmatrix}x_{2}\\5+x_{1}\\0\\0\end{bmatrix}}={\begin{bmatrix}0\\5\\0\\0\end{bmatrix}}}$. Similarly, ${\displaystyle \nabla _{xx}f={\begin{bmatrix}0&1&0&0\\1&0&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix}}}$

In the above equation Matrix ${\displaystyle 'Z'}$ and ${\displaystyle 'X'}$ are basically diagonal matrices with the diagonal elements as follows:

${\displaystyle Z={\begin{bmatrix}z_{1}&0&0&0\\0&z_{2}&0&0\\0&0&z_{3}&0\\0&0&0&z_{4}\end{bmatrix}}}$ and ${\displaystyle X={\begin{bmatrix}x_{1}&0&0&0\\0&x_{2}&0&0\\0&0&x_{3}&0\\0&0&0&x_{4}\end{bmatrix}}}$

As defined earlier, the matrix ${\displaystyle 'e'}$ is simply a column vector of ones i.e. ${\displaystyle e={\begin{bmatrix}1\\1\\1\\1\end{bmatrix}}}$

Now since ${\displaystyle W_{k}={\nabla _{xx}}^{2}L(x_{k},\lambda _{k},z_{k})={\nabla _{xx}}^{2}(f(x_{k})+c(x_{k})^{T}\lambda _{k}-z_{k})}$ and we have chosen the initial guess values of ${\displaystyle \lambda _{1},\lambda _{2},z_{1},z_{2}}$ as ${\displaystyle 0}$, we get ${\displaystyle W_{k}=\nabla _{xx}f}$.

Now we have everything to substitute in the following equation and calculate the search directions and iterate till we fulfil the convergence criteria:

${\displaystyle {\begin{bmatrix}W_{k}&\nabla c(x_{k})&-I\\\nabla c(x_{k})^{T}&0&0\\Z_{k}&0&X_{k}\end{bmatrix}}}$${\displaystyle {\begin{pmatrix}d_{k}^{x}\\d_{k}^{\lambda }\\d_{k}^{z}\end{pmatrix}}=}$${\displaystyle -{\begin{pmatrix}\nabla f(x_{k})+\nabla c(x_{k})\lambda _{k}-z_{k}\\c(x_{k})\\X_{k}Z_{k}e-\mu _{j}e\end{pmatrix}}}$

We can now run the BPOPT solver in MATLAB to obtain the following results, as illustrated by [11]:

Figure 3: Variation of x1 and x2 as iterations progress [11]

From Figure 3 we see that the solver converged in 13 iterations to the optimal values ${\displaystyle x_{1}=4.3198}$ and ${\displaystyle x_{2}=1.1575}$.
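The reported optimum can be sanity-checked without a solver. Assuming both constraints are active at the solution (as the solver output suggests), ${\displaystyle x_{1}x_{2}=5}$ and ${\displaystyle x_{1}^{2}+x_{2}^{2}=20}$; substituting ${\displaystyle x_{2}=5/x_{1}}$ gives ${\displaystyle t^{2}-20t+25=0}$ with ${\displaystyle t=x_{1}^{2}}$:

```python
import math

# With t = x1**2, the active constraints give t**2 - 20*t + 25 = 0.
# The larger root minimises the objective, since f = x2*(5 + x1) = 25/x1 + 5
# is decreasing in x1 along x1*x2 = 5.
t = (20 + math.sqrt(400 - 100)) / 2   # t = 10 + 5*sqrt(3)
x1 = math.sqrt(t)
x2 = 5 / x1

assert abs(x1 * x2 - 5) < 1e-9 and abs(x1**2 + x2**2 - 20) < 1e-9
assert abs(x1 - 4.3198) < 1e-4 and abs(x2 - 1.1575) < 1e-4   # matches Figure 3
```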

Figure 4: Variation of 'μ' as iterations progress [11]

Figure 4 clearly shows how the value of ${\displaystyle \mu }$ decreases as the iterations progress towards the solution.

## Applications

IP methods have proven to be a computationally viable alternative for the solution of several power engineering optimisation models, and a variety of them have been applied to power systems problems. 'Optimal power flow (OPF)' seeks the operating point of a power system that maximizes a given objective function while satisfying operational and physical constraints. Given real-time operating conditions and the large scale of the power system, it is demanding to develop algorithms that allow OPF to be decomposed and efficiently solved on parallel computing systems [12]. Compared with gradient-based algorithms, primal-dual interior-point methods have faster convergence rates.

A hydro-thermal coordination program deals with the problem of finding the schedule and the production of every hydro and thermal plant in a system so that customer demand is supplied at minimum cost, with a certain level of security, while all constraints related to the thermal and hydro subsystems are satisfied. A medium-term hydro-thermal coordination (MTHTC) problem results in a very large nonlinear mixed-integer programming problem [13]. IP optimisation codes are used to solve these MTHTC problems.

IP methods are also used in Voltage collapse and reliability evaluation. Power flow unsolvability occurs when for a given set of active and reactive power loads the power flow equations have no real solution. This kind of problem may happen for instance when a heavily stressed system is subject to a severe contingency situation. The set of control actions in the IP algorithm includes rescheduling of active power of generators, adjustments on terminal voltage of generators, tap changes on LTC transformers, and as a last resort, minimum load shedding [14].

Fuel planning and multi-reservoir management are other areas where interior-point methods have proven useful. A long-term fuel planning problem aims to minimize the sum of fuel purchase and yearly storage costs, subject to meeting the generation requirements, the constraints associated with fuel supply contracts, and the constraints associated with storage of fuel [15].

## Conclusion

Interior-point methods remain an active and fruitful area of research, although the frenetic pace that once characterized the area has slowed in recent years. Linear programming codes have become mainstream and continue to undergo development, although they face stiff competition from the simplex method. Applications to quadratic programming show considerable promise because of the superior ability of the interior-point approach to exploit problem structure efficiently. The influence on nonlinear programming theory and practice is yet to be fully determined, even though substantial research has already been done on this topic [16]. An extensive evaluation of line search options was conducted by [17], which concluded that the filter approach was the most robust. Many researchers are hopeful about the use of the interior-point approach in decomposition methods, though no rigorous comparative studies with alternative approaches have been performed [18]. Several authors have also worked on a special subclass of nonsmooth optimization problems, for which a new interior-point method has been proposed. These methods extend a standard interior-point method for nonlinear and nonconvex problems with additional line-search and step-length computation features and a problem-tailored stationarity test, using a special type of generalized gradient at points where the nonsmooth problem fails to be differentiable [19].

## References

1. Mevludin Glavic, Louis Wehenkel. Interior Point Methods: A Survey, Short Survey of Applications to Power Systems, and Research Opportunities. (2004)
2. Martin Schmidt. An Interior Point Method for Nonlinear Optimization Problems with Locatable and Separable Nonsmoothness.
3. Narendra Karmarkar. “A New Polynomial-Time Algorithm for Linear Programming.” In: Combinatorica 4.4 (1984), pp. 373–395. issn: 0209-9683. doi:10.1007/BF02579150.
4. Leonid Genrikhovich Khachiyan. “A polynomial algorithm in linear programming.” In: Soviet Mathematics Doklady 20 (1979), pp. 191–194.
5. Martin Schmidt. An Interior Point Method for Nonlinear Optimization Problems with Locatable and Separable Nonsmoothness.
6. Bomze, Immanuel M., et al.Nonlinear Optimization, Springer-Verlag Berlin Heidelberg, 2010.
7. A. Wächter and L. T. Biegler, On the Implementation of an Interior-Point Filter Line-Search Algorithm for Large-Scale Nonlinear Programming, Mathematical Programming 106(1), pp. 25-57, 2006.
8. Interior point method for Optimization <https://www.youtube.com/watch?v=zm4mfr-QT1E>
9. L.Vandenberghe."Lecture 13 The central path". <http://www.seas.ucla.edu/~vandenbe/ee236a/lectures/cpath.pdf>
10. Ryan Tibshirani. "Primal-Dual Interior-Point Methods". <https://www.stat.cmu.edu/~ryantibs/convexopt/lectures/primal-dual.pdf>
11. Design Optimisation-Interior point methods, <https://apmonitor.com/me575/index.php/Main/InteriorPointMethod>
12. R. A. Jabr, A. H. Coonick, B. J. Cory: “A Primal-Dual Interior Point Method for Optimal Power Flow Dispatching”, IEEE Transactions on Power Systems, Vol. 17, No. 3, pp. 654-662, 2002.
13. J. Medina, V. H. Quintana, A. J. Conejo, F. P. Thoden: “A Comparison of Interior-Point Codes for Medium-Term Hydro-Thermal Coordination”, IEEE PES Summer Meeting, Paper 0-7803-1/97, 1997.
14. S. Granville, J. C. O. Mello, A. C. G. Melo: “Application of interior point methods to power flow unsolvability”, IEEE Transactions on Power Systems, Vol. 11, pp. 1096-1103, 1996.
15. V. R. Sherkat, Y. Ikura: “Experience with Interior Point Optimisation Software for a Fuel Planning Application” IEEE Transactions on Power Systems,Vol. 9, No. 2, 1994.
16. Stephen J. Wright. "Recent developments in Interior point methods."
17. A. Wächter and L. T. Biegler, On the Implementation of an Interior-Point Filter Line-Search Algorithm for Large-Scale Nonlinear Programming, Mathematical Programming 106(1), pp. 25-57, 2006.
18. Stephen J. Wright. "Recent developments in Interior point methods."
19. Martin Schmidt. An Interior Point Method for Nonlinear Optimization Problems with Locatable and Separable Nonsmoothness.