Interior-point method for LP: Difference between revisions

From Cornell University Computational Optimization Open Textbook - Optimization Wiki
Jump to navigation Jump to search
No edit summary
No edit summary
Line 36: Line 36:
Consider a combination of primal-dual problem below:<br>
Consider a combination of primal-dual problem below:<br>
('''Primal Problem formulation''') <br>
('''Primal Problem formulation''') <br>
→ minimize <math> c^{T}x </math>  Subject to: <math> Ax = b </math>  and  <math> x \geq 0 </math> .......................................(1)<br>
→ minimize <math> c^{T}x </math>  Subject to: <math> Ax = b </math>  and  <math> x \geq 0 </math> ....................................................................................................(1)<br>
('''Dual Problem formulation''') <br>
('''Dual Problem formulation''') <br>
→ maximize <math> b^{T}y </math>  Subject to: <math> A^{T}y + \lambda  = c </math> and </math> \lambda \geq 0 </math> ...................(2)<br>
→ maximize <math> b^{T}y </math>  Subject to: <math> A^{T}y + \lambda  = c </math> and <nowiki></math></nowiki> \lambda \geq 0 <nowiki></math></nowiki> .................................................(2)<br>
'λ' vector introduced represents the slack variables.<br>
'λ' vector introduced represents the slack variables.<br>
Now we use the "Logarithmic Barrier" function and form 2 Lagrangian equations for primal and dual forms mentioned above:<br>
Now we use the "Logarithmic Barrier" function and form 2 Lagrangian equations for primal and dual forms mentioned above:<br>
Lagrange function for Primal : <math> L_{p}(x,y) = c^{T}\cdot x + \mu \cdot \sum_{j=1}^{n}log(x_{j}) - y^{T}\cdot (Ax - b) </math> ......................................(3)<br>
Lagrange function for Primal : <math> L_{p}(x,y) = c^{T}\cdot x + \mu \cdot \sum_{j=1}^{n}log(x_{j}) - y^{T}\cdot (Ax - b) </math> ........................................(3)<br>
Lagrange function for Dual  : <math> L_{d}(x,y,\lambda ) = b^{T}\cdot y + \mu \cdot \sum_{j=1}^{n}log(\lambda _{j}) - x^{T}\cdot (A^{T}y -\lambda - c) </math> .........(4)<br>
Lagrange function for Dual  : <math> L_{d}(x,y,\lambda ) = b^{T}\cdot y + \mu \cdot \sum_{j=1}^{n}log(\lambda _{j}) - x^{T}\cdot (A^{T}y -\lambda - c) </math> ..........................(4)<br>


Taking the partial derivatives of L<sub>p</sub> and L<sub>d</sub> with respect to variables 'x', 'λ' , 'y', and forcing these terms to zero, we get the following equations: <br>
Taking the partial derivatives of L<sub>p</sub> and L<sub>d</sub> with respect to variables 'x', 'λ' , 'y', and forcing these terms to zero, we get the following equations: <br>
<math> Ax = b </math>  and  <math> x \geq 0 </math> ..........................................(5)<br>
<math> Ax = b </math>  and  <math> x \geq 0 </math> .................................................................................................................................................(5)<br>
<math> A^{T}y + \lambda  = c </math> and <math> \lambda \geq 0 </math> .......................(6)<br>
<math> A^{T}y + \lambda  = c </math> and <math> \lambda \geq 0 </math> .........................................................................................................................................(6)<br>
<math> x_{j}\cdot \lambda _{j} = \mu </math> for ''j''= 1,2,.......''n'' .....................(7)<br>
<math> x_{j}\cdot \lambda _{j} = \mu </math> for ''j''= 1,2,.......''n'' ......................................................................................................................................(7)<br>


where, μ is strictly positive scaler parameter. For each μ > 0, the vectors in the set {x(μ), y(μ), λ(μ)} satisfying equations (5), (6) and (7), can we viewed as set of points in R<sup>n</sup>, R<sup>p</sup>, R<sup>n</sup> such that when <big>'μ'</big> varies, the corresponding points form a set of trajectories called ''"Central Path"''. The central path lies in the "Interior" of the feasible regions. There is a sample illustration of ''"Central Path"'' method in figure to right. We start with a positive value of <big>'μ'</big> and as <big>'μ'</big> goes to 0, we approach the optimal point <br>
where, μ is strictly positive scaler parameter. For each μ > 0, the vectors in the set {x(μ), y(μ), λ(μ)} satisfying equations (5), (6) and (7), can we viewed as set of points in R<sup>n</sup>, R<sup>p</sup>, R<sup>n</sup> such that when <big>'μ'</big> varies, the corresponding points form a set of trajectories called ''"Central Path"''. The central path lies in the "Interior" of the feasible regions. There is a sample illustration of ''"Central Path"'' method in figure to right. We start with a positive value of <big>'μ'</big> and as <big>'μ'</big> goes to 0, we approach the optimal point <br>
Line 60: Line 60:
=== Iterations using Newton's Method ===
=== Iterations using Newton's Method ===
Now we employ the Newton's iterative method to solve the following equations: <br>
Now we employ the Newton's iterative method to solve the following equations: <br>
<math> Ax - b = 0 </math>  ...................................................(8) <br>
<math> Ax - b = 0 </math>  ............................................................................................................................................................(8) <br>
<math> A^{T}y + \lambda  = c </math>  ........................................(9) <br>
<math> A^{T}y + \lambda  = c </math>  .........................................................................................................................................................(9) <br>
<math> X\cdot \lambda \cdot e  - \mu \cdot e  = 0</math> ..................... (10) <br>
<math> X\cdot \lambda \cdot e  - \mu \cdot e  = 0</math> ............................................................................................................................................ (10) <br>
Suppose we start with definition of starting point that lies within feasible region as (x<sup>0</sup>, y<sup>0</sup>, λ <sup>0</sup>) such that x<sup>0</sup>>0 and λ <sup>0</sup> >0. Also let us define 2 residual vectors for both the primal and dual equations: <br>
Suppose we start with definition of starting point that lies within feasible region as (x<sup>0</sup>, y<sup>0</sup>, λ <sup>0</sup>) such that x<sup>0</sup>>0 and λ <sup>0</sup> >0. Also let us define 2 residual vectors for both the primal and dual equations: <br>
<math> \delta _{p} = b - A\cdot x^{0} </math> .................................(11) <br>
<math> \delta _{p} = b - A\cdot x^{0} </math> .....................................................................................................................................................(11) <br>
<math> \delta _{d} = c - A^{0}\cdot y^{0} - \lambda ^{0} </math> ...............(12) <br>
<math> \delta _{d} = c - A^{0}\cdot y^{0} - \lambda ^{0} </math> ..........................................................................................................................................(12) <br>


Applying Newton's Method to solve equations (8) - (12) we get: <br>
Applying Newton's Method to solve equations (8) - (12) we get: <br>
Line 83: Line 83:
</math><br>
</math><br>
So a single iteration of Newton's method involves the following equations. For each iteration, we solve for the next value of x<sup>k+1</sup>, y<sup>k+1</sup> and λ<sup>k+1</sup>: <br>
So a single iteration of Newton's method involves the following equations. For each iteration, we solve for the next value of x<sup>k+1</sup>, y<sup>k+1</sup> and λ<sup>k+1</sup>: <br>
<math> (A\lambda ^{-1}XA^{T})\delta _{y} = b- \mu A\lambda^{-1} + A\lambda ^{-1}X\delta _{d} </math> ...................(13) <br>
<math> (A\lambda ^{-1}XA^{T})\delta _{y} = b- \mu A\lambda^{-1} + A\lambda ^{-1}X\delta _{d} </math> .....................................................................................................(13) <br>
<math> \delta _{\lambda} = \delta _{d}\cdot A^{T}\delta _{y} </math> ...................................................(14) <br>
<math> \delta _{\lambda} = \delta _{d}\cdot A^{T}\delta _{y} </math> ......................................................................................................................................................(14) <br>
<math> \delta _{x} = \lambda ^{-1}\left [ \mu \cdot e - X\lambda e -\lambda \delta _{z}\right ] </math> .................(15) <br>
<math> \delta _{x} = \lambda ^{-1}\left [ \mu \cdot e - X\lambda e -\lambda \delta _{z}\right ] </math> ............................................................................................................................(15) <br>
<math> \alpha _{p} = min\left \{ \frac{-x_{j}}{\delta _{x_{j}}} \right \} </math> for  <math> \delta x_{j} < 0 </math> ..... (16) <br>
<math> \alpha _{p} = min\left \{ \frac{-x_{j}}{\delta _{x_{j}}} \right \} </math> for  <math> \delta x_{j} < 0 </math> ........................................................................................................................ (16) <br>
<math> \alpha _{d} = min\left \{ \frac{-\lambda_{j}}{\delta _{\lambda_{j}}} \right \} </math> for  <math> \delta \lambda_{j} < 0 </math> ..... (17) <br><br>
<math> \alpha _{d} = min\left \{ \frac{-\lambda_{j}}{\delta _{\lambda_{j}}} \right \} </math> for  <math> \delta \lambda_{j} < 0 </math> ........................................................................................................................ (17) <br><br>


The value of the the following variables for next iteration (k+1) is determined by: <br>
The value of the the following variables for next iteration (+1) is determined by: <br>
<math> x^{k+1} = x^{k} + \alpha _{p}\cdot \delta _{x} </math> <br>
<math> x^{k+1} = x^{k} + \alpha _{p}\cdot \delta _{x} </math> <br>
<math> y^{k+1} = y^{k} + \alpha _{d}\cdot \delta _{y} </math> <br>
<math> y^{k+1} = y^{k} + \alpha _{d}\cdot \delta _{y} </math> <br>

Revision as of 00:45, 14 November 2020

Authors: Tomas Lopez Lauterio, Rohit Thakur and Sunil Shenoy
Steward: Dr. Fengqi You and Akshay Ajagekar


Introduction

Linear programming problems seeks to optimize linear functions given linear constraints. There are several applications of linear programming including inventory control, production scheduling, transportation optimization and efficient manufacturing processes. Simplex method has been a very popular method to solve these linear programming problems and has served these industries well for a long time. But over the past 40 years, there have been significant number of advances in different algorithms that can be used for solving these types of problems in more efficient ways, especially where the problems become very large scale in terms of variables and constraints. In early 1980s Karmarkar (1984) published a paper introducing interior point methods to solve linear-programming problems. A simple way to look at differences between simplex method and interior point method is that a simplex method moves along the edges of a polytope towards a vertex having a lower value of the cost function, whereas an interior point method begins its iterations inside the polytope and moves towards the lowest cost vertex without regard for edges. This approach reduces the number of iterations needed to reach that vertex, thereby reducing computational time needed to solve the problem.

Lagrange Function

Before getting too deep into description of Interior point method, there are a few concepts that are helpful to understand. First key concept to understand is related to Lagrange function. Lagrange function incorporates the constraints into a modified objective function in such a way that a constrained minimizer (x*) is connected to an unconstrained minimizer {x*, λ*} for the augmented objective function L(x,λ), where the augmentation is achieved with 'p' Lagrange multipliers.
To illustrate this point, if we consider a simple an optimization problem: minimize f(x) subject to: A·x = b, where A ε Rpxn is assumed to have a full row rank Lagrange function can be laid out as:


where, 'λ' introduced in this equation is called Lagrange Multiplier.

Newton's Method

Another key concept to understand is regarding solving linear and non-linear equations using Newton's methods. Assume you have an unconstrained minimization problem in the form:
minimize g(x) , where g(x) is a real valued function with n variables.
A local minimum for this problem will satisfy the following system of equations:

The Newton's iteration looks like:


Theory and algorithm

Visualization of Central Path method in Interior point

We first start forming a primal-dual pair of linear programs and use the "Lagrangian function" and "Barrier function" methods to convert the constrained problems into unconstrained problems. These unconstrained problems are then solved using Newton's method as shown above.

Problem Formulation

Consider a combination of primal-dual problem below:
(Primal Problem formulation)
→ minimize Subject to: and ....................................................................................................(1)
(Dual Problem formulation)
→ maximize Subject to: and </math> \lambda \geq 0 </math> .................................................(2)
'λ' vector introduced represents the slack variables.
Now we use the "Logarithmic Barrier" function and form 2 Lagrangian equations for primal and dual forms mentioned above:
Lagrange function for Primal : ........................................(3)
Lagrange function for Dual  : ..........................(4)

Taking the partial derivatives of Lp and Ld with respect to variables 'x', 'λ' , 'y', and forcing these terms to zero, we get the following equations:
and .................................................................................................................................................(5)
and .........................................................................................................................................(6)
for j= 1,2,.......n ......................................................................................................................................(7)

where, μ is strictly positive scaler parameter. For each μ > 0, the vectors in the set {x(μ), y(μ), λ(μ)} satisfying equations (5), (6) and (7), can we viewed as set of points in Rn, Rp, Rn such that when 'μ' varies, the corresponding points form a set of trajectories called "Central Path". The central path lies in the "Interior" of the feasible regions. There is a sample illustration of "Central Path" method in figure to right. We start with a positive value of 'μ' and as 'μ' goes to 0, we approach the optimal point

Let us define the following:
X = Diagonal ()
= Diagonal ( )
eT = (1 .....1) as vector of all 1's.
Using these newly defined terms, the equation (7) can be written as:

Iterations using Newton's Method

Now we employ the Newton's iterative method to solve the following equations:
............................................................................................................................................................(8)
.........................................................................................................................................................(9)
............................................................................................................................................ (10)
Suppose we start with definition of starting point that lies within feasible region as (x0, y0, λ 0) such that x0>0 and λ 0 >0. Also let us define 2 residual vectors for both the primal and dual equations:
.....................................................................................................................................................(11)
..........................................................................................................................................(12)

Applying Newton's Method to solve equations (8) - (12) we get:

So a single iteration of Newton's method involves the following equations. For each iteration, we solve for the next value of xk+1, yk+1 and λk+1:
.....................................................................................................(13)
......................................................................................................................................................(14)
............................................................................................................................(15)
for ........................................................................................................................ (16)
for ........................................................................................................................ (17)

The value of the the following variables for next iteration (+1) is determined by:



The quantities αp and αd are positive with 0 ≤ αp, αd ≤ 1.
After each iteration of Newton's method, we assess the duality gap that is given by the expression below and compare it against a small value ε

The value of ε can be chosen to be something small 10-6, which essentially is the permissible duality gap for the problem.



Numerical Example

Applications

Conclusion

References