Author: Satyasri Ventrapragada (sv353), John-Anthony Gonzales (jg2533), Salina Tekele (st2257)

Stewards: Nathan Preuss, Wei-Han Chen, Tianqi Xiao, Guoqing Hu

Introduction

The eight-step procedures are a simplified, multi-stage approach for determining optimal solutions in mathematical optimization. Dynamic programming, developed by Richard Bellman in the 1950s^[1], is used to solve for the maximization or minimization of the objective function by transforming the problem into smaller steps and enumerating all the different possible solutions and finding the optimal solution.

In the eight-step procedure, a problem can be broken down into subproblems to solve. Using the solutions from the subproblems in a recursive manner, the solution can be determined after all the solutions of the subproblems are calculated to find the best solution, which demonstrates the principle of optimality: Any optimal policy has the property that, whatever the current state and current decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the current decision.^[2] Such a standard framework is used so that dynamic programming store the values of the subproblems to avoid recomputing, and thus, reduce time to solve the problem.^[3]

Theory, Methodology, and/or Algorithmic Discussion

Methodology

To solve a problem using the 8-step procedure, one must use the following steps:

Step 1: Specify the stages of the problem
The stages of a dynamic programming problem can be defined as points where decisions are made. Specifying the stages also divides the problem into smaller pieces.

Step 2: Specify the states for each stage
The states of a problem are defined as the knowledge necessary to make a decision. There are multiple states for each stage. In general, the states consists of the information that is needed to solve the smaller problem within each stage.^[4]

Step 3: Specify the allowable actions for each state in each stage
This helps create for a decision that must be made at each stage.

Step 4: Describe the optimization function using an English-language description.

Step 5: Define the boundary conditions
This can help create a starting point to finding a solution to the problem.

Step 6: Define the recurrence relation
This is often denoted with a function, and shows the relationship between the value of a decision at a particular stage and the value of optimal decision made at the previous stages.

Step 7: Compute the optimal value from the bottom-up
This step can be done manually or by using programming. Note that for each state, an optimal decision made at the remaining stages of the problem is independent from the decisions of the previous states.

Step 8: Arrive at the optimal solution
This is the final step for solving a problem using the eight step procedure.

Numerical Example

Example 1: A Smaller Size Knapsack Problem

The Smiths are trying to replicate a delicious Christmas fruitcake recipe, and the recipe calls for berries with a total of up to 11 lbs of berries packed into the fruitcake. The Smiths purchased 4 boxes of strawberries and 4 boxes of blueberries during their Christmas grocery shopping trip last weekend but are unsure of how many boxes of strawberries and blueberries to include in their fruitcake. Each box of strawberries weighs 2 lbs and each box of blueberries weighs 3 lbs. To make the choice easier, they spent all day researching how different amounts of strawberries and blueberries impact the taste of the fruitcake and made a benefit table listing the value each box of berries adds to the overall taste of the fruitcake. The benefit table is shown below in Table 1. Please use Dynamic Programming to help the Smiths choose the optimal number of boxes of strawberries and blueberries to include in their delicious Christmas fruitcake this year!

Table 1. Availability and weights of the berries.

Type of berry	Availability (number of boxes)	Weight per box (lb)
Strawberries	4	2
Blueberries	4	3

Table 2. Benefit table for number of boxes of berries.

Number of boxes of berries in fruitcake	1	2	3	4
Strawberries	2	3	5	7
Blueberries	1	2	4	5

To solve the Smiths’ fruitcake problem, we will be using dynamic programming through the following 8 steps:

Step 1: What are the stages?

The different types of items are the stages. For this problem, there are 2 stages: a box of strawberries or a box of blueberries, and the decision made in that stage is to specify the number of boxes of each berry to include in the fruitcake.

Step 2: What are the states?

The states correspond to the remaining number of boxes of berries we can include in the fruitcake. For example, if we had already made a decision to include 1 box of strawberries and 1 box of blueberries in the previous states, the current state would be 1 since the recipe requires only 3 total boxes of berries.

Step 3: What actions may be taken at each state in each stage?

If we are in stage n with availability a[n] and remaining capacity s, we can pack j items of type n into the fruitcake:

$j=0,1,\dots ,\min\{a[n],\lfloor sw[n]\rfloor \}$

Step 4: What is the English-language description of the optimization function for each state s in stage n?

$f_{n}^{*}(s)$ is the value of the maximum benefit possible with items of type n (or greater) while fitting into the remaining capacity s.

Step 5: What are the boundary conditions?

The benefit is 0 when there are no more types of berries to use even though there is remaining capacity.

$f_{n+1}^{*}(s)=0$ , where N is the number of unique berries (N = 2) in this problem. Therefore, $f_{3}^{*}(s)=0$ since there is no third type of berry.

Step 6: What is the recurrence relation?

We will construct a benefit function corresponding to each remaining capacity s for each of the items n, . . . , N to use j number of boxes of type n berries in the fruitcake.

$f_{n}^{*}(s)=\max _{j=0,1,\dots ,\min\{a[n],\lfloor sw[n]\rfloor \}}\{b[n,j]+f_{n+1}^{*}(s-jw[n])\}$

Step 7: Compute optimal values in a bottom up fashion.

Let $U_{n}^{*}(s)$ be the optimal number of boxes of berry type n where n=1 corresponds to strawberries and n=2 corresponds to blueberries.

Unused capacity s (lbs of berries left to put into fruitcake)	*$f_{1}^{}(s)$ (maximum benefit function for strawberries)**	Type 1 opt *$U_{1}^{}(s)$ (optimal number of boxes of strawberries)**	*$f_{2}^{}(s)$ (maximum benefit function for blueberries)**	Type 2 opt *$U_{2}^{}(s)$ (optimal number of boxes of blueberries)**	*$f_{3}^{}(s)$**
11					0
10					0
9					0
8					0
7					0
6					0
5					0
4					0
3					0
2					0
1					0
0					0

Let’s start with our first state, s=11, where there are 11 lbs of berries left to add to the fruitcake.

$U_{2}(11)=\{0,1,\dots ,\min\{a[2],\lfloor 11w[2]\rfloor \}\}$

$U_{2}(11)=\{0,1,2,3\}\quad {\text{since }}a[2]=4{\text{ and }}\left\lfloor {\frac {11}{3}}\right\rfloor =3.$

Recall from Step 6:

$f_{n}^{*}(s)=\max _{j=0,1,\dots ,\min\{a[n],\lfloor sw[n]\rfloor \}}\left\{b[n,j]+f_{n+1}^{*}(s-jw[n])\right\}$

$f_{2}^{*}(s)=\max _{j\in U_{2}(11)}\left\{b[2,j]+f_{3}^{*}(s-jw[2])\right\}$

Recall from Step 5 that $f_{3}^{*}(s)$ is simply for a boundary condition and $f_{3}^{*}(s)$ = 0.

Therefore, $f_{2}^{*}(s)=\max _{j\in U_{2}(11)}\left\{b[2,j]\right\}$

Recall from Table 1 that the maximum benefit occurs when j = 3 where $f_{2}^{*}(s)$ = 6. We can now fill in the first 3 rows of our table for n = 2 (blueberries) since the equation for U, $U_{2}(s)=\{0,1,\dots ,\min\{a[2],\lfloor sw[2]\rfloor \}\}$ ,

will not change for when s = 11, 10, and 9 and will remain as $U_{2}(s)=\{0,1,2,3\}$ . Similarly, when s = 8, 7, or 6, $U_{2}(s)=\{0,1,2\}$ , when s = 5, 4, or 3, $U_{2}(s)=\{0,1\}$ , and when s = 2, 1, or 0, $U_{2}(s)=\{0\}$ . Since a greater number of boxes of blueberries results in a greater benefit function value as shown in Table 2, the j that will give us the most benefit for n = 2 would be the maximum value of the set $U_{2}(s)$ for a given state, s. The populated values for n = 2 can be seen below.

Unused capacity s (lbs of berries left to put into fruitcake)	*$f_{1}^{}(s)$ (maximum benefit function for strawberries)**	Type 1 opt *$U_{1}^{}(s)$ (optimal number of boxes of strawberries)**	*$f_{2}^{}(s)$ (maximum benefit function for blueberries)**	Type 2 opt *$U_{2}^{}(s)$ (optimal number of boxes of blueberries)**	*$f_{3}^{}(s)$**
11			4	3	0
10			4	3	0
9			4	3	0
8			2	2	0
7			2	2	0
6			2	2	0
5			1	1	0
4			1	1	0
3			1	1	0
2			0	0	0
1			0	0	0
0			0	0	0

What is the physical meaning of each row? For example, for the row when s = 7 lbs, 2 boxes of blueberries can be added into the fruitcake for maximum benefit ( $f_{2}^{*}(s)$ = 5) since each box of blueberries weighs 3 lbs and the greater the number of boxes of blueberries, the more the benefit as seen in Table 2.

Now, we want to fill in the strawberries section of our table, or n=1. To do this, we repeat Step 7:

Let’s start with our first state, s=11, where there are 11 lbs of berries left to add to the fruitcake.

$U_{1}(11)=\{0,1,\dots ,\min\{a[1],\left\lfloor {\frac {11}{w[1]}}\right\rfloor \}\}$

$U_{1}(11)=\{0,1,2,3,4\}$ since a[1] = 4 and $\left\lfloor {\frac {11}{2}}\right\rfloor$ = 5.

Recall from Step 6:

$f_{n}^{*}(s)=\max _{j=0,1,\dots ,\min\{a[n],\lfloor sw[n]\rfloor \}}\left\{b[n,j]+f_{n+1}^{*}(s-jw[n])\right\}$

$f_{1}^{*}(s)=\max _{j\in U_{1}(11)}\left\{b[1,j]+f_{2}^{*}(s-jw[1])\right\}$

$f_{1}^{*}(s)=\max _{j\in U_{1}(11)}\left\{b[1,j]+f_{2}^{*}(s-2j)\right\}$

Unlike before for n=1, the second term, $f_{n+1}^{*}(s-jw[n])$ , is not equal to 0. Therefore, we must calculate $f_{1}(s)$ for each j in $U_{1}(s)$ to find the optimal number of boxes of strawberries, $U_{1}^{*}(s)$ , that corresponds to the maximum benefit, $f_{1}^{*}(s)$ , for any given s. Let’s construct a table for s=11 just for clarity:

j in $U_{1}(11)$	$f_{1}(11)=\left\{b[1,j]+f_{2}^{*}(s-2j)\right\}$
0	= 0 + f2*(11) = 0 + 4 = 4
1	= 2 + f2*(9) = 2 + 4 = 6
2	= 3 + f2*(7) = 3 + 2 = 5
3	= 5 + f2*(5) = 5 + 1 = 6
4	*= 7 + f2(3) = 6 + 1 = 8**

As seen above, the maximum $f_{1}(11)$ , $f_{1}^{*}(11)$ , occurs when $U_{1}^{*}(11)$ = 4 . This means that the maximum benefit with respect to n=1 occurs when 4 boxes of strawberries are packed into the fruitcake when s=11. We will repeat this process for the rest of the states, s = 10, 9, …, 0, and the populated table is shown below.

Unused capacity s (lbs of berries left to put into fruitcake)	*$f_{1}^{}(s)$ (maximum benefit function for strawberries)**	Type 1 opt *$U_{1}^{}(s)$ (optimal number of boxes of strawberries)**	*$f_{2}^{}(s)$ (maximum benefit function for blueberries)**	Type 2 opt *$U_{2}^{}(s)$ (optimal number of boxes of blueberries)**	*$f_{3}^{}(s)$**
11	8	4	5	3	0
10	7	4	5	3	0
9	7	4	5	3	0
8	7	4	2	2	0
7	5	3	2	2	0
6	5	3	2	2	0
5	3	2, 1	1	1	0
4	3	2	1	1	0
3	2	1	1	1	0
2	2	1	0	0	0
1	0	0	0	0	0
0	0	0	0	0	0

Step 8: Find the optimal solution by looking at the table

The maximum benefit possible is 8, with 4 boxes of strawberries and 1 box of blueberries added to the fruitcake. Here are the steps after constructing the table to get to that final answer:

Where does the maximum benefit occur? The maximum value for the benefit is 8, and this occurs at s = 11 where $f_{1}^{*}(11)$ = 8 .
How many boxes of strawberries does $f_{1}^{*}(11)$ = 8 correspond to? $U_{1}^{*}(s)$ = 4 when $f_{1}^{*}(11)$ = 8 as seen in the table, so 4 boxes of strawberries.
How many lbs of fruit remain to add into the fruitcake? The total capacity is 11 lbs, and 4 boxes of strawberries each weighing 2 lbs were already added, so $11-(4\times 2)=3$ . 3 lbs of fruit remain to add into the fruitcake.
What is the optimal number of boxes of blueberries that should be added according to the table? Find s = 3, and find $U_{2}^{*}(3)$ . $U_{2}^{*}(3)$ = 1, so 1 box of blueberries should be added.

Final Answer: The Smiths should add 4 boxes of strawberries and 1 box of blueberries to their fruitcake.

Example 2: Layered Graphs

Find the shortest path in this layered graph using the concepts of dynamic programming.

We can partition the nodes into layers where the first layer is s, the second layer is A, B, and C, the third layer is D and E, the fourth layer is F and G, and the fifth layer is t. We will make a decision of the path to take at each layer, and each decision will correspond to a stage of the dynamic programming problem.

To solve this shortest path problem, we will be using the previously explained 8 steps to solve the dynamic programming problem.

What are the stages?
- The stages are the different layers. There are 5 stages.
What are the states?
- The states are nodes in each layer.
What actions may be taken at each state in each stage?
- At each state in each stage, we will have to make a decision for which edge to pick. For example, in stage 1 (blue) and state B, we can pick either the edge with distance 3 or the edge with distance 4 since both edges are leaving node B.
What is the English-language description of the optimization function for each state s in stage n?
1. $f_{i}^{*}(r)$ is the minimum total cost to go from state r in stage i to reach node t, which is the final stage.
What are the boundary conditions?
1. There is 0 cost for stage 4 since we have already reached the ending, node t.
What is the recurrence relation?
1. $f_{n}^{*}(s)=\min _{edges\ from\ r\ to\ a\ state\ p\ in\ stage\ i+1}\{c[r,p]+f_{i+1}^{*}(p)\}$
Compute the optimal path based on the recurrence relation defined in Step 6 and the boundary condition defined in Step 5.
1. We will assign a number to each node, and this number will correspond to the $f_{i}^{*}(r)$ for that node as defined above in Step 6. This will be the smallest cumulative sum possible between that node and the edge connecting that node to the next layer, or next stage. The $f_{i}^{*}(r)$ for each state in each stage are shown in red below, and the optimal path is indicated by the red arrows.
  Layered graph with $f_{i}^{*}(r)$ for each node and shortest path in red.
Find the optimal solution by looking up the table.
1. The length of the shortest path: f0*(s)=8. The shortest path can be found by following the red arrows moving forward. s → B → D → G → t.

Applications

The following are some applications where dynamic programming is used. The criteria for applying dynamic programming to an optimization problem are if the objective function involves maximization, minimization, or counting and if the problem is determined by finding all the solutions to find the optimal solution.

Shortest/ Longest Path Problem

In the shortest path problem, the path with the least amount of cost or value must be determined in a problem with multiple nodes in between the beginning node s to the final node e. Travelling from one node to another incurs a value or cost c(p, q), and the objective is to reach t with the smallest cost possible. The eight-step procedure can be used to determine the possible solutions which the optimal solution can be determined from.^[5]

Likewise, but in a maximization function, the longest path can be determined in a problem by determining the solution with the highest cost involved to travel from node s to node e.

Knapsack Problem

The knapsack problem is an example of determining the distribution of effort or when there are limited resources to be shared with competing entities, and the goal is to maximize the benefit of the distribution. Dynamic programming is used when the increase in benefit in regard to increasing the quantity of resources is not linearly proportional. The volume may also be considered in addition to the weight of the resources. A volume constraint is added to the problem and represented in the state by stage n by an ordered pair (s, v) for remaining weight and volume. By considering d constraints, the number of states can grow exponentially with a d -dimensional state space even if the value of d is small. The problem becomes infeasible to solve and is referred to as the curse of dimensionality. However, the curse has faded due to advances in computational power.^[6]

Inventory Planning Problem

In inventory management, dynamic programming is used to determine how to meet anticipated and unexpected demand in order to minimize overall costs. Tracking an inventory system involves establishing a set of policies that monitor and control the levels of inventory, determining when a stock must be replenished, and the quantity of parts to order. For example, a production schedule can be computationally solved by knowing the demand, unit production costs, and inventory supply limits in order to keep the production costs below a certain rate.^[7]

Needleman-Wunsh Algorithm (Global Sequence Alignment)

Developed by Saul B. Needleman and Christian D. Wunsch in 1970, the Needleman-Wunsh algorithm, also known as global sequence alignment, is used to find similarities within protein or nucleotide sequences. This algorithm is an application of dynamic programming used to divide a large problem such as a large sequence into smaller subproblems and the solutions of the subproblems are used to find the optimal sequences with the highest scores. A matrix is constructed consisting of strings of the protein or nucleotide sequences. A scoring system is determined for each of the nucleotide pairs (adenine, guanine, cytosine, thymine) where there could exist a match (+1), mismatch (-1), or gap (-1). The sum of the scores determine the entire alignment pair. Then the scores are calculated for the pairs and filled out in the matrix. To find the optimal alignment, one would perform a "traceback" by starting at the upper left matrix to the bottom right. The algorithm is limited in that it can align only entire proteins.^[8]

Conclusion

The eight-step procedure is an approach used in dynamic programming to transform a problem into simpler problems to yield an optimal solution. The recursive nature of the procedure allows for the optimization problems to be solved using computational models that reduce time and effort and can be used in many applications across many industries.

References

↑ Bellman, Richard. “The Theory of Dynamic Programming.” Bulletin of American Mathematical Society, vol. 60, 1954, pp 503–515, https://www.ams.org/journals/bull/1954-60-06/S0002-9904-1954-09848-8/S0002-9904-1954-09848-8.pdf. 18 Nov 2020.
↑ Bradley, Stephen P. Applied Mathematical Programming. Addison-Wesley. 1 February 1977. 320-342. 18 Nov 2020
↑ Gavin-Hughes, Sam. “Dynamic Programming for Interviews.” Byte by Byte. https://www.byte-by-byte.com/dpbook/. 18 Nov 2020
↑ Chinneck. (2015). Chapter 15 Dynamic Programming. Carleton.Ca. https://www.sce.carleton.ca/faculty/chinneck/po/Chapter15.pdf
↑ Neumann K. “Dynamic Programming Basic Concepts and Applications.” Optimization in Planning and Operations of Electric Power Systems. Physica, Heidelberg, 1993, p 31-56.
↑ Taylor, C. Robert. Applications Of Dynamic Programming To Agricultural Decision Problems. United States, CRC Press, 2019.
↑ Bellman, Richard. “Dynamic Programming Approach to Optimal Inventory Processes with Delay in Delivery.” Quarterly of Applied Mathematics, vol 18, 1961, p. 399-403, https://www.ams.org/journals/qam/1961-18-04/S0033-569X-1961-0118516-2/S0033-569X-1961-0118516-2.pdf. 19 Nov 2020
↑ Needleman, S. B. and Wunsch, C. D. "A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins." J. Mol. Biol. 48, 1970, p. 443-453.

[1] Bellman, Richard. “The Theory of Dynamic Programming.” Bulletin of American Mathematical Society, vol. 60, 1954, pp 503–515, https://www.ams.org/journals/bull/1954-60-06/S0002-9904-1954-09848-8/S0002-9904-1954-09848-8.pdf. 18 Nov 2020.

[2] Bradley, Stephen P. Applied Mathematical Programming. Addison-Wesley. 1 February 1977. 320-342. 18 Nov 2020

[3] Gavin-Hughes, Sam. “Dynamic Programming for Interviews.” Byte by Byte. https://www.byte-by-byte.com/dpbook/. 18 Nov 2020

[4] Chinneck. (2015). Chapter 15 Dynamic Programming. Carleton.Ca. https://www.sce.carleton.ca/faculty/chinneck/po/Chapter15.pdf

[5] Neumann K. “Dynamic Programming Basic Concepts and Applications.” Optimization in Planning and Operations of Electric Power Systems. Physica, Heidelberg, 1993, p 31-56.

[6] Taylor, C. Robert. Applications Of Dynamic Programming To Agricultural Decision Problems. United States, CRC Press, 2019.

[7] Bellman, Richard. “Dynamic Programming Approach to Optimal Inventory Processes with Delay in Delivery.” Quarterly of Applied Mathematics, vol 18, 1961, p. 399-403, https://www.ams.org/journals/qam/1961-18-04/S0033-569X-1961-0118516-2/S0033-569X-1961-0118516-2.pdf. 19 Nov 2020

[8] Needleman, S. B. and Wunsch, C. D. "A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins." J. Mol. Biol. 48, 1970, p. 443-453.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

Eight step procedures

Contents