Eight step procedures

From Cornell University Computational Optimization Open Textbook - Optimization Wiki

Author: Satyasri Ventrapragada (sv353), John-Anthony Gonzales (jg2533), Salina Tekele (st2257)

Stewards: Nathan Preuss, Wei-Han Chen, Tianqi Xiao, Guoqing Hu

Introduction

In an attempt to find new ways of solving the world's problems, scientists and mathematicians developed dynamic programming (DP), a mathematical optimization method that aims to solve a wide range of problems. Dynamic programming is best used when a problem can be divided into subproblems that are solved and combined until a globally optimal solution is reached. Richard Bellman pushed hard for these advancements in their early stages and is credited with the development of dynamic programming.

As mentioned, Richard Bellman, a renowned Stanford professor who was contracting for a non-profit research institution at the time, laid down the foundational principles of dynamic programming beginning in the 1950s, when he started publishing formal work on the topic. In his first publication, Bellman had to name the new computational method strategically to avoid political condemnation. Steering clear of words like "research" and "mathematical," he sought to emphasize the time-varying and incremental aspects of the algorithm, and he eventually chose "dynamic programming" to summarize the intent of the steps he was creating (Dreyfus).

Dynamic programming is intended to aid decision-making problems that present the right circumstances for its use; the applications discussed later are concrete examples of where and how it can be implemented. These examples aim to show how readily dynamic programming can be applied in practice, exactly how that application takes place, and what some of its drawbacks and limitations are. Having this tool widely available, easily deployable, and readily adaptable to any fitting situation has changed the way optimization can occur for the better.

Algorithm Discussion

When tackling a dynamic programming problem, there are generally seven key steps to follow (the numerical examples later in this article use a closely related eight-step formulation expressed in terms of stages and states). The first and most crucial step is determining whether dynamic programming is the right approach. A key indicator of this is whether the problem can be broken down into smaller, overlapping subproblems. Problems that involve minimizing or maximizing a quantity, or those that require counting possible arrangements, are often strong candidates for dynamic programming solutions.

Once you’ve identified that dynamic programming is applicable, the next step is to identify the variables involved, particularly those that change as the subproblems are solved. These changing variables are critical to breaking the problem down into manageable pieces. To identify the key variables, it can be helpful to list out a few examples of subproblems and compare the parameters in each case. By analyzing which parameters vary between subproblems, you can determine the number of subproblems that need to be solved, which will guide the design of the solution.

The third step is to define the recurrence relation, which expresses how the solution to a larger subproblem can be constructed from solutions to smaller subproblems. The recurrence relation serves as the core of the dynamic programming approach, describing how to combine the results of smaller subproblems to build up the final solution. This relation is crucial because it provides the blueprint for how an optimal solution can be derived from previously computed values. It must be carefully defined to ensure that the solution builds progressively and efficiently. Once the recurrence relation is established, the next step is to identify the base cases. Base cases often represent the simplest, smallest possible inputs for which the solution is trivial or already known. For example, in a problem like Fibonacci number calculation, the base cases are typically the first two numbers of the sequence. Base cases are essential because they serve as the foundation for building up solutions to more complex subproblems. They also act as stopping conditions for the recursion or termination conditions for the iterative approach, ensuring that the algorithm does not run indefinitely.
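To make the recurrence relation and base cases concrete, consider the Fibonacci example mentioned above. The following is a minimal sketch in Python of the plain recursive form, before any memoization is added; the function name fib is illustrative.

def fib(n):
    # Base cases: the first two Fibonacci numbers are known outright
    if n < 2:
        return n
    # Recurrence relation: each value is built from two smaller subproblems
    return fib(n - 1) + fib(n - 2)

Written this way, the same subproblems are recomputed many times; the remaining steps are about removing exactly that redundancy.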

The next decision is whether to implement a recursive or an iterative solution, which corresponds to the choice between a top-down (memoization) approach and a bottom-up (tabulation) approach. In a top-down approach, recursion is used to break the problem into smaller subproblems, with results stored in a cache (or memoization table) to avoid redundant calculations. This approach is typically easier to conceptualize and implement but can lead to stack overflow issues for deep recursions in some languages. In contrast, the bottom-up approach builds the solution iteratively, solving smaller subproblems first and progressively solving larger ones. This method tends to be more efficient in terms of space because it does not require the overhead of recursion but might be harder to understand initially. Both methods ultimately achieve the same goal, and the choice often depends on the specific problem and the programmer's preference.

The next step is to incorporate memoization (if using the top-down approach) or tabulation (for the bottom-up approach). Memoization improves performance by storing the results of expensive function calls so that they don’t need to be recomputed. When a subproblem is solved for the first time, its result is stored in a table or dictionary. If the same subproblem needs to be solved again, the stored result is returned instead of recalculating it. This prevents redundant work, leading to significant time savings, particularly for problems with overlapping subproblems.
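As a concrete comparison of the two styles described above, the sketch below computes Fibonacci numbers both top-down with a memoization dictionary and bottom-up with a tabulation array. It is a minimal illustration; the names memo and table are illustrative, not part of any standard API.

def fib_top_down(n, memo=None):
    # Top-down (memoization): recursion plus a dictionary of solved subproblems
    if memo is None:
        memo = {}
    if n not in memo:
        memo[n] = n if n < 2 else fib_top_down(n - 1, memo) + fib_top_down(n - 2, memo)
    return memo[n]

def fib_bottom_up(n):
    # Bottom-up (tabulation): solve the smallest subproblems first and build up
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

Both functions return the same values; the top-down version only solves the subproblems it actually needs, while the bottom-up version avoids recursion entirely.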

Finally, once you have implemented memoization (or tabulation), the last step is to analyze the time complexity of your solution. In dynamic programming, the time complexity is determined by two factors: the number of unique subproblems that must be solved, and the time required to solve each subproblem. The number of unique subproblems is determined by the number of distinct states defined by the changing parameters, while the time complexity per subproblem typically depends on the amount of work required to compute each solution. By carefully counting the number of subproblems and the cost of solving each one, you can estimate the overall time complexity of your dynamic programming solution. Often, dynamic programming solutions reduce the time complexity from exponential to polynomial time, making otherwise intractable problems solvable.
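For instance, under this counting the memoized Fibonacci computation has n + 1 unique subproblems, fib(0) through fib(n), each combining two stored results in constant time, for O(n) total work, whereas the plain recursion makes on the order of 2^n calls. In the knapsack example below, the states are the (berry type, remaining capacity) pairs, giving 2 × 12 = 24 subproblems, each of which examines at most five candidate decisions.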

Numerical Examples

Example 1: Knapsack Problem

The Smiths are trying to replicate a delicious Christmas fruitcake recipe, and the recipe calls for berries with a total of up to 11 lbs of berries packed into the fruitcake. The Smiths purchased 4 boxes of strawberries and 4 boxes of blueberries during their Christmas grocery shopping trip last weekend but are unsure of how many boxes of strawberries and blueberries to include in their fruitcake. Each box of strawberries weighs 2 lbs and each box of blueberries weighs 3 lbs. To make the choice easier, they spent all day researching how different amounts of strawberries and blueberries impact the taste of the fruitcake and made a benefit table listing the value each box of berries adds to the overall taste of the fruitcake. The benefit table is shown below in Table 1. Please use Dynamic Programming to help the Smiths choose the optimal number of boxes of strawberries and blueberries to include in their delicious Christmas fruitcake this year!


Table 1. Availability and weights of the berries.

Type of berry   Availability (number of boxes)   Weight per box (lb)
Strawberries    4                                2
Blueberries     4                                3


Table 2. Benefit table for number of boxes of berries.

Number of boxes of berries in fruitcake   0   1   2   3   4
Strawberries (benefit)                    0   2   3   5   7
Blueberries (benefit)                     0   1   2   4   5


To solve the Smiths’ fruitcake problem, we will be using dynamic programming through the following 8 steps:


Step 1: What are the stages?

The different types of items are the stages. For this problem, there are 2 stages: strawberries (n = 1) and blueberries (n = 2). The decision made in each stage is the number of boxes of that berry to include in the fruitcake.


Step 2: What are the states?

The states correspond to the remaining capacity, in lbs of berries, that can still be packed into the fruitcake. For example, if we had already decided in previous stages to include 1 box of strawberries (2 lbs) and 1 box of blueberries (3 lbs), the current state would be s = 11 - 2 - 3 = 6 lbs of remaining capacity.


Step 3: What actions may be taken at each state in each stage?

If we are in stage n with availability a[n] and remaining capacity s, we can pack j boxes of type n into the fruitcake, where

j ∈ {0, 1, ..., min(a[n], floor(s/w_n))}

and w_n is the weight per box of berry type n (w_1 = 2 lbs for strawberries, w_2 = 3 lbs for blueberries, from Table 1).


Step 4: What is the English-language description of the optimization function for each state s in stage n?

f_n*(s) is the value of the maximum benefit possible with items of type n (or greater) while fitting into the remaining capacity s.

Step 5: What are the boundary conditions?

The benefit is 0 when there are no more types of berries to use even though there is remaining capacity.

f_{N+1}*(s) = 0 for all s, where N is the number of unique berries (N = 2) in this problem. Therefore, f_3*(s) = 0, since there is no third type of berry.


Step 6: What is the recurrence relation?

We will construct a benefit function corresponding to each remaining capacity s for each of the item types n = 1, ..., N, choosing the number of boxes j of type n berries to put in the fruitcake:

f_n*(s) = max over j ∈ {0, 1, ..., min(a[n], floor(s/w_n))} of [ b_n(j) + f_{n+1}*(s - j·w_n) ]

where b_n(j) is the benefit of j boxes of berry type n from Table 2.


Step 7: Compute optimal values in a bottom up fashion.

Let j_n*(s) be the optimal number of boxes of berry type n, where n = 1 corresponds to strawberries and n = 2 corresponds to blueberries.

The table below will be filled in column by column. Its columns are: the unused capacity s (lbs of berries left to put into the fruitcake); f_1*(s), the maximum benefit function for strawberries, with Type 1 opt j_1*(s), the optimal number of boxes of strawberries; f_2*(s), the maximum benefit function for blueberries, with Type 2 opt j_2*(s), the optimal number of boxes of blueberries; and the boundary column f_3*(s), which equals 0 for every s (Step 5) and is filled in first.

Unused capacity s   f_1*(s)   j_1*(s)   f_2*(s)   j_2*(s)   f_3*(s)
11                  -         -         -         -         0
10                  -         -         -         -         0
9                   -         -         -         -         0
8                   -         -         -         -         0
7                   -         -         -         -         0
6                   -         -         -         -         0
5                   -         -         -         -         0
4                   -         -         -         -         0
3                   -         -         -         -         0
2                   -         -         -         -         0
1                   -         -         -         -         0
0                   -         -         -         -         0

Let’s start with our first state, s=11, where there are 11 lbs of berries left to add to the fruitcake.


Recall from Step 6 that, for n = 2,

f_2*(s) = max over j ∈ {0, 1, ..., min(a[2], floor(s/3))} of [ b_2(j) + f_3*(s - 3j) ]

Recall from Step 5 that f_3*(s) is simply the boundary condition and f_3*(s) = 0 for every s.

Therefore,

f_2*(s) = max over j ∈ {0, 1, ..., min(a[2], floor(s/3))} of b_2(j), and for s = 11 the feasible set is j ∈ {0, 1, 2, 3}.
Recall from Table 2 that the largest benefit over this set occurs when j = 3, where b_2(3) = 4, so f_2*(11) = 4 with j_2*(11) = 3. We can now fill in the first 3 rows of our table for n = 2 (blueberries), since the feasible set for j does not change when s = 11, 10, or 9 and f_2*(s) remains 4. Similarly, when s = 8, 7, or 6, f_2*(s) = b_2(2) = 2; when s = 5, 4, or 3, f_2*(s) = b_2(1) = 1; and when s = 2, 1, or 0, f_2*(s) = b_2(0) = 0. Since a greater number of boxes of blueberries results in a greater benefit value, as shown in Table 2, the j that gives the most benefit for n = 2 is the maximum value of the feasible set {0, 1, ..., min(a[2], floor(s/3))} for a given state s. The populated values for n = 2 can be seen below.

Unused capacity s   f_1*(s)   j_1*(s)   f_2*(s)   j_2*(s)   f_3*(s)
11                  -         -         4         3         0
10                  -         -         4         3         0
9                   -         -         4         3         0
8                   -         -         2         2         0
7                   -         -         2         2         0
6                   -         -         2         2         0
5                   -         -         1         1         0
4                   -         -         1         1         0
3                   -         -         1         1         0
2                   -         -         0         0         0
1                   -         -         0         0         0
0                   -         -         0         0         0

(Columns as defined in the previous table; the strawberry columns f_1*(s) and j_1*(s) are still to be filled in.)

What is the physical meaning of each row? For example, for the row where s = 7 lbs, at most 2 boxes of blueberries can still be added to the fruitcake, since each box of blueberries weighs 3 lbs; and because a greater number of boxes of blueberries gives more benefit (Table 2), the maximum benefit is f_2*(7) = b_2(2) = 2.
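As a quick check on the blueberry column, the recurrence from Step 6 can be evaluated directly, since f_3*(s) = 0. The following is a minimal sketch assuming the data from Tables 1 and 2; the names b2, w2, and a2 are illustrative.

b2 = [0, 1, 2, 4, 5]      # benefit of 0..4 boxes of blueberries (Table 2)
w2, a2 = 3, 4             # weight per box (lbs) and boxes available (Table 1)

for s in range(11, -1, -1):
    j_max = min(a2, s // w2)                    # largest feasible number of boxes
    f2 = max(b2[j] for j in range(j_max + 1))   # f_2*(s), since f_3*(s) = 0
    print(s, f2, j_max)                         # reproduces the f_2*(s) and j_2*(s) columns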

Now, we want to fill in the strawberries section of our table, or n=1. To do this, we repeat Step 7:

Let’s start with our first state, s=11, where there are 11 lbs of berries left to add to the fruitcake.

The feasible decisions are j ∈ {0, 1, ..., min(a[1], floor(11/2))} = {0, 1, 2, 3, 4}, since a[1] = 4 and floor(11/2) = 5.


Recall from Step 6 that, for n = 1,

f_1*(s) = max over j ∈ {0, 1, ..., min(a[1], floor(s/2))} of [ b_1(j) + f_2*(s - 2j) ]

Unlike the n = 2 case, for n = 1 the second term, f_2*(s - 2j), is not equal to 0. Therefore, we must calculate b_1(j) + f_2*(s - 2j) for each j in {0, 1, ..., min(a[1], floor(s/2))} to find the optimal number of boxes of strawberries, j_1*(s), that corresponds to the maximum benefit, f_1*(s), for any given s. Let's construct a table for s = 11 just for clarity:

j   b_1(j) + f_2*(11 - 2j)
0   0 + f_2*(11) = 0 + 4 = 4
1   2 + f_2*(9)  = 2 + 4 = 6
2   3 + f_2*(7)  = 3 + 2 = 5
3   5 + f_2*(5)  = 5 + 1 = 6
4   7 + f_2*(3)  = 7 + 1 = 8

As seen above, the maximum value, f_1*(11) = 8, occurs when j_1*(11) = 4. This means that the maximum benefit with respect to n = 1 occurs when 4 boxes of strawberries are packed into the fruitcake when s = 11. We will repeat this process for the rest of the states, s = 10, 9, ..., 0, and the populated table is shown below.

Unused capacity s   f_1*(s)   j_1*(s)   f_2*(s)   j_2*(s)   f_3*(s)
11                  8         4         4         3         0
10                  7         4         4         3         0
9                   7         4         4         3         0
8                   7         4         2         2         0
7                   5         3         2         2         0
6                   5         3         2         2         0
5                   3         2, 1      1         1         0
4                   3         2         1         1         0
3                   2         1         1         1         0
2                   2         1         0         0         0
1                   0         0         0         0         0
0                   0         0         0         0         0

(Columns as defined above; for s = 5, both j_1*(5) = 2 and j_1*(5) = 1 give the same maximum benefit of 3.)


Step 8: Find the optimal solution by looking at the table

The maximum benefit possible is 8, with 4 boxes of strawberries and 1 box of blueberries added to the fruitcake. Here are the steps after constructing the table to get to that final answer:

  1. Where does the maximum benefit occur? The maximum value for the benefit is 8, and this occurs at s = 11, where f_1*(11) = 8.
  2. How many boxes of strawberries does f_1*(11) = 8 correspond to? j_1*(11) = 4 when f_1*(11) = 8, as seen in the table, so 4 boxes of strawberries.
  3. How many lbs of fruit remain to add into the fruitcake? The total capacity is 11 lbs, and 4 boxes of strawberries each weighing 2 lbs were already added, so 11 - 4(2) = 3. 3 lbs of fruit remain to add into the fruitcake.
  4. What is the optimal number of boxes of blueberries that should be added according to the table? Find s = 3, and look up j_2*(3). j_2*(3) = 1, so 1 box of blueberries should be added.

Final Answer: The Smiths should add 4 boxes of strawberries and 1 box of blueberries to their fruitcake.
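The entire bottom-up computation from Steps 5 through 8 can also be written as a short program. The sketch below is a minimal illustration using the data from Tables 1 and 2; the variable names benefit, weight, and avail, and the dictionaries f and j_opt, are illustrative rather than part of the original formulation.

# Data from Tables 1 and 2 (illustrative names)
benefit = {1: [0, 2, 3, 5, 7],   # strawberries: benefit of 0..4 boxes
           2: [0, 1, 2, 4, 5]}   # blueberries:  benefit of 0..4 boxes
weight = {1: 2, 2: 3}            # lbs per box
avail = {1: 4, 2: 4}             # boxes available
capacity, N = 11, 2

# f[n][s] = maximum benefit using berry types n..N with remaining capacity s
# j_opt[n][s] = optimal number of boxes of type n at state s
f = {N + 1: [0] * (capacity + 1)}          # boundary: f_{N+1}*(s) = 0
j_opt = {}
for n in range(N, 0, -1):                  # bottom-up over stages n = N, ..., 1
    f[n] = [0] * (capacity + 1)
    j_opt[n] = [0] * (capacity + 1)
    for s in range(capacity + 1):
        j_max = min(avail[n], s // weight[n])
        best_j = max(range(j_max + 1),
                     key=lambda j: benefit[n][j] + f[n + 1][s - j * weight[n]])
        j_opt[n][s] = best_j
        f[n][s] = benefit[n][best_j] + f[n + 1][s - best_j * weight[n]]

# Recover the optimal decisions (Step 8)
s = capacity
plan = {}
for n in range(1, N + 1):
    plan[n] = j_opt[n][s]
    s -= plan[n] * weight[n]
print(f[1][capacity], plan)   # expected: 8 {1: 4, 2: 1}

Running this sketch prints the maximum benefit of 8 together with the plan of 4 boxes of strawberries and 1 box of blueberries, matching the table lookup in Step 8.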

Example 2: Layered Graphs

Find the shortest path in this layered graph using the concepts of dynamic programming.

Layered graph to find shortest path using dynamic programming.

We can partition the nodes into layers where the first layer is s, the second layer is A, B, and C, the third layer is D and E, the fourth layer is F and G, and the fifth layer is t. We will make a decision of the path to take at each layer, and each decision will correspond to a stage of the dynamic programming problem.

Layered graph with each stage, or layer.

To solve this shortest path problem, we will be using the previously explained 8 steps to solve the dynamic programming problem.

  1. What are the stages?
    • The stages are the different layers. There are 5 stages.
  2. What are the states?
    • The states are nodes in each layer.
  3. What actions may be taken at each state in each stage?
    • At each state in each stage, we will have to make a decision for which edge to pick. For example, in stage 1 (blue) and state B, we can pick either the edge with distance 3 or the edge with distance 4 since both edges are leaving node B.
  4. What is the English-language description of the optimization function for each state s in stage n?
    • f_i*(r) is the minimum total cost to go from state r in stage i to node t, which is in the final stage.
  5. What are the boundary conditions?
    • There is 0 cost for stage 4 since we have already reached the ending, node t: f_4*(t) = 0.
  6. What is the recurrence relation?
    • f_i*(r) = min over nodes v in stage i + 1 connected to r of [ d(r, v) + f_{i+1}*(v) ], where d(r, v) is the length of the edge from r to v.
  7. Compute the optimal path based on the recurrence relation defined in Step 6 and the boundary condition defined in Step 5.
    • We will assign a number to each node, and this number will correspond to f_i*(r) for that node as defined above. It is the smallest cumulative distance possible from that node to t, found by adding the length of an edge leaving the node to the value already assigned to the node it reaches in the next layer, or next stage. The f_i*(r) values for each state in each stage are shown in red below, and the optimal path is indicated by the red arrows.
      Layered graph with f_i*(r) for each node and the shortest path in red.
  8. Find the optimal solution by reading off the computed values.
    • The length of the shortest path is f_0*(s) = 8. The shortest path can be found by following the red arrows forward: s → B → D → G → t.
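The backward recurrence from Steps 6 and 7 can also be sketched in code. Because the figure's edge lengths are not reproduced in the text, the edge lengths below are hypothetical, chosen only to be consistent with the details stated above (node B has outgoing edges of length 3 and 4, and the shortest path has length 8 along s → B → D → G → t); the dictionaries edges and layers are illustrative.

# Hypothetical edge lengths (the original figure is not reproduced here),
# chosen so that node B has outgoing edges of length 3 and 4 and the
# shortest path is s -> B -> D -> G -> t with total length 8.
edges = {
    's': {'A': 2, 'B': 1, 'C': 3},
    'A': {'D': 4}, 'B': {'D': 3, 'E': 4}, 'C': {'E': 2},
    'D': {'F': 5, 'G': 2}, 'E': {'F': 1, 'G': 4},
    'F': {'t': 3}, 'G': {'t': 2},
    't': {},
}
layers = [['s'], ['A', 'B', 'C'], ['D', 'E'], ['F', 'G'], ['t']]

# Backward pass: f[r] = min over successors v of d(r, v) + f[v]; boundary f[t] = 0
f, best_next = {'t': 0}, {}
for layer in reversed(layers[:-1]):
    for r in layer:
        best_next[r] = min(edges[r], key=lambda v: edges[r][v] + f[v])
        f[r] = edges[r][best_next[r]] + f[best_next[r]]

# Forward pass: follow the stored decisions to recover the shortest path
path, node = ['s'], 's'
while node != 't':
    node = best_next[node]
    path.append(node)
print(f['s'], ' -> '.join(path))   # expected: 8 s -> B -> D -> G -> t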

Applications

As suggested in previous sections, the principles of dynamic programming enable it to be used in a variety of industries ranging from finance and transportation to retail and agriculture. In each of these sectors, the fundamental steps outlined in the algorithm discussion are applied to their own specific circumstances. Because of this broad applicability, the basics of dynamic programming are embedded in a variety of software programs, such as GPS routing systems and computer games in which a user plays against an algorithm. Pieces of dynamic programming appear in almost any optimization application that involves a hierarchical decision-making process.

Bus Route Selection

According to a study by engineering students Zhu Wenfei and Li Runmei, dynamic programming is the backbone of creating the most effective path for public transportation, for both the passengers and the city. The study focused on maximizing profit for the bus carriers and passengers simultaneously. Although factors like the distance between stops, average load factor, and bus capacity are complex, they all affect how the bus operates most efficiently, and they help form the constraints included in models for passenger flow, passenger "satisfaction," and bus scheduling and speed. Employing the process seen in the examples above, the authors determined the optimal intervals (Figure 1) at which the bus can be most cost effective and useful for civilians. With a 40/60 weight on the two parties in the objective function, the optimization results show a very small decrease in satisfaction for the bus carriers and a substantial increase for the passengers.

A screenshot of final results from employing the method described above, indicating an improvement over current methods.

Hybrid Vehicle Optimization

Because so many variables are present in vehicle dynamics, it is a prime application for dynamic programming. Specifically, there are unique objective functions for all types of vehicles due to their range of purposes. Lukic and Wang's study on hybrid engine performance simulated a vehicle's configuration and how a set of parameters affected fuel consumption. For example, the effects of vehicle mass, planetary gear ratios, frontal surface area, maximum battery capacity, and a few other parameters were all evaluated to maximize fuel economy (Wang and Lukic).

A results table from Lukic and Wang's study showing the optimal values of 0.9 and 0.5 for the best fuel economy.

Conclusion

In conclusion, dynamic programming has changed the way that optimization problems can be solved. Solving a dynamic programming problem begins with a critical assessment of whether dynamic programming is the right approach. This is determined by analyzing the problem's structure, specifically whether it can be broken down into smaller, overlapping subproblems. If the problem exhibits these characteristics, dynamic programming can likely provide an efficient solution.

DP is a valuable optimization method because of its ability to split a problem into subproblems and store their solutions on the way to a globally optimal value. It can also work with many types of data, whether complex numerical models or strings of text. As the examples in previous sections show, a multitude of sectors benefit from the strengths and flexibility of DP. Because dynamic programming has already developed into such a broadly applicable tool, there are not many open paths for its future directions; the algorithm has been, and continues to be, thoroughly examined and incorporated into viable solution spaces.

References