Stochastic dynamic programming: Difference between revisions

From Cornell University Computational Optimization Open Textbook - Optimization Wiki
Jump to navigation Jump to search
Line 98: Line 98:


==Applications==
==Applications==
- Water resources and reservoir operation
- '''Water resources and reservoir operation'''
For hydropower generation, we need to store water in reservoirs. The effect of storing water on downstream ecosystems should be considered. It is of paramount importance to let a flowrate of water to flow into the downstream to make sure the system is ecologically sustainable. The flowrate should be minimum, also known as the minimum environmental flow, as it directly affects the hydropower generation and the associated benefit. It is very important when constructing dam. Stochastic dynamic programming has established itself as a powerful technique for analyzing hydropower systems and scheduling reservoir operation.  
For hydropower generation, we need to store water in reservoirs. The effect of storing water on downstream ecosystems should be considered. It is of paramount importance to let a flowrate of water to flow into the downstream to make sure the system is ecologically sustainable. The flowrate should be minimum, also known as the minimum environmental flow, as it directly affects the hydropower generation and the associated benefit. It is very important when constructing dam. Stochastic dynamic programming has established itself as a powerful technique for analyzing hydropower systems and scheduling reservoir operation.  
- Production planning
- '''Production planning'''
The problem involves determining the production and, if applicable, the loss in the demand each period of time to minimize the total production cost during a time horizon. The production capacity limits, warehouse capacity limits, lot size, and demand act as the constraints of this problem. Uncertainty is inherently available on account of erroneous predictions of the demand, production cost, and resource availability.   
The problem involves determining the production and, if applicable, the loss in the demand each period of time to minimize the total production cost during a time horizon. The production capacity limits, warehouse capacity limits, lot size, and demand act as the constraints of this problem. Uncertainty is inherently available on account of erroneous predictions of the demand, production cost, and resource availability.   
- Animal husbandry
- '''Animal husbandry'''
Stochastic Dynamic Programming has also been applied to the problem of determining the replacement policy that is economically optimal in swine breeding herds. The objective is to maximize the total profit that can be achieved given the number of existing sows in the herd and also the number of the replacement gilts over a planning horizon. As every other dynamic programming problem, the process starts from the last stage and proceeds backwards. The value that can be achieved from the available sows at the final stage is their slaughter value. The decision is to either keep or replace a sow. The problem is prone to uncertainty owing to involuntary reasons for culling sows such as disease or death and different reproduction characteristic between individual sows available in the herd.   
Stochastic Dynamic Programming has also been applied to the problem of determining the replacement policy that is economically optimal in swine breeding herds. The objective is to maximize the total profit that can be achieved given the number of existing sows in the herd and also the number of the replacement gilts over a planning horizon. As every other dynamic programming problem, the process starts from the last stage and proceeds backwards. The value that can be achieved from the available sows at the final stage is their slaughter value. The decision is to either keep or replace a sow. The problem is prone to uncertainty owing to involuntary reasons for culling sows such as disease or death and different reproduction characteristic between individual sows available in the herd.   
- Agriculture
- '''Agriculture'''
Stochastic Dynamic Programming has been utilized to cope with the problem of weed control. Sells has found in more powerful than simulation-based approaches in determining the optimal weed control strategies. The weed seedbanks of the available weeds can be regarded as the state. The decision to be made is the type of crop to cultivate, time of planting, cultivations prior to planting, type of herbicide, rate of herbicide used, and time of spraying herbicide. The objective function to be minimized can be the total expected cost over a planning horizon.  Uncertainty needs to be considered in such problems due to uncertain performance of herbicides.   
Stochastic Dynamic Programming has been utilized to cope with the problem of weed control. Sells has found in more powerful than simulation-based approaches in determining the optimal weed control strategies. The weed seedbanks of the available weeds can be regarded as the state. The decision to be made is the type of crop to cultivate, time of planting, cultivations prior to planting, type of herbicide, rate of herbicide used, and time of spraying herbicide. The objective function to be minimized can be the total expected cost over a planning horizon.  Uncertainty needs to be considered in such problems due to uncertain performance of herbicides.   
- Energy systems
- Energy systems

Revision as of 21:03, 25 November 2021

Authors: Bo Yuan, Ali Amadeh, Max Greenberg, Raquel Sarabia Soto and Claudia Valero De la Flor (CHEME/SYSEN 6800, Fall 2021)

Introduction

Theory, methodology and algorithm discussion

Theory

Stochastic dynamic programming combines stochastic programming and dynamic programming. Therefore, to understand better what it is, we give first two definitions:

  • Stochastic programming. Unlike in a deterministic problem, where a decision’s outcome is only determined by the decision itself and all the parameters are known, in stochastic programming there is uncertainty and the decision results in a distribution of transformations.
  • Dynamic programming. It is an optimization method that consists in dividing a complex problem into easier subprobems and solving them recursively to find the optimal sub-solutions which lead to the complex problem optima.

In any stochastic dynamic programming problem, we must define the following concepts:

  • Policy, which is the set of rules used to make a decision.
  • Initial vector, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p\in\ D} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle D} is a finite closed region.
  • Choice made, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle q} where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle q\in\ S} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S} is a set of possible choices.
  • Stochastic vector, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle z} .
  • Distribution function Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle dG_q(p,z)} , associated with Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle z} and dependent on Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle q} .
  • Return, which is the expected value of the function after the final stage.

In a stochastic dynamic programming problem, we assume that Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle z} is known after the decision of stage Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n-1} has been made and before the decision of stage Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n} has to be made.

Methodology and algorithm

First, we define the N-stage return obtained using the optimal policy and starting with vector Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p} :

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f_N\left(p\right)=\max{R\left(p_N\right)}}
    where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle R\left(p_N\right)}
 is the function of the final state Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle p_N}

Second, we define the initial transformation as Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle T_q} , and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle z} , as the state resulting from it. The return after Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle N-1} stages will be Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f_{N-1}(z)} using the optimal policy. Therefore, we can formulate the expected return due to the initial choice made in Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle T_q} :

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \int_{z\in D}{f_{N-1}\left(z\right)dG_q(p,z)}}

Having defined that, the recurrence relation can be expressed as:

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f_N\left(p\right)=\max{\ \int_{z\in D}{f_{N-1}\left(z\right)\ dG_q\left(p,z\right)\ }}}
     Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle N\geq2}

With:

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle f_1\left(p\right)=\max{\ \ \int_{z\in D}{R\left(z\right)dG_q(p,z)}}}

This formulation presented is very general and depending on the problem characteristics, there have been developed different models. For this reason, we present the algorithm of two different models as examples: a finite stage-model and a model for Approximate Dynamic Programming (ADP).

Finite-stage model: a stock-option model

This model was created to maximize the expected profit that we can obtain in N days (stages) from selling/buying stocks. This is considered a finite-stage model because we know in advance for how many days are we calculating the expected profit.

First, we define the stock price on the Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k} th day Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle (k\geq1)} as Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_k} . We assume the following:

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_{k+1}=S_k+X_{k+1}=S_0+\sum_{i=1}^{k+1}X_i}

Where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X_1,\ X_2,} … are independent of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_0} and between them, and identically distributed with distribution Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle F} .

Second, we also assume that we have the chance to buy a stock at a fixed price Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle c} and this stock can be sold at price Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s} . We then define Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle V_n} as the maximal expected profit, and it satisfies the following optimality equation:

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle V_n\left(s\right)=\max{\ \ [}s-c,\ \int{V_{n-1}\left(s+x\right)dF\left(x\right)}\ ]}

And the boundary condition is the following:

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle V_0\left(s\right)=\max{\ \left(s-c,\ 0\right)}}

Approximate Dynamic Programming (ADP)

Approximate dynamic programming (ADP) is an algorithm strategy for solving complex problems that can be stochastic. Since the topic of this page is Stochastic Dynamic Programming, we will discuss ADP from this perspective.

To develop the ADP algorithm, we present the Bellman’s equation using the expectation form.

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle V_t\left(s\right)=max\ \left(C\left(S_t,\ x_t\right)+\gamma\ E\ \left\{V_{t+1}\left(S_{t+1}\right)|S_{t=s}\right\}\right)\ }
    where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_{t+1}=S^M\left(S_t,\ x_t,\ W_{t+1}\right)}
 and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_t=X^\pi\left(S_t\right)}

The variables used and their meanings are the following:

  • State of the system, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_t}
  • Function Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X^\pi\left(S_t\right)} . It represents the policy to make a decision
  • Transition function, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S^M} . It describes the transition from state Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_t} to state Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_{t+1}} .
  • Action taken in state Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_t} , Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_t}
  • Information observed after taking action Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_t} , Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle W_{t+1}}
  • Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle V_t\left(s\right)} gives the expected value of being in state Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_t} at time Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle t} and making a decision following the optimal policy.

The goal of ADP is to replace the value of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle V_t\left(S_t\right)} with a statistical approximation Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\bar{V}}_t} . Therefore, after iteration Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n-1} , we have an approximation Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\bar{V}}_t^{n-1}\left(S_t\right)} . Another feature of ADP is that it steps forward in time. To go from one iteration to the following, we define our decision function as:

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X^\pi\left(S_t^n\right)=max\ \left(C\left(S_t^n,\ x_t\right)+\gamma\ E\ \left\{{\bar{V}}_{t+1}^{n-1}\left(S_{t+1}\right)|S_t^n\right\}\right)\ }

Next, we define Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_t^n} as the value of Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_t} that solves this problem and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\hat{v}}_t^n} , as the estimate value of being in state Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_t^n} :

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\hat{v}}_t^n=C\left(S_t^n,\ x_t^n\right)+\gamma\ E\ \left\{{\bar{V}}_{t+1}^{n-1}\left(S_{t+1}\right)|\ S_t\right\}}

Finally, using the approximation lookup table, Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\bar{V}}_t\left(s\right)} is defined. In a lookup approximation, for each state Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s} we have a Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\bar{V}}_t(s)} that gives an approximated value of being in Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle s} .

  Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\bar{V}}_t^n\left(S_t^n\right)=\left(1-\alpha_{n-1}\right){\bar{V}}_t^{n-1}\left(S_t^n\right)+\alpha_{n-1}{\hat{v}}_t^n}
      where Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \alpha_{n-1}}
 is known as a stepsize

Therefore, a generic ADP algorithm applied to stochastic dynamic programming can be summarized as:

1. We initialize Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\bar{V}}_t^0\left(S_t\right)} for all states Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_t} , and we choose an initial state Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_o^1} and set Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n=1} .

2. For t=0,1,2.., we have to solve:

     Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\hat{v}}_t^n=C\left(S_t^n,\ x_t^n\right)+\gamma\ E\ \left\{{\bar{V}}_{t+1}^{n-1}\left(S_{t+1}\right)|\ S_t\right\}}

       Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\bar{V}}_t^n\left(S_t^n\right)=\left(1-\alpha_{n-1}\right){\bar{V}}_t^{n-1}\left(S_t^n\right)+\alpha_{n-1}{\hat{v}}_t^n}
     if Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_y=S_t^n}

       Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle {\bar{V}}_t^{n-1}\left(S_t\right)}
     if Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_t\neq\ S_t^n}

     Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle S_{t+1}^n=S^M\left(S_t^n,\ x_t^n,\ W_{t+1}\left(w^n\right)\right)}

3. Set Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n=n+1} as long as Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n<N} .

Numerical examples

Gamblling example

Consider an unfair coin flip game where the probability of the coin landing on heads is 0.6 and the probability of landing in tails is 0.4, and the game is each you flip head you win the amount you bet and each time you flip tails you lose the amount you bet as well. A gambler starts with 100$ dollar, cannot bet more money than he has and can play 10 games (the bet must be nonnegative). Our goal is to maximize his expected payout by using an optimal betting strategy. To find this we will use stochastic dynamic programming: Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle V_n(x) = \max_{0<&alpha<1} \, \\{[0.6*V_n-1(x+&alpha*x)+0.4*V_n-1(x-&alpha*x)]}}

Applications

- Water resources and reservoir operation For hydropower generation, we need to store water in reservoirs. The effect of storing water on downstream ecosystems should be considered. It is of paramount importance to let a flowrate of water to flow into the downstream to make sure the system is ecologically sustainable. The flowrate should be minimum, also known as the minimum environmental flow, as it directly affects the hydropower generation and the associated benefit. It is very important when constructing dam. Stochastic dynamic programming has established itself as a powerful technique for analyzing hydropower systems and scheduling reservoir operation. - Production planning The problem involves determining the production and, if applicable, the loss in the demand each period of time to minimize the total production cost during a time horizon. The production capacity limits, warehouse capacity limits, lot size, and demand act as the constraints of this problem. Uncertainty is inherently available on account of erroneous predictions of the demand, production cost, and resource availability. - Animal husbandry Stochastic Dynamic Programming has also been applied to the problem of determining the replacement policy that is economically optimal in swine breeding herds. The objective is to maximize the total profit that can be achieved given the number of existing sows in the herd and also the number of the replacement gilts over a planning horizon. As every other dynamic programming problem, the process starts from the last stage and proceeds backwards. The value that can be achieved from the available sows at the final stage is their slaughter value. The decision is to either keep or replace a sow. The problem is prone to uncertainty owing to involuntary reasons for culling sows such as disease or death and different reproduction characteristic between individual sows available in the herd. - Agriculture Stochastic Dynamic Programming has been utilized to cope with the problem of weed control. Sells has found in more powerful than simulation-based approaches in determining the optimal weed control strategies. The weed seedbanks of the available weeds can be regarded as the state. The decision to be made is the type of crop to cultivate, time of planting, cultivations prior to planting, type of herbicide, rate of herbicide used, and time of spraying herbicide. The objective function to be minimized can be the total expected cost over a planning horizon. Uncertainty needs to be considered in such problems due to uncertain performance of herbicides. - Energy systems

Conclusions

References