
The Expected Discounted Sum Return Function

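The expected discounted sum return function weights the immediate reward received at step $t$ by $\lambda^{t-1}$, where $\lambda$ is the discount parameter. Here are some possible interpretations of the parameter $\lambda$.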
1. In economic problems the parameter may be interpreted as reflecting the interest rate: a reward received one step later is worth less than the same reward received now.
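2. Consider a finite horizon problem where the horizon $N$ is random, i.e. the return is $E\left[\sum_{t=1}^{N} r_t\right]$, where $r_t$ is the immediate reward at step $t$ and the expectation is taken over both the process and $N$, assuming that the final value of all the states is equal to 0.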

Let $N$ be distributed geometrically with parameter $\lambda$. The probability of stopping at the $n$th step is $P(N = n) = (1-\lambda)\lambda^{n-1}$.
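The random-horizon interpretation can be checked numerically. The following is a minimal sketch, not taken from the original notes. It assumes the discount parameter is called `lam`, that the horizon stops at step n with probability (1 - lam) * lam**(n - 1), and that the rewards form a fixed finite sequence with zero reward afterwards. All function names are hypothetical.

```python
# Sketch under stated assumptions: lam is the (assumed) discount parameter and
# the random horizon N satisfies P(N = n) = (1 - lam) * lam**(n - 1), n >= 1.

def stop_prob(n, lam):
    """Probability that the random horizon N equals n (n = 1, 2, ...)."""
    return (1.0 - lam) * lam ** (n - 1)


def survival_prob(t, lam, n_max=2000):
    """P(N >= t), summed directly from the stopping probabilities (truncated)."""
    return sum(stop_prob(n, lam) for n in range(t, n_max + 1))


def random_horizon_return(rewards, lam):
    """E_N[sum_{t=1}^{N} r_t] for a finite reward sequence, zero afterwards."""
    total = 0.0
    partial = 0.0
    for n, r in enumerate(rewards, start=1):
        partial += r                      # sum of the first n rewards
        total += stop_prob(n, lam) * partial
    # Every horizon longer than len(rewards) collects the full sum;
    # P(N > T) = lam**T for the geometric distribution assumed above.
    total += lam ** len(rewards) * partial
    return total


def discounted_return(rewards, lam):
    """sum_t lam**(t-1) * r_t for the same reward sequence."""
    return sum(lam ** (t - 1) * r for t, r in enumerate(rewards, start=1))


rewards = [1.0, -2.0, 0.5, 3.0]   # arbitrary illustrative rewards
lam = 0.9
rh = random_horizon_return(rewards, lam)
ds = discounted_return(rewards, lam)
```

For this reward sequence `rh` and `ds` agree up to floating-point rounding, because the reward at step t is collected exactly when N >= t, which happens with probability lam**(t - 1).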

Lemma 4.2
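Under the assumption that the immediate rewards are bounded, the expected return of the random-horizon problem equals the expected discounted sum of the immediate rewards.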

Proof:
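One standard way to argue this is sketched below. It is a reconstruction under assumptions, not necessarily the argument of the original notes: the discount parameter is written $\lambda$, $r_t$ denotes the immediate reward at step $t$ with $|r_t| \le M$, and $P(N = n) = (1-\lambda)\lambda^{n-1}$. For a fixed realization of the process, exchanging the order of summation gives

```latex
\begin{align*}
E_N\Big[\sum_{t=1}^{N} r_t\Big]
  &= \sum_{n=1}^{\infty} (1-\lambda)\lambda^{n-1} \sum_{t=1}^{n} r_t \\
  &= \sum_{t=1}^{\infty} r_t \sum_{n=t}^{\infty} (1-\lambda)\lambda^{n-1} \\
  &= \sum_{t=1}^{\infty} \lambda^{t-1} r_t .
\end{align*}
```

The exchange is justified because the double sum converges absolutely: $\sum_{n=1}^{\infty} (1-\lambda)\lambda^{n-1}\, n M = M/(1-\lambda) < \infty$. Taking the expectation over the process on both sides then yields the claimed equality.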

If we look back at Example 1, we could add to it an additional state that behaves as a 'black hole' (see the figure). Once the system reaches this state it stays there forever, receiving an immediate reward of 0.
The probability of moving into the black-hole state is $1-\lambda$ from any state. All other transition probabilities given in the original example are multiplied by $\lambda$.

The expected sum of the immediate rewards in the new model is equal to the expected discounted sum of the immediate rewards in the original model.
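The black-hole construction can be illustrated with a small sketch. It does not use the actual Example 1. Instead it assumes a hypothetical two-state chain under a fixed policy, with transition matrix `P`, reward vector `r`, and discount parameter `lam`. All names and numbers are illustrative assumptions.

```python
# Sketch: augment a Markov reward chain with an absorbing zero-reward state and
# compare undiscounted returns in the new model with discounted returns in the
# original one. P, r and lam are hypothetical, not the values of Example 1.

def add_black_hole(P, r, lam):
    """Scale the original transitions by lam and add an absorbing state."""
    n = len(P)
    # From every original state: move as before w.p. lam, else fall into the hole.
    P_new = [[lam * p for p in row] + [1.0 - lam] for row in P]
    P_new.append([0.0] * n + [1.0])       # the black hole never leaves itself
    r_new = list(r) + [0.0]               # immediate reward 0 in the black hole
    return P_new, r_new


def mat_vec(P, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def expected_sum(P, r, discount=1.0, steps=2000):
    """Truncated sum_{t>=0} discount**t * P^t r, one entry per start state."""
    v = [0.0] * len(r)
    term = list(r)
    for _ in range(steps):
        v = [a + b for a, b in zip(v, term)]
        term = [discount * x for x in mat_vec(P, term)]
    return v


P = [[0.5, 0.5], [0.2, 0.8]]   # hypothetical transition probabilities
r = [1.0, -1.0]                # hypothetical immediate rewards
lam = 0.9

P_new, r_new = add_black_hole(P, r, lam)
v_new = expected_sum(P_new, r_new)            # undiscounted, new model
v_old = expected_sum(P, r, discount=lam)      # discounted, original model
```

The first two entries of `v_new` match `v_old` up to rounding, and the entry for the black-hole state is 0, which is the equality stated above for this small chain.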

Yishay Mansour
1999-11-18