
## Markovian Policy

We will now show that for every initial state $s$ and every history-dependent policy $\pi$, there exists a Markovian policy $\pi'$ such that the distribution of $(X_t, Y_t)$ is the same under $\pi$ and under $\pi'$.

Theorem 4.3   Let $\pi$ be a history-dependent stochastic policy and let $s$ be an initial state.
Then there exists a Markovian stochastic policy $\pi'$
that satisfies, for every $t \ge 0$, state $s'$ and action $a$,

$$\Pr^{\pi'}[X_t = s', Y_t = a \mid X_0 = s] = \Pr^{\pi}[X_t = s', Y_t = a \mid X_0 = s].$$

Proof: For every $t$, $s'$ and $a$ we define $\pi'$ as follows:

$$\pi'_t(a \mid s') = \Pr^{\pi}[Y_t = a \mid X_t = s', X_0 = s].$$

We will first show that this definition results in the same distribution over the set of actions:

$$\Pr^{\pi'}[Y_t = a \mid X_t = s', X_0 = s] = \pi'_t(a \mid s') = \Pr^{\pi}[Y_t = a \mid X_t = s', X_0 = s].$$

The first equality is derived from the fact that $\pi'$ is Markovian.
The second equality is by the definition of $\pi'$.

It remains to show that the distribution over the set of states is equal under $\pi$ and $\pi'$, i.e.,

$$\Pr^{\pi'}[X_t = s' \mid X_0 = s] = \Pr^{\pi}[X_t = s' \mid X_0 = s].$$

We will prove this part by induction on $t$. The idea behind the proof is that if at a certain step we have an equal distribution over the set of states, and in every state we take the same stochastic action, we end up with the same distribution over the set of states at the next step.
Basis: For $t = 0$ we have $\Pr[X_0 = s \mid X_0 = s] = 1$, both in $\pi$ and in $\pi'$.
Induction step: We assume that the distributions over the set of states in $\pi$ and in $\pi'$ are identical up to time $t-1$. Then

$$\begin{aligned}
\Pr^{\pi}[X_t = s' \mid X_0 = s]
&= \sum_{s''} \sum_{a} \Pr^{\pi}[X_{t-1} = s'' \mid X_0 = s] \, \Pr^{\pi}[Y_{t-1} = a \mid X_{t-1} = s'', X_0 = s] \, P(s' \mid s'', a) \\
&= \sum_{s''} \sum_{a} \Pr^{\pi'}[X_{t-1} = s'' \mid X_0 = s] \, \pi'_{t-1}(a \mid s'') \, P(s' \mid s'', a) \\
&= \Pr^{\pi'}[X_t = s' \mid X_0 = s],
\end{aligned}$$

where $P(s' \mid s'', a)$ denotes the transition probability of the MDP, and the second equality uses the induction hypothesis together with the definition of $\pi'$.
Recalling that

$$\Pr[X_t = s', Y_t = a \mid X_0 = s] = \Pr[X_t = s' \mid X_0 = s] \, \Pr[Y_t = a \mid X_t = s', X_0 = s]$$

for both $\pi$ and $\pi'$ concludes this proof.
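The construction in the proof can be checked numerically. The sketch below is a minimal illustration, not part of the original notes: the two-state MDP, its transition probabilities, and the history-dependent policy (which prefers to repeat its last action) are all invented for this example. It enumerates every trajectory under $\pi$ to obtain the exact joint distribution of $(X_t, Y_t)$, builds $\pi'_t(a \mid s') = \Pr^{\pi}[Y_t = a \mid X_t = s']$ exactly as in the definition above, and verifies that the joint distributions agree at every step.

```python
from collections import defaultdict

# Toy MDP (illustrative numbers): 2 states, 2 actions.
# P[s][a][s2] = transition probability Pr[X_{t+1} = s2 | X_t = s, Y_t = a].
S, A = [0, 1], [0, 1]
P = {0: {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}},
     1: {0: {0: 0.5, 1: 0.5}, 1: {0: 0.2, 1: 0.8}}}
T = 4   # horizon to check
s0 = 0  # initial state

def pi_hist(history, s):
    """History-dependent policy: prefers to repeat the last action taken.
    Non-Markovian: it depends on the history, not only on the current state."""
    if not history:                       # history = [(state, action), ...]
        return {0: 0.5, 1: 0.5}
    last_a = history[-1][1]
    return {last_a: 0.8, 1 - last_a: 0.2}

# Enumerate all trajectories under pi to get the exact joint
# probabilities joint[t][(s, a)] = Pr[X_t = s, Y_t = a | X_0 = s0].
joint = [defaultdict(float) for _ in range(T)]
def walk(t, s, history, prob):
    if t == T:
        return
    act = pi_hist(history, s)
    for a in A:
        pa = prob * act[a]
        joint[t][(s, a)] += pa
        for s2 in S:
            walk(t + 1, s2, history + [(s, a)], pa * P[s][a][s2])
walk(0, s0, [], 1.0)

# Markovian policy from the proof: pi2_t(a | s) = Pr[Y_t = a | X_t = s].
state_marg = [defaultdict(float) for _ in range(T)]
for t in range(T):
    for (s, a), p in joint[t].items():
        state_marg[t][s] += p
pi2 = [{s: {a: joint[t][(s, a)] / state_marg[t][s] for a in A}
        for s in S if state_marg[t][s] > 0} for t in range(T)]

# Forward recursion under the Markovian policy pi2.
joint2 = [defaultdict(float) for _ in range(T)]
dist = {s0: 1.0}
for t in range(T):
    nxt = defaultdict(float)
    for s, p in dist.items():
        for a in A:
            pa = p * pi2[t][s][a]
            joint2[t][(s, a)] += pa
            for s2 in S:
                nxt[s2] += pa * P[s][a][s2]
    dist = nxt

# The joint distributions over (X_t, Y_t) agree at every step.
for t in range(T):
    for key in joint[t]:
        assert abs(joint[t][key] - joint2[t][key]) < 1e-12
print("marginals match at all steps")
```

Note that the recovered $\pi'$ is time-dependent (a different action distribution at each step), which is exactly what the theorem allows: Markovian, but not necessarily stationary.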

Yishay Mansour
1999-11-18