
## Evaluating the average

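Let $V_1,\dots,V_n$ be i.i.d. random variables with expected value $\bar{V}$ and variance $\sigma^2$.
We would like to solve the equation $E[V] = r$ for $r$ (this is a special case of the earlier discussion, in which $g(V,r) = V$). Using a single sample $V_n$ at each step, with step size $\alpha_n \in [0,1]$, we get:

$$r_{n+1} = (1-\alpha_n)\, r_n + \alpha_n V_n .$$

If $\alpha_n = 1/n$, then $r_{n+1} = \frac{1}{n}\sum_{i=1}^{n} V_i$, the sample average.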
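The claim that a step size of $1/n$ reproduces the sample average can be checked with a minimal Python sketch. It assumes the update has the form $r_{n+1} = (1-\alpha_n) r_n + \alpha_n V_n$; the function name `running_average` and the sample values are illustrative and do not come from the notes.

```python
from fractions import Fraction

def running_average(samples):
    """Apply r_{n+1} = (1 - a_n) r_n + a_n V_n with a_n = 1/n for n = 1, 2, ..."""
    r = Fraction(0)  # r_1 is arbitrary: a_1 = 1 overwrites it on the first step
    for n, v in enumerate(samples, start=1):
        alpha = Fraction(1, n)
        r = (1 - alpha) * r + alpha * v
    return r

# Illustrative data (an assumption, not from the notes); exact rational arithmetic.
samples = [Fraction(x) for x in (3, 1, 4, 1, 5, 9, 2, 6)]
estimate = running_average(samples)
sample_mean = sum(samples) / len(samples)
print(estimate, sample_mean)
```

Exact fractions are used so that the two quantities can be compared for equality rather than up to rounding error.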

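We compute the variance of the estimate as follows. Let $e_n = E[(r_n - \bar{V})^2]$, where $\bar{V} = E[V]$ and $\sigma^2 = \mathrm{Var}(V)$. Since $r_{n+1} - \bar{V} = (1-\alpha_n)(r_n - \bar{V}) + \alpha_n (V_n - \bar{V})$ and $V_n$ is independent of $r_n$, the cross term has zero expectation, so

$$e_{n+1} = (1-\alpha_n)^2\, e_n + \alpha_n^2 \sigma^2 .$$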
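One way to sanity-check a variance recursion of the form $e_{n+1} = (1-\alpha_n)^2 e_n + \alpha_n^2 \sigma^2$, where $e_n = E[(r_n - E[V])^2]$ and $\sigma^2 = \mathrm{Var}(V)$, is to compare it against an exact enumeration over all sample sequences. The sketch below does this for a fair coin; the helper names, the step sizes, and the starting point `r1` are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def exact_mse(alphas, r1, values, mean):
    """Exact E[(r_{n+1} - mean)^2] when each V_k is uniform over `values`."""
    n = len(alphas)
    total = Fraction(0)
    for seq in product(values, repeat=n):  # every equally likely sample path
        r = r1
        for alpha, v in zip(alphas, seq):
            r = (1 - alpha) * r + alpha * v
        total += (r - mean) ** 2
    return total / len(values) ** n

def recursive_mse(alphas, r1, mean, var):
    """Iterate e_{k+1} = (1 - a_k)^2 e_k + a_k^2 var from e_1 = (r1 - mean)^2."""
    e = (r1 - mean) ** 2
    for alpha in alphas:
        e = (1 - alpha) ** 2 * e + alpha ** 2 * var
    return e

values = [Fraction(0), Fraction(1)]            # fair coin: mean 1/2, variance 1/4
mean, var = Fraction(1, 2), Fraction(1, 4)
alphas = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5), Fraction(1, 7)]
r1 = Fraction(2)                               # arbitrary deterministic start

print(exact_mse(alphas, r1, values, mean), recursive_mse(alphas, r1, mean, var))
```

The two printed values agree because the cross term between $r_n$ and the fresh sample $V_n$ has zero expectation under independence.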

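We show that if $\sum_n \alpha_n = \infty$ and $\alpha_n \to 0$, then the variance of $r_n$ around $E[V]$ tends to $0$. The argument uses the following general lemma; its application to the variance is shown after the proof.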

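Lemma 7.1   Let $a_t$ be a series such that $0 \le a_t \le 1$, $a_t \to 0$, and $\sum_t a_t = \infty$. Let $e_t$ be a series such that $e_t \ge 0$ and, for some constant $c > 0$: $e_{t+1} \le (1-a_t)\, e_t + c\, a_t^2$. Then $e_t \to 0$.

Proof: Step 1: we show that for every constant $\epsilon > 0$, $e_t < \epsilon$ for an infinite number of $t$'s.
Let us assume (by contradiction) that there exists $T$ such that for all $t > T$, $e_t \ge \epsilon$ and $a_t \le \epsilon/(2c)$ (the latter can be arranged since $a_t \to 0$). In such a case we have

$$e_{t+1} \le e_t - a_t\,(e_t - c\, a_t) \le e_t - \frac{\epsilon}{2}\, a_t .$$

Hence, for any $m > T+1$:

$$e_m \le e_{T+1} - \frac{\epsilon}{2} \sum_{t=T+1}^{m-1} a_t .$$

Since for any $m$ we have $e_m \ge 0$, it has to be the case that $\sum_{t > T} a_t \le 2 e_{T+1}/\epsilon < \infty$, which contradicts the assumption in the lemma.
Therefore, for any $\epsilon > 0$, there exists $T$ such that $e_T < \epsilon$ and $a_t \le \epsilon/(2c)$ for all $t \ge T$.

Step 2: we show that $e_t \le \epsilon$ for all $t \ge T$. If $e_t \le \epsilon$ and $a_t \le \epsilon/(2c)$, then

$$e_{t+1} \le (1-a_t)\,\epsilon + a_t \cdot c\, a_t \le (1-a_t)\,\epsilon + a_t\, \frac{\epsilon}{2} \le \epsilon .$$

Under the hypothesis that $e_t \le \epsilon$ we showed that $e_{t+1} \le \epsilon$, so by induction $e_t \le \epsilon$ for all $t \ge T$. Since $\epsilon$ was arbitrary, this implies that $e_t \to 0$.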
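To illustrate a lemma of this kind numerically, the sketch below iterates a recurrence of the assumed form $e_{t+1} = (1-a_t)\, e_t + c\, a_t^2$ (the bound taken with equality) for $a_t = t^{-1/2}$. This choice satisfies $a_t \to 0$ and $\sum_t a_t = \infty$ even though $\sum_t a_t^2$ diverges. The function names, the constant `c`, and the starting value are hypothetical choices for illustration.

```python
def iterate_bound(e1, c, step, num_steps):
    """Iterate e_{t+1} = (1 - a_t) e_t + c a_t^2 for t = 1..num_steps."""
    e = e1
    for t in range(1, num_steps + 1):
        a = step(t)
        e = (1 - a) * e + c * a * a
    return e

def step(t):
    # a_t = t^(-1/2): tends to 0, its sum diverges, and so does the sum of squares
    return t ** -0.5

for num_steps in (10, 1_000, 100_000):
    print(num_steps, iterate_bound(e1=10.0, c=1.0, step=step, num_steps=num_steps))
```

The printed values shrink as the number of steps grows, roughly tracking $c\, a_t$, which is consistent with $e_t \to 0$ without requiring square-summable step sizes.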

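So the variance can be expressed as:

$$e_{n+1} = (1-\alpha_n)^2\, e_n + \alpha_n^2 \sigma^2 \le (1-\alpha_n)\, e_n + \sigma^2 \alpha_n^2 ,$$

where $e_n$ is the variance of $r_n$ around $E[V]$, $\sigma^2 = \mathrm{Var}(V)$, and the inequality holds for $0 \le \alpha_n \le 1$. We can substitute $a_n = \alpha_n$ and $c = \sigma^2$ to obtain the recurrence required by Lemma 7.1, $e_{n+1} \le (1-a_n)\, e_n + c\, a_n^2$.

Since $\sum_n \alpha_n = \infty$ and $\alpha_n \to 0$, we have that $e_n \to 0$.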
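For the concrete step size $\alpha_n = 1/n$, a recursion of the form $e_{n+1} = (1-\alpha_n)^2 e_n + \alpha_n^2 \sigma^2$ can be iterated exactly. The sketch below, with hypothetical names and an arbitrary variance and starting error, suggests that it reduces to the familiar $\sigma^2/n$ variance of a sample average, which tends to $0$.

```python
from fractions import Fraction

def mse_after(n, var, e1):
    """Iterate e_{k+1} = (1 - 1/k)^2 e_k + var / k^2 for k = 1..n; returns e_{n+1}."""
    e = e1
    for k in range(1, n + 1):
        alpha = Fraction(1, k)
        e = (1 - alpha) ** 2 * e + alpha ** 2 * var
    return e

var = Fraction(4)  # illustrative variance (an assumption)
for n in (1, 10, 100, 1000):
    print(n, mse_after(n, var, e1=Fraction(7)))
```

The starting error `e1` has no effect here because $\alpha_1 = 1$ discards the initial estimate on the first step.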

Remark: the condition $\sum_n \alpha_n^2 < \infty$ will ensure convergence with probability one.

We say that $X_n \to X$ with probability one if and only if $\Pr[\lim_{n\to\infty} X_n = X] = 1$.

Yishay Mansour
1999-12-16