We have two sources, D1(x) and D2(x), that produce different distributions.
We want to compute the expectation of a function F(x) with respect to one source while sampling from
the other source. The expectation of F(x) with respect to a distribution D is the sum, over all values x,
of F(x) multiplied by the probability D(x) that D assigns to x. In our case:
F(x) = k.
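For reference, the standard importance-sampling identity can be written as the following sketch. It is stated for a general F and is not taken from the original notes; it assumes that D1(x) > 0 wherever D2(x) F(x) is nonzero.

```latex
\begin{align*}
E_{D_2}[F(x)] &= \sum_x D_2(x)\, F(x) \\
              &= \sum_x D_1(x)\, \frac{D_2(x)}{D_1(x)}\, F(x) \\
              &= E_{D_1}\!\left[ \frac{D_2(x)}{D_1(x)}\, F(x) \right]
\end{align*}
```

The ratio D2(x)/D1(x) is the importance weight: it corrects for the fact that the samples come from D1 rather than D2.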
- We find the expectation E_{D2}[k] from samples of D1.
- We check the equation by computing
- One of the problems in importance sampling is the variance.
In this case.
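The idea can be illustrated with a minimal Python sketch. The two distributions, the function f(x) = x^2, and all names below are hypothetical choices made for illustration, not values from these notes. The sketch estimates the expectation with respect to D2 using only samples drawn from D1, and also reports the sample variance of the weighted terms.

```python
import random


def exact_expectation(f, d):
    """Sum of f(x) * D(x) over all values x (d maps x to its probability)."""
    return sum(p * f(x) for x, p in d.items())


def importance_sampling_estimate(f, d1, d2, n_samples, rng):
    """Estimate E_{D2}[f(x)] from samples of D1.

    Assumes d1[x] > 0 for every x where d2[x] * f(x) is nonzero.
    Returns the estimate and the sample variance of the weighted terms.
    """
    values = list(d1)
    probs = [d1[x] for x in values]
    samples = rng.choices(values, weights=probs, k=n_samples)
    # Reweight each sample by the ratio D2(x) / D1(x).
    terms = [d2.get(x, 0.0) / d1[x] * f(x) for x in samples]
    mean = sum(terms) / n_samples
    var = sum((t - mean) ** 2 for t in terms) / (n_samples - 1)
    return mean, var


# Hypothetical example distributions over the values 0, 1, 2.
d1 = {0: 0.5, 1: 0.3, 2: 0.2}
d2 = {0: 0.1, 1: 0.3, 2: 0.6}


def f(x):
    return x * x  # hypothetical function of interest


rng = random.Random(0)  # fixed seed so the run is repeatable
estimate, sample_var = importance_sampling_estimate(f, d1, d2, 100_000, rng)
exact = exact_expectation(f, d2)  # 0.1*0 + 0.3*1 + 0.6*4 = 2.7
direct_var = exact_expectation(lambda x: (f(x) - exact) ** 2, d2)

print("importance-sampling estimate:", estimate)
print("exact expectation under D2:  ", exact)
print("variance of weighted terms:  ", sample_var)
print("variance of f under D2:      ", direct_var)
```

In this example the weighted terms have a much larger variance than f(x) has under D2 itself, because D1 rarely produces the value x = 2 that carries most of the weight under D2. This is the kind of variance problem the last point refers to.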