Planning - Policy Evaluation
Discounted infinite horizon (Bellman Eq.)
Rewrite the expectation
Linear system of equations.
$V^{\pi}(s) = E\left[ R(s, \pi(s)) + \gamma V^{\pi}(s') \right]$
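
A minimal sketch of the linear-system view, assuming a finite state space and a discount factor below 1. Writing the expectation as a sum over next states gives $V^{\pi}(s) = R(s, \pi(s)) + \gamma \sum_{s'} P(s' \mid s, \pi(s)) V^{\pi}(s')$, which in matrix form is $(I - \gamma P_{\pi}) V^{\pi} = R_{\pi}$. The names `P_pi`, `R_pi`, and `evaluate_policy`, and the two-state numbers, are illustrative assumptions and do not come from the slide.

```python
import numpy as np


def evaluate_policy(P_pi, R_pi, gamma):
    """Solve (I - gamma * P_pi) V = R_pi for the value of a fixed policy.

    P_pi[s, s2] is the (assumed) probability of moving from s to s2 when
    following the policy; R_pi[s] is the reward R(s, pi(s)).
    """
    P_pi = np.asarray(P_pi, dtype=float)
    R_pi = np.asarray(R_pi, dtype=float)
    n = P_pi.shape[0]
    # For 0 <= gamma < 1 and a row-stochastic P_pi the matrix is invertible,
    # so the system has a unique solution.
    return np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)


# Hypothetical two-state example under a fixed policy.
P_pi = np.array([[0.9, 0.1],
                 [0.2, 0.8]])
R_pi = np.array([1.0, 0.0])
gamma = 0.5

V = evaluate_policy(P_pi, R_pi, gamma)
print(V)
```

The returned vector satisfies the Bellman equation exactly (up to floating-point error): substituting `V` back into `R_pi + gamma * P_pi @ V` reproduces `V`. For large state spaces a direct solve may be too expensive, and repeatedly applying the Bellman update is a common alternative.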