Large Scale MDP - Generative Model.
Algorithm for estimating optimal policy (for discounted infinite
horizon) in time independent of the number of states
[KMN, IJCAI99].
Clearly we can not use matrix representation.
Previous slide
Next slide
Back to first slide
View graphic version