MDP model - summary
- S: set of states, |S|=n.
- A: set of k actions, |A|=k.
- transition function.
- R(s,a): immediate reward function.
- policy.
- discounted cumulative return.
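
The components listed above can be sketched in a few lines of Python. This is a minimal illustration, not the slide author's formulation: the two-state, two-action MDP, the names `P`, `policy`, `rollout`, and `discounted_return`, the discount factor `gamma`, and all numeric values are assumptions chosen for the example.

```python
import random

# Hypothetical 2-state, 2-action MDP; all numbers are illustrative only.
states = [0, 1]    # S, |S| = n = 2
actions = [0, 1]   # A, |A| = k = 2

# Transition function: P[s][a] is a list of (next_state, probability) pairs.
# Here action a always leads to state a (assumed for simplicity).
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}

# Immediate reward function R(s, a).
R = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 0.0, (1, 1): 2.0}

# A deterministic policy: maps each state to an action.
policy = {0: 1, 1: 1}


def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def rollout(start, policy, steps, rng):
    """Follow the policy for a fixed number of steps and collect rewards."""
    s, rewards = start, []
    for _ in range(steps):
        a = policy[s]
        rewards.append(R[(s, a)])
        next_states, probs = zip(*P[s][a])
        # Sample the next state from the transition distribution.
        s = rng.choices(next_states, weights=probs)[0]
    return rewards


rewards = rollout(start=0, policy=policy, steps=3, rng=random.Random(0))
print(rewards, discounted_return(rewards, gamma=0.5))
```

With these assumed values the rollout collects rewards 1, 2, 2, so the discounted return with `gamma=0.5` is 1 + 0.5·2 + 0.25·2 = 2.5. Storing the transition function as a dictionary keeps the sketch readable; for larger n and k an array of shape (n, k, n) would be the more usual choice.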