Algorithms - Policy Evaluation Example
A = {+1, −1}
γ = 1/2
δ(s_i, a) = s_{i+a}
π random
∀a: R(s_i, a) = i
[Figure: state diagram with states s0, s1, s2, s3 and labels 0, 1, 2, 3]
V^π(s0) = 0 + γ (π(s0,+1) V^π(s1) + π(s0,−1) V^π(s3))
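The equation for s0 is one of four coupled equations, one per state. As a minimal sketch, the whole system can be solved by repeated sweeps of the same update. Two things here are assumptions rather than statements from the slide: the random policy is taken to be uniform (probability 1/2 for each action), and state indices wrap around modulo 4, which is inferred from action −1 in s0 leading to s3. The function names are illustrative only.

```python
GAMMA = 0.5          # discount factor from the slide
ACTIONS = (+1, -1)   # action set A
N_STATES = 4         # states s0..s3

def step(i, a):
    """Deterministic transition: index i + a, wrapping mod 4 (assumed)."""
    return (i + a) % N_STATES

def reward(i, a):
    """Reward depends only on the state index: R(s_i, a) = i."""
    return i

def policy(i, a):
    """Uniform random policy (assumed): each action has probability 1/2."""
    return 1.0 / len(ACTIONS)

def evaluate_policy(tol=1e-12, max_sweeps=10_000):
    """Iterative policy evaluation: apply the Bellman update until values settle."""
    v = [0.0] * N_STATES
    for _ in range(max_sweeps):
        new_v = [
            sum(policy(i, a) * (reward(i, a) + GAMMA * v[step(i, a)])
                for a in ACTIONS)
            for i in range(N_STATES)
        ]
        # Stop once the largest change in any state value is below tol.
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v

values = evaluate_policy()
for i, val in enumerate(values):
    print(f"V(s{i}) = {val:.4f}")
```

Under these assumptions the values converge to 5/3, 7/3, 11/3 and 13/3 for s0 through s3, which can be checked by substituting them back into the equation for s0: 0 + 1/2 · (1/2 · 7/3 + 1/2 · 13/3) = 5/3.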