Algorithms - Policy Evaluation Example
A = {+1, -1}
γ = 1/2
δ(s_i, a) = s_{(i+a) mod 4}
π random (each action chosen with probability 1/2)
[Diagram: states s0, s1, s2, s3 arranged in a cycle; each state s_i is labeled with its reward i]
∀a: R(s_i, a) = i
V^π(s0) = 0 + (V^π(s1) + V^π(s3))/4
V^π(s0) = 5/3
V^π(s1) = 7/3
V^π(s2) = 11/3
V^π(s3) = 13/3
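The values above can be checked numerically. The following is a minimal sketch, not taken from the slide: it assumes the four states form a cycle (transitions wrap around modulo 4, as the equation for s0 suggests) and that the random policy picks +1 and -1 with probability 1/2 each. The names `delta`, `reward`, `backup`, and `evaluate_policy` are illustrative choices.

```python
from fractions import Fraction

GAMMA = Fraction(1, 2)   # discount factor from the slide
ACTIONS = (+1, -1)       # action set A
N_STATES = 4             # states s0..s3, assumed to form a cycle


def delta(i, a):
    """Deterministic transition: index of the successor of s_i under a."""
    return (i + a) % N_STATES


def reward(i, a):
    """R(s_i, a) = i for every action a."""
    return i


def backup(values):
    """One synchronous Bellman backup under the uniform random policy."""
    prob = Fraction(1, len(ACTIONS))
    return [
        sum(prob * (reward(i, a) + GAMMA * values[delta(i, a)])
            for a in ACTIONS)
        for i in range(N_STATES)
    ]


def evaluate_policy(sweeps=100):
    """Iterative policy evaluation starting from V = 0."""
    values = [Fraction(0)] * N_STATES
    for _ in range(sweeps):
        values = backup(values)
    return values


# Values stated on the slide, kept as exact fractions.
slide_values = [Fraction(5, 3), Fraction(7, 3), Fraction(11, 3), Fraction(13, 3)]

# The slide values should be a fixed point of the backup.
print(backup(slide_values) == slide_values)

# Repeated backups from zero should approach the same values.
print([round(float(v), 6) for v in evaluate_policy()])
```

Exact fractions are used so that the fixed-point check is an equality test rather than a floating-point comparison; the iterative run only approximates the fixed point, with the error shrinking by a factor of γ per sweep.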