Algorithms - Optimal control Example
A = {+1, -1}
γ = 1/2
δ(s_i, a) = s_{i+a}
π random
[Figure: states s_0, s_1, s_2, s_3 with rewards 0, 1, 2, 3]
R(s_i, a) = i
Q^π(s_0, +1) = 0 + γ V^π(s_1)
Q^π(s_0, +1) = 7/6 (since V^π(s_1) = 7/3)
Q^π(s_0, -1) = 13/6
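
The action values above can be checked numerically. The following is a minimal sketch, not the lecture's own code. It assumes four states whose indices wrap around modulo 4 (so action -1 in s_0 leads to s_3), and a random policy that picks each action with probability 1/2. The names `step`, `reward`, `evaluate_random_policy` and `q_value` are hypothetical helpers introduced for illustration. It solves the linear policy-evaluation equations exactly with rational arithmetic, then reads off the two action values at s_0 and the greedy action.

```python
from fractions import Fraction

N = 4                     # assumed number of states: s_0 .. s_3
ACTIONS = (+1, -1)        # A = {+1, -1}
GAMMA = Fraction(1, 2)    # discount factor


def step(i, a):
    """Deterministic transition s_i -> s_{i+a}; wrap-around is an assumption."""
    return (i + a) % N


def reward(i, a):
    """R(s_i, a) = i, independent of the action."""
    return Fraction(i)


def evaluate_random_policy():
    """Solve V = R + gamma * P V exactly for the uniform random policy."""
    p = Fraction(1, len(ACTIONS))          # probability of each action
    # Augmented matrix for (I - gamma * P) V = R.
    rows = []
    for i in range(N):
        row = [Fraction(0)] * (N + 1)
        row[i] += 1
        for a in ACTIONS:
            row[step(i, a)] -= GAMMA * p
            row[N] += p * reward(i, a)
        rows.append(row)
    # Gauss-Jordan elimination over exact fractions.
    for col in range(N):
        pivot = next(r for r in range(col, N) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pv = rows[col][col]
        rows[col] = [x / pv for x in rows[col]]
        for r in range(N):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][N] for i in range(N)]


def q_value(v, i, a):
    """Q(s_i, a) = R(s_i, a) + gamma * V(delta(s_i, a))."""
    return reward(i, a) + GAMMA * v[step(i, a)]


V = evaluate_random_policy()
Q_plus = q_value(V, 0, +1)
Q_minus = q_value(V, 0, -1)
# Greedy (improved) action at s_0 with respect to the random policy's Q.
best_action = max(ACTIONS, key=lambda a: q_value(V, 0, a))

print("V:", [str(v) for v in V])
print("Q(s_0, +1) =", Q_plus, " Q(s_0, -1) =", Q_minus, " greedy:", best_action)
```

Under these assumptions the solve gives values 5/3, 7/3, 11/3, 13/3 for s_0 through s_3, hence action values 7/6 for +1 and 13/6 for -1 at s_0, so the greedy choice at s_0 is a = -1.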