Function Approximation - basics
$Q(s,a) = f_w(s,a)$
$f_w$: a parametric function class with parameter vector $w$
Adjusting w using gradient descent:
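$$ w := w + \alpha \left[ r_{t+1} + \gamma Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right] \nabla_w f_w(s_t, a_t) $$
sampled value: $r_{t+1} + \gamma Q(s_{t+1}, a_{t+1})$
estimated value: $Q(s_t, a_t)$
gradient: $\nabla_w f_w(s_t, a_t)$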
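
The update rule above can be illustrated with a minimal sketch. It assumes a linear function class, f_w(s,a) = w · phi(s,a), with a one-hot feature map; the slide does not specify the function class, so this choice, the helper names (features, q_value, sarsa_update), and the step-size and discount values are all hypothetical. For a linear f_w, the gradient with respect to w is simply phi(s,a).

```python
def features(state, action, n_states=3, n_actions=2):
    """One-hot feature vector phi(s, a). Hypothetical encoding, not from the slide."""
    phi = [0.0] * (n_states * n_actions)
    phi[state * n_actions + action] = 1.0
    return phi


def q_value(w, state, action):
    """Assumed linear function class: f_w(s, a) = w . phi(s, a)."""
    return sum(wi * xi for wi, xi in zip(w, features(state, action)))


def sarsa_update(w, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """Return the weights after one gradient step of the update rule."""
    sampled = r + gamma * q_value(w, s_next, a_next)  # sampled value
    estimated = q_value(w, s, a)                      # estimated value
    td_error = sampled - estimated
    grad = features(s, a)  # gradient of a linear f_w with respect to w
    return [wi + alpha * td_error * gi for wi, gi in zip(w, grad)]


# One observed transition: (s=0, a=1) -> reward 1.0 -> (s'=2, a'=0)
w = [0.0] * 6
w = sarsa_update(w, s=0, a=1, r=1.0, s_next=2, a_next=0)
print(w)
```

With one-hot features, only the weight for the visited (s, a) pair moves, so this reduces to the tabular update. A richer feature map would let one step generalize across similar state-action pairs.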