Solving the Least-Squares Problem

Let $\tilde{S}$ be a set of representative states, let $M(s)$ be the number of samples collected for a state $s \in \tilde{S}$, and let $C(s,m)$ denote the $m$th such sample. The parameter vector $r$ is obtained by solving the following optimisation problem.

$$\min_{r}\sum_{s \in \tilde{S}}\sum_{m=1}^{M(s)}\left(\tilde{V}(s,r) - C(s,m)\right)^{2}$$
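To make the objective concrete, here is a minimal sketch that evaluates the sum of squared errors for a given $r$. It assumes a linear approximator $\tilde{V}(s,r) = \phi(s)\cdot r$, which the notes do not specify; the feature map `features`, the sample table `samples`, and all numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def v_tilde(phi: Sequence[float], r: Sequence[float]) -> float:
    """Approximate value V~(s, r) = phi(s) . r (linear form is an assumption)."""
    return sum(p * w for p, w in zip(phi, r))


def least_squares_objective(
    r: Sequence[float],
    features: Dict[str, List[float]],
    samples: Dict[str, List[float]],
) -> float:
    """Sum over states s and samples m of (V~(s, r) - C(s, m))^2."""
    total = 0.0
    for s, costs in samples.items():
        prediction = v_tilde(features[s], r)
        for c in costs:  # c plays the role of C(s, m), m = 1..M(s)
            total += (prediction - c) ** 2
    return total


# Hypothetical data: one-hot features, i.e. a lookup-table representation.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
samples = {"a": [1.0, 3.0], "b": [4.0]}

mean_r = [2.0, 4.0]   # per-state sample means
other_r = [1.0, 4.0]  # any other choice for state "a"

obj_mean = least_squares_objective(mean_r, features, samples)
obj_other = least_squares_objective(other_r, features, samples)
print(obj_mean, obj_other)
```

With one-hot features each state has its own parameter, so the minimiser is simply the per-state mean of the samples; the sketch shows that moving away from the mean increases the objective.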

The solution can be obtained by an incremental algorithm that takes steps in the direction of the negative gradient. For a given run $(s_1, a_1, \ldots, s_n)$, the update is:

$$\vec{r} = \vec{r} - \alpha\sum_{k=0}^{|\tilde{S}|}\nabla_{r}\tilde{V}(s,r)\left(\tilde{V}(s,r) - C(s,k)\right)$$
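The update can be sketched in code as follows. This is an illustration under stated assumptions, not the notes' own implementation: it reads the sum as running over the states visited in one run, each paired with one observed sample, and it assumes a linear approximator $\tilde{V}(s,r) = \phi(s)\cdot r$, so that $\nabla_{r}\tilde{V}(s,r) = \phi(s)$. The function name `gradient_step`, the run data, and the step size are hypothetical. The factor 2 from differentiating the square is absorbed into $\alpha$, as in the update above.

```python
from typing import List, Sequence, Tuple


def gradient_step(
    r: Sequence[float],
    run: Sequence[Tuple[Sequence[float], float]],
    alpha: float,
) -> List[float]:
    """One update r <- r - alpha * sum_k grad V~(s_k, r) * (V~(s_k, r) - C_k).

    `run` holds (phi(s_k), C_k) pairs for the states visited in a run.
    Assumes V~(s, r) = phi(s) . r, so the gradient w.r.t. r is phi(s).
    """
    direction = [0.0] * len(r)
    for phi, cost in run:
        error = sum(p * w for p, w in zip(phi, r)) - cost
        for i, p in enumerate(phi):
            direction[i] += p * error
    return [w - alpha * d for w, d in zip(r, direction)]


# Hypothetical run visiting two states, with one-hot features.
run = [([1.0, 0.0], 2.0), ([0.0, 1.0], 4.0)]

first = gradient_step([0.0, 0.0], run, alpha=0.5)

# Repeating the step on the same run drives r toward the samples.
r = [0.0, 0.0]
for _ in range(50):
    r = gradient_step(r, run, alpha=0.5)
print(first, r)
```

In this toy setting the error in each coordinate halves at every step, so $r$ approaches $(2, 4)$. In practice the step size $\alpha$ would typically be decreased over successive runs rather than held fixed.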

Yishay Mansour
2000-01-11