
   
Putting It All Together

We have shown:

\begin{displaymath}\left\Vert \widehat{Q}_l - Q^*\right\Vert \leq \left\Vert \widehat{Q}_l - Q_l \right\Vert + \left\Vert Q_l - Q^* \right\Vert
\end{displaymath} (13)

Hence we can summarize all of the above by:

\begin{eqnarray*}\lefteqn{ \Pr( \left\Vert\widehat{Q}_l - Q^*\right\Vert \geq
...
...cdot \vert S\vert \cdot \vert A\vert \cdot 2e^{-2{m_D}{w^2}}\\
\end{eqnarray*}
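To get a feel for the size of the final expression, the following is a minimal numerical sketch, not part of the original derivation. It evaluates a union bound of the form (count) \cdot \vert S\vert \cdot \vert A\vert \cdot 2e^{-2{m_D}{w^2}}. The leading factor is truncated in the display above, so `num_phases` here is an assumed stand-in for it; the function names and the assumption that each sampled quantity lies in an interval of length 1 (so that the Hoeffding tail is 2e^{-2{m_D}{w^2}}) are likewise illustrative.

```python
import math


def hoeffding_tail(m_d: int, w: float) -> float:
    """Two-sided Hoeffding tail 2*exp(-2*m_d*w^2).

    Assumes (hypothetically) that each of the m_d samples lies in an
    interval of length 1 and that w is the allowed deviation of the
    empirical mean from its expectation.
    """
    return 2.0 * math.exp(-2.0 * m_d * w * w)


def union_failure_bound(num_phases: int, num_states: int,
                        num_actions: int, m_d: int, w: float) -> float:
    """Union bound over every (phase, state, action) triple.

    num_phases is an assumed name for the leading factor that is
    truncated in the source equation. The result is capped at 1 because
    it bounds a probability.
    """
    events = num_phases * num_states * num_actions
    return min(1.0, events * hoeffding_tail(m_d, w))


if __name__ == "__main__":
    # Illustrative values only: the bound shrinks quickly as m_d grows.
    for m_d in (100, 500, 1000, 2000):
        print(m_d, union_failure_bound(10, 5, 2, m_d, 0.1))
```

Because the exponent is linear in m_D, while the number of events enters only as a multiplicative factor, a modest increase in the number of samples per phase compensates for a large increase in \vert S\vert \cdot \vert A\vert.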




Yishay Mansour
2000-05-30