Q-learning and SARSA algorithms
In this section we discuss off-line and on-line algorithms for computing
the optimal policy when the exact model is not known.
Q-learning
remarks:
SARSA
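As a rough orientation for the two subsections listed above, the following is a minimal sketch of the standard tabular update rules, not necessarily the exact formulation used in these notes. The function names, the dictionary-based table `Q`, the step size `alpha`, and the discount factor `gamma` are assumptions introduced here for illustration only. The one difference between the two rules is the target: Q-learning bootstraps from the best action in the next state, while SARSA bootstraps from the action that was actually taken there.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (hypothetical helper).

    The target uses the maximum estimated value over all actions in the
    next state, regardless of which action the agent goes on to take.
    """
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One tabular SARSA step (hypothetical helper).

    The target uses the value of the action a_next that the agent
    actually selected in the next state.
    """
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]


# Usage: an action-value table with unseen entries defaulting to 0.0.
Q = defaultdict(float)
Q[(1, 0)] = 1.0
Q[(1, 1)] = 2.0
q_value = q_learning_update(Q, 0, 0, 1.0, 1, [0, 1], alpha=0.5, gamma=0.5)
s_value = sarsa_update(Q, 0, 1, 1.0, 1, 0, alpha=0.5, gamma=0.5)
```

Neither rule needs the transition probabilities or the reward function; both work only from observed transitions (state, action, reward, next state), which is what makes them usable when the exact model is not known.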
Yishay Mansour
2000-01-07