
   
Introduction

In this project I studied the article "Finite-Sample Convergence Rates for Q-learning and Indirect Algorithms" by Michael Kearns and Satinder Singh. The article discusses how much experience two learning algorithms need in order to produce a policy with a given performance guarantee: Phased-Q-Learning (a variant of the familiar Q-Learning) and the indirect algorithm (both are explained later on). It also compares the behaviour of the direct algorithm (Phased-Q-Learning) with that of the indirect algorithm. The article presents a theorem but does not prove it, so the heart of my project is a proof of that theorem.

Yishay Mansour
2000-05-30