Comparing TD and MC
Monte Carlo update: R_t - V(s), where R_t is the return from s.
Temporal Difference update: r_t + γV(s') - V(s)
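A minimal sketch of the two updates in tabular form may make the contrast concrete. It rests on assumptions not stated in the slide: a step size alpha, a discount factor gamma, a value table stored as a dict, and an episode given as (state, reward, next_state) triples with next_state set to None at termination. The function names are hypothetical.

```python
def mc_update(V, episode, alpha=0.1, gamma=1.0):
    """Every-visit Monte Carlo: move V(s) toward the observed return R_t."""
    ret = 0.0
    # Walk the episode backwards so the return accumulates as
    # R_t = r_t + gamma * R_{t+1}.
    for state, reward, _next_state in reversed(episode):
        ret = reward + gamma * ret
        V[state] += alpha * (ret - V[state])
    return V


def td0_update(V, episode, alpha=0.1, gamma=1.0):
    """One-step TD: move V(s) toward r_t + gamma * V(s')."""
    for state, reward, next_state in episode:
        # Assumption: the value of a terminal state is 0.
        v_next = 0.0 if next_state is None else V[next_state]
        V[state] += alpha * (reward + gamma * v_next - V[state])
    return V


if __name__ == "__main__":
    episode = [("A", 0.0, "B"), ("B", 1.0, None)]
    print(mc_update({"A": 0.0, "B": 0.0}, episode, alpha=0.5))
    print(td0_update({"A": 0.0, "B": 0.0}, episode, alpha=0.5))
```

In this sketch MC waits for the full return before changing V(A), while TD bootstraps from the current estimate V(B), so after a single episode the two tables differ.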
Common Belief in RL: TD is superior to MC.
CHALLENGE: Give a formal justification.
TD(λ) - a family of algorithms between TD (λ = 0) and MC (λ = 1).
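One way to see the family is through the target each member uses. The sketch below computes the forward-view lambda-return with the standard recursion G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}). The function name, the list-based inputs, and the convention that the terminal state has value 0 are assumptions made for illustration.

```python
def lambda_returns(rewards, values, lam, gamma=1.0):
    """Forward-view lambda-returns for one episode.

    rewards[t] is r_t and values[t] is the current estimate V(s_t),
    for t = 0 .. T-1. The terminal state is assumed to have value 0.
    """
    T = len(rewards)
    returns = [0.0] * T
    g_next = 0.0  # lambda-return from the terminal state
    v_next = 0.0  # value estimate of the terminal state
    for t in reversed(range(T)):
        # Blend the bootstrapped value and the longer return by lam.
        returns[t] = rewards[t] + gamma * ((1 - lam) * v_next + lam * g_next)
        g_next = returns[t]
        v_next = values[t]
    return returns


if __name__ == "__main__":
    rewards = [0.0, 0.0, 1.0]
    values = [0.2, 0.4, 0.6]
    for lam in (0.0, 0.5, 1.0):
        print(lam, lambda_returns(rewards, values, lam))
```

With lam = 0 the target reduces to the one-step TD target r_t + gamma * V(s'), and with lam = 1 it reduces to the Monte Carlo return R_t; intermediate values mix the two.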