Algorithms - optimal control

performs action a at state s is better than p.
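The fragment above reads like a policy-improvement step: take a policy p, change it to perform action a at state s, and compare the result with p. The sketch below illustrates that comparison under explicit assumptions, none of which come from the original notes: a made-up two-state deterministic problem, a discount factor of 0.9, and "better" taken to mean "value at least as high in every state". The names `evaluate`, `q_value`, and `switch_action` are hypothetical helpers.

```python
# Hypothetical two-state, two-action deterministic problem (an assumption,
# not taken from the notes). NEXT_STATE[s][a] and REWARD[s][a] give the
# successor state and the immediate reward for action a in state s.
GAMMA = 0.9  # assumed discount factor
NEXT_STATE = [[0, 1], [1, 0]]
REWARD = [[0.0, 1.0], [2.0, 0.0]]


def evaluate(policy, sweeps=1000):
    """Approximate the value of each state under a fixed policy by
    repeatedly applying the Bellman expectation backup."""
    v = [0.0] * len(policy)
    for _ in range(sweeps):
        v = [
            REWARD[s][policy[s]] + GAMMA * v[NEXT_STATE[s][policy[s]]]
            for s in range(len(policy))
        ]
    return v


def q_value(v, s, a):
    """Value of taking action a once in state s, then following the
    policy whose state values are v."""
    return REWARD[s][a] + GAMMA * v[NEXT_STATE[s][a]]


def switch_action(policy, s, a):
    """Return a copy of the policy that performs action a at state s
    and is unchanged everywhere else."""
    new_policy = list(policy)
    new_policy[s] = a
    return new_policy


p = [0, 0]                 # policy p: always take action 0
v_p = evaluate(p)

s, a = 0, 1                # candidate change: perform action a at state s
p_new = switch_action(p, s, a)
v_new = evaluate(p_new)

# If acting with a at s and then following p beats following p from s,
# the modified policy should be at least as good as p in every state.
improves_at_s = q_value(v_p, s, a) > v_p[s]
at_least_as_good = all(v_new[i] >= v_p[i] - 1e-9 for i in range(len(p)))
print(improves_at_s, at_least_as_good)
```

In this toy problem the one-step check at state s succeeds, and the modified policy is no worse than p anywhere. Whether the notes intended exactly this condition cannot be confirmed from the surviving text.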