
States and actions

Let $S$ be the set of states of the system. For each state $s\in{S}$ we define $A_{s}$ to be the set of actions that the agent may perform in state $s$. For simplicity, we will assume that the state set $S$ and the action sets $A_{s}$ are discrete and do not depend on time. At any given state, an action can be chosen either deterministically (using a function mapping states to actions) or stochastically (at random). When choosing an action stochastically in state $s$, we define a distribution $q$ over the actions, assigning a probability $q(a)$ to every ${a}\in{A_{s}}$, so that $q(a)\geq 0$ and $\sum_{a\in A_{s}} q(a)=1$.
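To make the two ways of choosing an action concrete, here is a minimal Python sketch. The two-state system, the action names, and the helper functions (`deterministic_policy`, `stochastic_policy`, `is_valid_distribution`) are all hypothetical and chosen only for illustration; they do not come from the notes themselves.

```python
import random

# Hypothetical toy system (illustrative names, not from the notes).
STATES = ["s0", "s1"]                       # the state set S
ACTIONS = {"s0": ["left", "right"],         # A_s for each state s
           "s1": ["stay"]}

def deterministic_policy(s):
    """Deterministic choice: a fixed function from states to actions.

    Here we arbitrarily pick the first action listed for the state.
    """
    return ACTIONS[s][0]

# Stochastic choice: for each state s, a distribution q over A_s.
Q = {"s0": {"left": 0.25, "right": 0.75},
     "s1": {"stay": 1.0}}

def is_valid_distribution(s, q, tol=1e-9):
    """Check that q assigns a non-negative probability to every action
    in A_s (and to no others) and that the probabilities sum to 1."""
    same_support = set(q) == set(ACTIONS[s])
    non_negative = all(p >= 0 for p in q.values())
    sums_to_one = abs(sum(q.values()) - 1.0) <= tol
    return same_support and non_negative and sums_to_one

def stochastic_policy(s, rng=random):
    """Sample an action a from A_s with probability q(a)."""
    q = Q[s]
    actions = list(q)
    weights = [q[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

For example, `deterministic_policy("s0")` always returns the same action, while repeated calls to `stochastic_policy("s0")` return `"left"` or `"right"` at random according to the assumed probabilities in `Q`.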

Yishay Mansour