Single Agent Problems

Markov chains let us reason about the behavior of "dynamical" systems in which no decisions take place.

Markov decision processes add a single agent, which chooses actions so as to maximize its expected reward.
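
To make the contrast concrete, here is a minimal sketch in Python. Everything in it is a made-up illustration rather than part of the formal definition: the two-state system, the action names "stay" and "move", the numbers, and the helper names `step_distribution`, `expected_next_reward`, and `greedy_action`. It also simplifies by looking only one step ahead, whereas an agent in a Markov decision process is generally concerned with reward over the long run.

```python
# Hypothetical two-state example; all numbers are invented for illustration.

# Markov chain: a single fixed transition matrix and no decisions.
# chain[s][t] is the probability of moving from state s to state t.
chain = [[0.9, 0.1],
         [0.5, 0.5]]


def step_distribution(dist, matrix):
    """Push a probability distribution over states through one transition."""
    n = len(matrix)
    return [sum(dist[s] * matrix[s][t] for s in range(n)) for t in range(n)]


# Markov decision process: one transition matrix per action, plus rewards.
mdp = {
    "stay": [[1.0, 0.0], [0.0, 1.0]],
    "move": [[0.2, 0.8], [0.8, 0.2]],
}
reward = [0.0, 1.0]  # assumed reward for arriving in each state


def expected_next_reward(state, action):
    """Expected reward of the next state when taking `action` in `state`."""
    return sum(p * r for p, r in zip(mdp[action][state], reward))


def greedy_action(state):
    """One-step lookahead only: pick the action with the best expected reward."""
    return max(mdp, key=lambda a: expected_next_reward(state, a))
```

In the chain, the system simply evolves: starting in state 0, `step_distribution([1.0, 0.0], chain)` gives the distribution over states after one step. In the decision process, the agent's choice determines which transition matrix applies, so `greedy_action(0)` selects "move" and `greedy_action(1)` selects "stay" under the rewards assumed here.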

A further generalization is a zero-sum Markov game, in which there are two agents: one tries to maximize reward while the other tries to minimize it. We won't be studying that case explicitly in this class.
