Markov chains let us reason about the behavior of ``dynamical'' systems where no decisions take place.
Markov decision processes have a single agent, which tries to maximize its expected reward.
A further generalization is a zero-sum Markov game, in which there are two agents: one tries to maximize the reward while the other tries to minimize it. We won't be studying that case explicitly in this class.
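
As a rough sketch of how these three models differ, they can be written side by side. The notation here is assumed for illustration and is not fixed by the text above: states $s_t$, actions $a_t$ and $b_t$ for the two agents, a transition distribution $P$, a reward function $R$, a finite horizon $T$, and policies $\pi$ and $\sigma$. The undiscounted finite-horizon sum is only one possible choice of objective. A Markov chain specifies only how the state evolves:
\[
  s_{t+1} \sim P(\cdot \mid s_t).
\]
An MDP lets a single agent's action influence the transition, and that agent seeks a policy maximizing its expected total reward:
\[
  s_{t+1} \sim P(\cdot \mid s_t, a_t),
  \qquad
  \max_{\pi} \; \mathrm{E}_{\pi}\left[ \sum_{t=0}^{T} R(s_t, a_t) \right].
\]
In one common formulation of a zero-sum Markov game, both agents' actions influence the transition, and the same reward is maximized by one agent and minimized by the other:
\[
  s_{t+1} \sim P(\cdot \mid s_t, a_t, b_t),
  \qquad
  \max_{\pi} \min_{\sigma} \; \mathrm{E}_{\pi,\sigma}\left[ \sum_{t=0}^{T} R(s_t, a_t, b_t) \right].
\]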