| Date | Topic | Homework | Slides | Supplemental Material |
| --- | --- | --- | --- | --- |
| 01/07/26 | Introduction | Read chapter 1 | intro | SB chapter 1 |
| 01/12/26 | Probability and Simple Decisions | Read chapters 2 and 6 | probabilitySimpleDecisions | Foundations of Computer Science Sections 4.9-4.12 |
| 02/14/26 | Simple Decisions | | | |
| 02/21/26 | Algorithms for MDPs | Read chapter 7 | MDPs | SB chapters 3 and 4 |
| | Approximations and Search | Read chapters 8 and 9 | | Stable Function Approximation in Dynamic Programming |
| | Model Free RL | Read chapter 17 | | SB chapter 6 |
| | Advanced Model Free RL | Read Human Level Control Through Deep Reinforcement Learning | | SB chapter 9; David Silver's Slides |
| | Bandits | Read chapter 15; HW2 assigned, due 3/12/24 | | SB chapter 2; Introduction to Multi-Armed Bandits by Aleksandrs Slivkins |
| | Model Based Reinforcement Learning | Read chapter 16 | | |
| | Sarsa, Lambda | Read chapters 10 and 11 | | SB chapters 7 and 12 |
| | Policy Search | | | |
| | Policy Gradient | Read chapters 12 and 13 | | |
| 03/14/24 | Finish Policy Gradient, Review Linear Programs | | | |
| 03/19/24 | Learning From Demonstration | Read chapter 18 | | |
| | Reproducibility, Shaping and Catch up | Read chapter 17.5; Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping | | Potential Shaping and Q-value Initialization are Equivalent; Measuring the Reliability of Reinforcement Learning Algorithms |
| | Hidden Markov Models and Particle Filters | Read chapter 19 | | |
| | POMDP basics | Read chapters 19 and 20 | | POMDPs for Dummies |
| | POMDPs (approximate solutions) | Read chapters 21, 22 and 23 | | |
| | Matrix Games | Read chapter 24 | | |
| | Markov Games | Read chapters 26 and 27 | | |
| | Catch up, extra topics (abstraction, hierarchy, etc.), projects | | | |