Overview
The course will focus on the principles and practice of decision making for autonomous agents, and robots in particular. We will cover rationality, decision theory, probabilistic reasoning, multi-arm bandits, Markov decision processes, partially observable MDPs, belief-space learning and planning, inverse reinforcement learning, and learning from demonstration.
Instructor
George Konidaris
Office: LSRC D224
Email: gdk at cs dot duke dot edu
Prerequisites
There are no formal prerequisites. However, this is a graduate class, so it assumes that you are familiar with the necessary mathematics (probability, linear algebra, and multivariable calculus) and have enough background in AI to make a good attempt at reading research papers on these topics.
Schedule
The first class is on August 25th. The class meets on Tuesdays and Thursdays from 3:05pm to 4:20pm in Allen 318.
Date | Topic | Slides and Readings |
August 25th | Introduction: Agents, Robots, Models, and Rationality | Slides |
August 27th | Probabilistic Reasoning | Slides |
September 1st | Utility Theory | Slides |
September 3rd | Multi-arm Bandits | Slides |
September 8th | Contextual Bandits | Slides; Li et al.: Contextual Bandits for News Article Recommendations |
September 10th | Markov Decision Processes | Slides |
September 15th | Reinforcement Learning | Slides |
September 17th | No class | Sutton and Barto, Chapters 3-8 |
September 22nd | Reinforcement Learning II | Slides |
September 24th | Reinforcement Learning III | Slides |
September 29th | Reinforcement Learning III (Policy Search) | Slides |
October 1st | Hierarchical RL | Slides; Sutton, Precup, and Singh, 1999 |
October 6th | No class | |
October 8th | No class (Fall break) | |
October 13th | No class (Fall break) | |
October 15th | Review Session | |
October 20th | Midterm (ROOM 311, NORTH BUILDING) | |
October 22nd | Hierarchical RL (resumed) | Assignment 1 due (before class); Slides |
October 27th | Learning from Demonstration | Slides |
October 29th | No class | |
November 3rd | Inverse Reinforcement Learning | Slides; Abbeel and Ng, Inverse RL, ICML 2004. |
November 5th | Partially observable MDPs | Slides |
November 10th | Kalman Filters | Slides; An Introduction to the Kalman Filter, Welch and Bishop |
November 12th | Belief-Space Planning | Slides |
November 17th | Solution Methods for POMDPs | Slides |
November 19th | Revision | |
November 24th | Final day of graduate classes; no class (Thanksgiving) | |
Assignments
Academic Honesty
- We take academic honesty very seriously. This matrix should leave no ambiguity about what is and is not permitted. If you are unsure whether something is permitted, ask before you proceed.
Lateness policy
- You may request an extension before the due date of the assignment. Valid reasons for extensions include (but are not necessarily limited to) interviews, travel for research or academic purposes, and illness.
- Late assignments (without a previously granted extension) will be penalized 10% per day. Assignments will not be accepted more than 5 days after the due date.
Grading
Course evaluation will be as follows:
- Assignment 1 (25%)
- Midterm exam (25%)
- Assignment 2 (25%)
- Final exam (25%)
I expect all Duke students to conduct themselves with the highest integrity, according to the Duke Community Standard. If you are unsure what this means, please refer to this link.
For a more concrete description, this matrix outlines what forms of collaboration with others are and are not allowed during this course.
Resources
A very good introduction to the fundamentals of probability theory:
- Introduction to Probability, Bertsekas and Tsitsiklis.
A useful guide to utility theory and uncertainty in MDPs:
- Decision Making Under Uncertainty: Theory and Application, Kochenderfer.
A great introduction to reinforcement learning (chapter 2 is on bandits):
- Reinforcement Learning: An Introduction, Sutton and Barto.