Calendar

| LEC # | TOPICS | KEY DATES |
| --- | --- | --- |
| 1 | Markov Decision Processes; Finite-Horizon Problems: Backwards Induction; Discounted-Cost Problems: Cost-to-Go Function, Bellman's Equation | |
| 2 | Value Iteration; Existence and Uniqueness of Bellman's Equation Solution; Gauss-Seidel Value Iteration | |
| 3 | Optimality of Policies Derived from the Cost-to-Go Function; Policy Iteration; Asynchronous Policy Iteration | Problem set 1 out |
| 4 | Average-Cost Problems; Relationship with Discounted-Cost Problems; Bellman's Equation; Blackwell Optimality | Problem set 1 due |
| 5 | Average-Cost Problems; Computational Methods | |
| 6 | Application of Value Iteration to Optimization of Multiclass Queueing Networks; Introduction to Simulation-based Methods; Real-Time Value Iteration | Problem set 2 out |
| 7 | Q-Learning; Stochastic Approximations | |
| 8 | Stochastic Approximations: Lyapunov Function Analysis; The ODE Method; Convergence of Q-Learning | |
| 9 | Exploration versus Exploitation: The Complexity of Reinforcement Learning | |
| 10 | Introduction to Value Function Approximation; Curse of Dimensionality; Approximation Architectures | |
| 11 | Model Selection and Complexity | Problem set 3 out |
| 12 | Introduction to Value Function Approximation Algorithms; Performance Bounds | |
| 13 | Temporal-Difference Learning with Value Function Approximation | |
| 14 | Temporal-Difference Learning with Value Function Approximation (cont.) | |
| 15 | Temporal-Difference Learning with Value Function Approximation (cont.); Optimal Stopping Problems; General Control Problems | |
| 16 | Approximate Linear Programming | Problem set 4 out |
| 17 | Approximate Linear Programming (cont.) | |
| 18 | Efficient Solutions for Approximate Linear Programming | |
| 19 | Efficient Solutions for Approximate Linear Programming: Factored MDPs | |
| 20 | Policy Search Methods | Problem set 5 out |
| 21 | Policy Search Methods (cont.) | |
| 22 | Policy Search Methods for POMDPs; Application: Call Admission Control; Actor-Critic Methods | |
| 23 | Guest Lecture: Prof. Nick Roy; Approximate POMDP Compression | |
| 24 | Policy Search Methods: PEGASUS; Application: Helicopter Control | |