Lecture 1
Markov Decision Processes
Finite-Horizon Problems: Backwards Induction
Discounted-Cost Problems: Cost-to-Go Function, Bellman's Equation

Lecture 2
Value Iteration
Existence and Uniqueness of Bellman's Equation Solution
Gauss-Seidel Value Iteration
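Value iteration fits in a few lines of code. The sketch below is illustrative only: the discount factor, transition matrices, and stage costs are invented for this example. Because the Bellman operator is an alpha-contraction, the iterates converge to the unique solution of Bellman's equation.

```python
import numpy as np

def value_iteration(P, g, alpha=0.9, tol=1e-8, max_iter=10_000):
    """Repeatedly apply the Bellman operator J <- min_a (g_a + alpha * P_a J).

    P[a] is the transition matrix under action a, g[a] the stage-cost vector,
    and alpha < 1 the discount factor, so the operator is a contraction and
    the iterates converge to the unique fixed point.
    """
    J = np.zeros(P[0].shape[0])
    for _ in range(max_iter):
        # One Bellman backup: minimize expected cost-to-go over actions.
        J_new = np.min([g[a] + alpha * P[a] @ J for a in range(len(P))], axis=0)
        if np.max(np.abs(J_new - J)) < tol:
            return J_new
        J = J_new
    return J

# Two-state, two-action toy problem (numbers made up for illustration).
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.7, 0.3]])]
g = [np.array([1.0, 2.0]), np.array([1.5, 0.5])]
J_star = value_iteration(P, g)
```

The stopping rule bounds the distance to the fixed point by tol·alpha/(1 − alpha), so tightening `tol` tightens the final accuracy proportionally.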

Lecture 3
Optimality of Policies Derived from the Cost-to-Go Function
Policy Iteration
Asynchronous Policy Iteration
Problem set 1 out
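Policy iteration alternates exact policy evaluation with greedy improvement, and terminates in finitely many steps since there are finitely many policies. A minimal sketch, with toy transition matrices and costs invented for illustration:

```python
import numpy as np

def policy_iteration(P, g, alpha=0.9):
    """Alternate exact policy evaluation with greedy policy improvement.

    Evaluation solves (I - alpha * P_mu) J = g_mu for the current policy mu;
    improvement takes the greedy policy with respect to the resulting
    cost-to-go J. Stops when the policy no longer changes.
    """
    n, num_a = P[0].shape[0], len(P)
    mu = np.zeros(n, dtype=int)          # start from an arbitrary policy
    while True:
        # Policy evaluation: J_mu = (I - alpha * P_mu)^(-1) g_mu
        P_mu = np.array([P[mu[s]][s] for s in range(n)])
        g_mu = np.array([g[mu[s]][s] for s in range(n)])
        J = np.linalg.solve(np.eye(n) - alpha * P_mu, g_mu)
        # Policy improvement: act greedily with respect to J
        Q = np.array([g[a] + alpha * P[a] @ J for a in range(num_a)])
        mu_new = np.argmin(Q, axis=0)
        if np.array_equal(mu_new, mu):
            return mu, J
        mu = mu_new

# Two-state, two-action toy problem (numbers made up for illustration).
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.7, 0.3]])]
g = [np.array([1.0, 2.0]), np.array([1.5, 0.5])]
mu_star, J_star = policy_iteration(P, g)
```

Each iteration solves a linear system rather than iterating to convergence, which is why policy iteration typically needs far fewer (but more expensive) iterations than value iteration.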
Lecture 4
Average-Cost Problems
Relationship with Discounted-Cost Problems
Bellman's Equation
Blackwell Optimality
Problem set 1 due
Lecture 5
Average-Cost Problems
Computational Methods

Lecture 6
Application of Value Iteration to Optimization of Multiclass Queueing Networks
Introduction to Simulation-based Methods
Real-Time Value Iteration
Problem set 2 out
Lecture 7
Q-Learning
Stochastic Approximations
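Q-learning estimates the optimal Q-factors directly from simulated transitions, using a stochastic-approximation step size in place of the expectation in Bellman's equation. A minimal tabular sketch on an invented two-state, two-action MDP (all costs and transitions are made up for illustration):

```python
import random

random.seed(0)

# Toy MDP: action 0 stays in the current state, action 1 switches states.
# State 0 is cheap to occupy; state 1 is costly, so the optimal policy
# pays the one-time switching cost to escape it.
g = [[0.0, 1.0], [2.0, 1.0]]    # stage cost g[s][a]
nxt = [[0, 1], [1, 0]]          # deterministic successor state
alpha = 0.9                     # discount factor
Q = [[0.0, 0.0], [0.0, 0.0]]    # Q-factor estimates
visits = [[0, 0], [0, 0]]       # per-pair visit counts for the step size

s = 0
for _ in range(20_000):
    # Epsilon-greedy exploration keeps every (state, action) pair visited.
    if random.random() < 0.2:
        a = random.randrange(2)
    else:
        a = min(range(2), key=lambda x: Q[s][x])
    s_next = nxt[s][a]
    visits[s][a] += 1
    step = 1.0 / visits[s][a]   # diminishing stochastic-approximation step
    # Q-learning update toward the sampled Bellman target.
    Q[s][a] += step * (g[s][a] + alpha * min(Q[s_next]) - Q[s][a])
    s = s_next
```

With the 1/n step size each Q-factor is a running average of its sampled targets, which is the connection to the stochastic-approximation analysis covered next.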

Lecture 8
Stochastic Approximations: Lyapunov Function Analysis
The ODE Method
Convergence of Q-Learning

Lecture 9
Exploration versus Exploitation: The Complexity of Reinforcement Learning

Lecture 10
Introduction to Value Function Approximation
Curse of Dimensionality
Approximation Architectures

Lecture 11
Model Selection and Complexity
Problem set 3 out
Lecture 12
Introduction to Value Function Approximation Algorithms
Performance Bounds

Lecture 13
Temporal-Difference Learning with Value Function Approximation
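With a linear architecture J(s) ≈ r·phi(s), TD(0) nudges the weight vector along the feature direction of the visited state, scaled by the temporal difference. A minimal policy-evaluation sketch on an invented two-state chain; the affine features here happen to represent the exact cost-to-go, so the TD limit coincides with J of the policy (in general it converges only to the projected Bellman fixed point):

```python
import random

random.seed(1)

# Evaluate a fixed policy on a two-state chain (all numbers invented).
P = [[0.7, 0.3], [0.4, 0.6]]    # transition probabilities under the policy
g = [1.0, 2.0]                  # stage costs
alpha = 0.5                     # discount factor
phi = [[1.0, 0.0], [1.0, 1.0]]  # affine feature vectors phi(s)
r = [0.0, 0.0]                  # weight vector to learn

def approx(s):
    """Linear approximation J~(s) = r · phi(s)."""
    return r[0] * phi[s][0] + r[1] * phi[s][1]

s = 0
for t in range(200_000):
    # Sample the next state from the chain.
    s_next = 0 if random.random() < P[s][0] else 1
    # Temporal difference: d = g(s) + alpha * J~(s') - J~(s).
    d = g[s] + alpha * approx(s_next) - approx(s)
    step = 5.0 / (100.0 + t)    # diminishing step size
    # Move the weights along the feature direction of the visited state.
    for i in range(2):
        r[i] += step * d * phi[s][i]
    s = s_next
```

For this chain the exact cost-to-go solves (I − alpha·P)J = g, giving J ≈ (2.353, 3.529), which the learned weights should approach.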

Lecture 14
Temporal-Difference Learning with Value Function Approximation (cont.)

Lecture 15
Temporal-Difference Learning with Value Function Approximation (cont.)
Optimal Stopping Problems
General Control Problems

Lecture 16
Approximate Linear Programming
Problem set 4 out
Lecture 17
Approximate Linear Programming (cont.)

Lecture 18
Efficient Solutions for Approximate Linear Programming

Lecture 19
Efficient Solutions for Approximate Linear Programming: Factored MDPs

Lecture 20
Policy Search Methods
Problem set 5 out
Lecture 21
Policy Search Methods (cont.)

Lecture 22
Policy Search Methods for POMDPs
Application: Call Admission Control
Actor-Critic Methods

Lecture 23
Guest Lecture: Prof. Nick Roy
Approximate POMDP Compression

Lecture 24
Policy Search Methods: PEGASUS
Application: Helicopter Control
