2008/11/11 19:49

Decision Making in Large Scale Systems

2.997 Decision Making in Large Scale Systems - Spring 2004

Instructor: Prof. Daniela Pucci De Farias

Tetris game strategies can be analyzed using the value function approximation techniques described in this course -- see lecture session 10.  (Illustration courtesy of OCW.)

Course Description

This course is an introduction to the theory and application of large-scale dynamic programming. Topics include Markov decision processes, dynamic programming algorithms, simulation-based algorithms, theory and algorithms for value function approximation, and policy search methods. The course examines games and applications in areas such as dynamic resource allocation, finance and queueing networks.




Lecture Notes

LEC # TOPICS LECTURE NOTES
1 Markov Decision Processes

Finite-Horizon Problems: Backwards Induction

Discounted-Cost Problems: Cost-to-Go Function, Bellman's Equation
(PDF)
2 Value Iteration

Existence and Uniqueness of Bellman's Equation Solution

Gauss-Seidel Value Iteration
(PDF)
3 Optimality of Policies derived from the Cost-to-Go Function

Policy Iteration

Asynchronous Policy Iteration
(PDF)
4 Average-Cost Problems

Relationship with Discounted-Cost Problems

Bellman's Equation

Blackwell Optimality
(PDF)
5 Average-Cost Problems

Computational Methods
(PDF)
6 Application of Value Iteration to Optimization of Multiclass Queueing Networks

Introduction to Simulation-based Methods Real-Time Value Iteration
(PDF)
7 Q-Learning

Stochastic Approximations
(PDF)
8 Stochastic Approximations: Lyapunov Function Analysis

The ODE Method

Convergence of Q-Learning
(PDF)
9 Exploration versus Exploitation: The Complexity of Reinforcement Learning (PDF)
10 Introduction to Value Function Approximation

Curse of Dimensionality

Approximation Architectures
(PDF)
11 Model Selection and Complexity (PDF)
12 Introduction to Value Function Approximation Algorithms

Performance Bounds
(PDF)
13 Temporal-Difference Learning with Value Function Approximation (PDF)
14 Temporal-Difference Learning with Value Function Approximation (cont.) (PDF)
15 Temporal-Difference Learning with Value Function Approximation (cont.)

Optimal Stopping Problems

General Control Problems
(PDF)
16 Approximate Linear Programming (PDF)
17 Approximate Linear Programming (cont.) (PDF)
18 Efficient Solutions for Approximate Linear Programming (PDF)
19 Efficient Solutions for Approximate Linear Programming: Factored MDPs (PDF)
20 Policy Search Methods (PDF)
21 Policy Search Methods (cont.) (PDF)
22 Policy Search Methods for POMDPs

Application: Call Admission Control

Actor-Critic Methods

23 Approximate POMDP Compression
24 Policy Search Methods: PEGASUS

Application: Helicopter Control


Trackback 0 Comment 0