2.997 Decision Making in Large Scale Systems - Spring 2004
Instructor: Prof. Daniela Pucci De Farias
Tetris game strategies can be analyzed using the value function
approximation techniques described in this course -- see lecture
session 10. (Illustration courtesy of OCW.)
Course Description
This
course is an introduction to the theory and application of large-scale
dynamic programming. Topics include Markov decision processes, dynamic
programming algorithms, simulation-based algorithms, theory and
algorithms for value function approximation, and policy search methods.
The course examines games and applications in areas such as dynamic
resource allocation, finance and queueing networks.
Lecture Notes
| LEC # | TOPICS | LECTURE NOTES |
|---|---|---|
| 1 | Markov Decision Processes Finite-Horizon Problems: Backwards Induction Discounted-Cost Problems: Cost-to-Go Function, Bellman's Equation |
(PDF) |
| 2 | Value Iteration Existence and Uniqueness of Bellman's Equation Solution Gauss-Seidel Value Iteration |
(PDF) |
| 3 | Optimality of Policies derived from the Cost-to-Go Function Policy Iteration Asynchronous Policy Iteration |
(PDF) |
| 4 | Average-Cost Problems Relationship with Discounted-Cost Problems Bellman's Equation Blackwell Optimality |
(PDF) |
| 5 | Average-Cost Problems Computational Methods |
(PDF) |
| 6 | Application of Value Iteration to Optimization of Multiclass Queueing Networks Introduction to Simulation-based Methods Real-Time Value Iteration |
(PDF) |
| 7 | Q-Learning Stochastic Approximations |
(PDF) |
| 8 | Stochastic Approximations: Lyapunov Function Analysis The ODE Method Convergence of Q-Learning |
(PDF) |
| 9 | Exploration versus Exploitation: The Complexity of Reinforcement Learning | (PDF) |
| 10 | Introduction to Value Function Approximation Curse of Dimensionality Approximation Architectures |
(PDF) |
| 11 | Model Selection and Complexity | (PDF) |
| 12 | Introduction to Value Function Approximation Algorithms Performance Bounds |
(PDF) |
| 13 | Temporal-Difference Learning with Value Function Approximation | (PDF) |
| 14 | Temporal-Difference Learning with Value Function Approximation (cont.) | (PDF) |
| 15 | Temporal-Difference Learning with Value Function Approximation (cont.) Optimal Stopping Problems General Control Problems |
(PDF) |
| 16 | Approximate Linear Programming | (PDF) |
| 17 | Approximate Linear Programming (cont.) | (PDF) |
| 18 | Efficient Solutions for Approximate Linear Programming | (PDF) |
| 19 | Efficient Solutions for Approximate Linear Programming: Factored MDPs | (PDF) |
| 20 | Policy Search Methods | (PDF) |
| 21 | Policy Search Methods (cont.) | (PDF) |
| 22 | Policy Search Methods for POMDPs Application: Call Admission Control Actor-Critic Methods |
|
| 23 | Approximate POMDP Compression | |
| 24 | Policy Search Methods: PEGASUS Application: Helicopter Control |

이올린에 북마크하기
Prev
