Monday, June 24, 2013

Roy Fox: June 26th

Title: KL-regularized reinforcement-learning problems

Abstract: Of the many justifications for regularizing reinforcement-learning problems with KL-divergence terms, perhaps the most immediately compelling is that it can lead to efficient algorithms. This is the case under assumptions of full observability and full controllability, as in Emo Todorov's work on Linearly-Solvable Markov Decision Processes. In this talk I will present these ideas, mostly introduced in these two papers:
http://homes.cs.washington.edu/~todorov/papers/MDP.pdf
http://homes.cs.washington.edu/~todorov/papers/duality.pdf
I will then share insights and challenges in applying similar approaches to partially observable and partially controllable MDPs.
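
For readers new to this setting, here is a minimal sketch of the identity behind "linearly solvable", written in my own notation rather than Todorov's (q is the state cost, p the passive dynamics); the papers linked above give the authoritative treatment. In an LMDP the controller directly reshapes the passive dynamics p(x'|x) into a chosen distribution u(x'|x), paying a KL penalty for the deviation, so the cost-to-go satisfies

\[
v(x) \;=\; q(x) \;+\; \min_{u(\cdot\mid x)} \Big[ \mathrm{KL}\big(u(\cdot\mid x)\,\big\|\,p(\cdot\mid x)\big) \;+\; \mathbb{E}_{x'\sim u(\cdot\mid x)}\, v(x') \Big].
\]

The minimizer is \( u^*(x'\mid x) \propto p(x'\mid x)\, e^{-v(x')} \), and substituting \( z(x) = e^{-v(x)} \) turns the Bellman equation into

\[
z(x) \;=\; e^{-q(x)} \sum_{x'} p(x'\mid x)\, z(x'),
\]

which is linear in z; hence the name.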
