Title: KL-regularized reinforcement-learning problems
Abstract: Of the many justifications for regularizing reinforcement-learning problems with KL-divergence terms, perhaps the most compelling is that doing so can lead to efficient algorithms. This is the case under the assumptions of full observability and controllability, as in Emo Todorov's work on Linearly-Solvable Markov Decision Processes. In this talk I will present these ideas, mostly introduced in these two papers:
Then I will share insights and challenges in applying similar approaches to partially observable and controllable MDPs.