AI Seminar: Probabilistic Inference in Reinforcement Learning Done Right

Event Speaker
Jean Tarbouriech
Event Speaker Description
Research Scientist
Google DeepMind, London
Event Type
Artificial Intelligence
Event Location
BEXL 320 and Zoom
Event Description
‘RL as Inference’ is a popular but flawed perspective. In this talk, we empower it with a principled Bayesian treatment that yields efficient exploration. We first clarify how control and statistical inference, the two facets of RL, can be distilled into a single quantity: PΓ*, the posterior probability of each state-action pair being visited by the optimal policy. Previous approaches approximate PΓ* arbitrarily poorly and consequently perform badly on challenging problems. We prove that PΓ* can be used to generate a policy that explores efficiently, as measured by regret, although computing it exactly is intractable. We therefore derive a new variational Bayesian approximation that yields a tractable convex optimization problem, and we establish that the resulting policy also explores efficiently. We call our approach VAPOR and show that it has strong connections to Thompson sampling, K-learning, and maximum-entropy exploration. We conclude with experiments demonstrating the performance advantage of a deep RL version of VAPOR.
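As background for the talk, the object PΓ* is a posterior over state-action visitations, and the variational step reduces planning to a convex program over the occupancy measure. The sketch below is not the VAPOR objective itself, just an illustration of the underlying idea on a hypothetical two-state MDP: the classical dual linear program that optimizes directly over the discounted state-action occupancy measure λ(s, a), from which a policy is read off.

```python
# Illustrative sketch (toy MDP of our own making, not VAPOR): planning as a
# convex program over the state-action occupancy measure lambda(s, a), the
# kind of visitation distribution the posterior PΓ* is defined over.
import numpy as np
from scipy.optimize import linprog

gamma, mu = 0.9, np.array([1.0, 0.0])   # discount factor, initial distribution
S, A = 2, 2
P = np.zeros((S, A, S))                  # transition kernel P[s, a, s']
r = np.zeros((S, A))                     # reward r[s, a]
P[0, 0] = [1, 0]                         # s0, a0: stay (reward 0)
P[0, 1] = [0, 1]                         # s0, a1: move to s1
P[1, 0] = [0, 1]; r[1, 0] = 1.0          # s1, a0: stay, reward 1
P[1, 1] = [1, 0]                         # s1, a1: move back to s0

# Flow constraints:
# sum_a lam(s', a) = (1 - gamma) * mu(s') + gamma * sum_{s,a} P[s,a,s'] lam(s,a)
A_eq = np.zeros((S, S * A))
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = float(s == sp) - gamma * P[s, a, sp]
b_eq = (1 - gamma) * mu

# linprog minimizes, so negate the expected-reward objective.
res = linprog(c=-r.ravel(), A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (S * A))
lam = res.x.reshape(S, A)
policy = lam.argmax(axis=1)              # decode a policy from the visitations
print(policy, round(-res.fun, 3))        # -> [1 0] 0.9
```

The optimal occupancy measure concentrates on "go to s1, then stay", and the recovered policy takes a1 in s0 and a0 in s1. VAPOR's variational problem adds an uncertainty-aware term to such an objective while remaining convex.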

Speaker Biography

Jean Tarbouriech is a Research Scientist at Google DeepMind, London. His main research interest is reinforcement learning, with a focus on efficient exploration to improve RL agents and large language models. He completed his PhD jointly at Inria Lille and Meta AI Paris.