Tata Institute of Fundamental Research

Off-policy evaluation in Reinforcement Learning using Linear Regression

STCS Student Seminar

Speaker:	Anirban Bhattacharjee
Organiser:	Ashutosh Shankar
Date:	Friday, 11 Dec 2020, 17:15 to 18:15
Venue:

Abstract: In Reinforcement Learning, one often needs to evaluate a given policy using rewards observed while following a different policy. In Learning Theory parlance, this is called off-policy evaluation. Traditional methods for off-policy evaluation rely on importance sampling, which comes with certain drawbacks. We shall look at these drawbacks and at how linear regression may be used instead to overcome them.

Zoom link: https://zoom.us/j/98132227553?pwd=K2cyQllKVjExdUhlRm0vc0ZHcEt0Zz09