BEGIN:VCALENDAR
PRODID:-//eluceo/ical//2.0/EN
VERSION:2.0
CALSCALE:GREGORIAN
BEGIN:VEVENT
UID:www.tcs.tifr.res.in/event/1202
DTSTAMP:20230914T125954Z
SUMMARY:A Simple Convergence Proof for Stochastic Approximation and Applica
tions to Reinforcement Learning
DESCRIPTION:Speaker: M. Vidyasagar (Indian Institute of Technology Hyderaba
d)\n\nAbstract: \nSince its invention by Robbins and Monro in 1951\, the
stochastic approximation (SA) algorithm has been a widely used tool for fi
nding solutions of equations\, or minimizing functions\, with noisy measur
ements. Current methods for proving its convergence make use of the "ODE"
method whereby the sample paths of the algorithm are approximated by the t
rajectories of an associated ODE. This method requires a lot of technicali
ties. Interestingly\, as far back as 1965\, there was a paper by Gladyshev
that gave a simple convergence proof based on martingale methods\; howeve
r\, this proof worked for only a class of problems. In this talk I will co
mbine martingale methods with a new "converse theorem" for Lyapunov stabil
ity\, to arrive at a simple proof that works for the same situations where
the ODE method applies. The advantage of this approach is that it can pot
entially be applied to several problems in Reinforcement Learning (RL)\, s
uch as actor-critic learning (which is two time-scale SA)\, or RL with val
ue approximation (which is SA with projections onto a lower-dimensional su
bspace). These directions are under investigation.\nZoom Link - https://z
oom.us/j/91983281364?pwd=Wkl3MHMzWUFiYnVhV1d1U1E3bXhpZz09\n
URL:https://www.tcs.tifr.res.in/web/events/1202
DTSTART;TZID=Asia/Kolkata:20220510T160000
DTEND;TZID=Asia/Kolkata:20220510T170000
LOCATION:In person @ A-201 and also via Zoom
END:VEVENT
END:VCALENDAR