BEGIN:VCALENDAR
PRODID:-//eluceo/ical//2.0/EN
VERSION:2.0
CALSCALE:GREGORIAN
BEGIN:VEVENT
UID:www.tcs.tifr.res.in/event/1007
DTSTAMP:20230914T125946Z
SUMMARY:Multi-Armed Bandits with Non-Stationary Rewards
DESCRIPTION:Speaker: Anirban Bhattacharjee\n\nAbstract: In a mu
lti-armed bandit problem a gambler needs to choose at each round one of K
arms\, each characterized by an unknown reward distribution. The objective
is to maximize cumulative expected reward over a time horizon T\, with pe
rformance being measured in terms of regret with respect to a static oracl
e that knows the best arm a priori.\nThis problem has been studied extensi
vely when the reward distributions do not change over time. However\, we s
hall look at the case when the reward distributions undergo changes over t
ime with a fixed budget on the overall change. The talk will include resul
ts on lower bounds on regret with respect to a dynamic oracle that knows t
he best arm in each round\, and will look at two algorithms that are known
to be effective in this setting\, both exceeding the lower bound by a log
arithmic factor.\n
URL:https://www.tcs.tifr.res.in/web/events/1007
DTSTART;TZID=Asia/Kolkata:20191018T171500
DTEND;TZID=Asia/Kolkata:20191018T181500
LOCATION:A-201 (STCS Seminar Room)
END:VEVENT
END:VCALENDAR