BEGIN:VCALENDAR
PRODID:-//eluceo/ical//2.0/EN
VERSION:2.0
CALSCALE:GREGORIAN
BEGIN:VEVENT
UID:www.tcs.tifr.res.in/event/1255
DTSTAMP:20230914T125956Z
SUMMARY:Bandits with Heavy Tails: Algorithms\, Analysis and Optimality
DESCRIPTION:Speaker: Shubhada Agrawal\n\nAbstract: \nMulti-armed bandit (MA
B) is a popular framework for sequential decision-making in an uncertain e
nvironment. In the classical setup of MAB\, the algorithm has access to a
fixed and finite set K of unknown\, independent probability distributions
or arms. At each time step\, having observed the outcomes of all the previ
ous actions\, the algorithm chooses one of the K arms and receives an inde
pendent sample drawn from the underlying distribution\, which may be consi
dered a reward. The algorithm's goal is either to maximize the accumulated
rewards or to identify the best arm in as few samples as possible for an
appropriate notion of best.\nVariants of these classical formulations have
been widely studied in the literature. Tight lower bounds and optimal alg
orithms have been developed under the assumption that the arm distribution
s are either Sub-Gaussian or come from a single parameter exponential fami
ly (SPEF)\, for example\, Gaussian distributions with a known variance or
Bernoulli distributions. However\, in practice\, these distributional assu
mptions may not hold. Developing lower bounds and optimal algorithms for t
he general distributions largely remained open\, mainly because of the nee
d for new tools for the analysis in this generality.\nIn this dissertation
\, we undertake a detailed study of the MAB problems allowing for all the
distributions with a known uniform bound on their (1+\\epsilon)^{th} momen
ts for some $\\epsilon > 0$. This class subsumes a large class of heavy-ta
iled distributions. We develop a framework with essential tools and concen
tration inequalities and use it to design optimal algorithms for three key
variants of the MAB problem\, including the classical frameworks of regre
t minimization and best-arm identification.\nA key component of designing
an optimal algorithm for MAB is constructing tight\, anytime valid confide
nce intervals (CIs) for mean. We develop new concentration inequalities to
this end\, which may be of independent interest.\nThe above results were
obtained in collaborations involving Sandeep Juneja\, Wouter M. Koolen and
Peter Glynn.\n\nZoom link: https://zoom.us/j/94294491152?pwd=MUZiNkk5Sy9
2Z3VEc3laZUNCTGdDdz09\n
URL:https://www.tcs.tifr.res.in/web/events/1255
DTSTART;TZID=Asia/Kolkata:20221209T113000
DTEND;TZID=Asia/Kolkata:20221209T123000
LOCATION:A201
END:VEVENT
END:VCALENDAR