BEGIN:VCALENDAR
PRODID:-//eluceo/ical//2.0/EN
VERSION:2.0
CALSCALE:GREGORIAN
BEGIN:VEVENT
UID:www.tcs.tifr.res.in/event/1566
DTSTAMP:20250611T112615Z
SUMMARY:Sequential Decision-Making under Uncertainty and Constraints: A Syn
 thesis of Differential Games and Stochastic Bandits
DESCRIPTION:Speaker: Sushant Vijayan (TIFR)\n\nAbstract: \nIn this thesis\,
  we study sequential decision making problems under uncertainty and constr
 aints. We consider four different problems:\nFor online regret minimizatio
 n in linear bandits with ${prior}$ observations\, we propose a phased elim
 ination algorithm ${OOPE}$ and give its regret upper bound in terms of the
  offline data's Grammian eigenspectrum. We establish regret lower bounds t
 hat help establish the minimax optimality of OOPE in certain regimes and i
 mprove on existing work. This also shows that the quality of offline data 
 is well measured by the eigenspectrum of the Grammian matrix of the offlin
 e data.\nIn automated bidding systems with ${unknown value}$ and long-term
  budget and Return-on-Spend (RoS) constraints\, a UCB-based computationall
 y efficient algorithm is developed. We do away with restrictive assumption
 s like strict Slater feasibility\, truthfulness of the auction\,  and obt
 ain optimal regret and constraint violation bounds with logarithmic depend
 ence on the bid domain size.\nWe model infectious disease spread as a diff
 erential game between a planner and the populace\, characterizing ${open-l
 oop}$ Nash equilibria using Pontryagin's Minimum Principle. We use the dev
 eloped model to study the qualitative characteristics in equilibrium and b
 ring out the crucial roles of infection detection rates and public trust i
 n reported infection numbers. \nFor Active Simple Hypothesis Testing (ASH
 T) with a fixed budget\, we characterize the minimax error exponent as the
  value of a zero-sum differential game. This reformulation leads to a more
  computationally tractable algorithm compared to prior work. This problem 
 has attracted significant interest in many different research communities 
 like Simulation\, Operation Research\, Information Theory and Bandits. Thi
 s characterization of the error exponent and the development of provably o
 ptimal\, computationally tractable algorithm constitute significant progre
 ss into the problem. However\, even this provably optimal algorithm suffer
 s from a \\emph{curse of dimensionality}. We propose a more efficient algo
 rithm leveraging a novel link to Blackwell Approachability. This more effi
 cient approachability algorithm provably outperforms non-adaptive strategi
 es and is numerically verified to attain the optimal exponent in certain i
 nstances.\nThe first two problems utilize only bandit techniques and the t
 hird employs methods of differential games\; the final problem combines di
 fferential game theory with a classical bandit setup\, showcasing a ${synt
 hesis}$ of both distinct and integrated applications of these two framewor
 ks.\n
URL:https://www.tcs.tifr.res.in/web/events/1566
DTSTART;TZID=Asia/Kolkata:20250613T153000
DTEND;TZID=Asia/Kolkata:20250613T163000
LOCATION:via Zoom in A201
END:VEVENT
END:VCALENDAR
