BEGIN:VCALENDAR
PRODID:-//eluceo/ical//2.0/EN
VERSION:2.0
CALSCALE:GREGORIAN
BEGIN:VEVENT
UID:www.tcs.tifr.res.in/event/1479
DTSTAMP:20240912T090252Z
SUMMARY:Transfer Q*: Principled Decoding for LLM Alignment
DESCRIPTION:Speaker: Amrit Singh Bedi (University of Central Florida)\n\nAb
 stract:\nTraditional fine-tuning of foundation models is computationally 
 heavy\, involving updates to billions of parameters. A promising alternati
 ve\, alignment via decoding\, adjusts the response distribution directly 
 without model updates to maximize a target reward r\, thus providing a lig
 htweight and adaptable framework for alignment. However\, principled deco
 ding methods rely on oracle access to an optimal Q-function (Q*)\, which 
 is often unavailable in practice. We propose Transfer Q*\, which implicitl
 y estimates the optimal value function for a target reward through a basel
 ine model aligned with a baseline reward rBL (which can be different from 
 the target reward). Our approach significantly reduces the sub-optimality 
 gap observed in prior SoTA methods and demonstrates superior empirical per
 formance across key metrics such as coherence\, diversity\, and quality in
  extensive tests on several synthetic and real datasets.\nShort Bio:\nAmri
 t Singh Bedi is an assistant professor in the Computer Science Department 
 at the University of Central Florida\, FL\, USA. Before that\, he was a r
 esearch assistant professor in the Computer Science Department at the Uni
 versity of Maryland\, College Park\, MD\, USA. He obtained his Ph.D. in El
 ectrical Engineering from IIT Kanpur\, Kanpur\, India\, in 2018. Following
  his doctoral studies\, he worked as a Research Associate within the Compu
 tational and Information Sciences Directorate at the US Army Research Labo
 ratory (ARL) in Adelphi\, MD\, USA\, from 2019 to 2022. His research inter
 ests lie in artificial intelligence (AI) for autonomous systems\, with spe
 cific emphasis on scalable & sample-efficient learning algorithms. Current
 ly\, he is working on the problem of AI alignment in language models. Hi
 s paper was selected as one of the Best Paper Finalists at the 2017 IEEE A
 silomar Conference on Signals\, Systems\, and Computers. He received an ho
 norable mention from the IEEE Robotics and Automation Letters in 2020. He 
 was awarded the Amazon Research Award in 2022.\n
URL:https://www.tcs.tifr.res.in/web/events/1479
DTSTART;TZID=Asia/Kolkata:20241022T160000
DTEND;TZID=Asia/Kolkata:20241022T170000
LOCATION:via Zoom in A201
END:VEVENT
END:VCALENDAR