BEGIN:VCALENDAR
PRODID:-//eluceo/ical//2.0/EN
VERSION:2.0
CALSCALE:GREGORIAN
BEGIN:VEVENT
UID:www.tcs.tifr.res.in/event/1084
DTSTAMP:20230914T125949Z
SUMMARY:Mathematics of Neural Nets
DESCRIPTION:Speaker: Anirbit Mukherjee (Wharton School of the Universit
 y of Pennsylvania\, United States)\n\nAbstract:\nOne of the paramount m
 athematical mysteries of our times is to explain the phenomenon of dee
 p learning\, i.e.\, training neural nets. Neural nets can be made to p
 aint while imitating classical art styles or to play chess better tha
 n any machine or human ever has\, and they seem to be the closest we h
 ave ever come to achieving "artificial intelligence". But trying to re
 ason about these successes quickly lands us in a plethora of extremel
 y challenging mathematical questions\, typically about discrete stoch
 astic processes. Some of these questions remain unsolved for even th
 e smallest neural nets! In this talk we will give a brief overview o
 f the major themes of our work in this direction over the last few y
 ears.\n\nFirst\, we will highlight some of our major depth hierarchy t
 heorems and landscape results about neural nets. Then we will explai
 n how\, for certain nets under mild distributional conditions\, our i
 terative algorithms such as "Neuro-Tron"\, which do not use a gradie
 nt oracle\, can be proven to train nets in the infinity-norm loss\, u
 sing as much time/sample complexity as expected from gradient-based m
 ethods but in regimes where usual algorithms like (S)GD remain unprov
 en. Our theorems include the particularly challenging regime of deali
 ng with non-realizable data while the net is of finite size. Next\, w
 e will briefly look at our first-of-its-kind results on sufficient co
 nditions for fast convergence of a standard adaptive-gradient deep-le
 arning algorithm\, RMSProp.\n\nIn the second half of the talk\, we wil
 l focus on the recent rise of PAC-Bayesian technology for explaining t
 he low risk of certain over-parameterized nets on standardized tests. W
 e will present our recent results in this domain\, which give bounds t
 hat empirically supersede some of the existing theoretical benchmarks i
 n this field\; we achieve this via new proofs about the key property o
 f noise resilience of nets.\n\nThis is joint work with Amitabh Basu (J
 HU)\, Ramchandran Muthukumar (JHU)\, Jiayao Zhang (UPenn)\, Dan Roy (U
 Toronto\, Vector Institute)\, Pushpendre Rastogi (JHU\, Amazon)\, Soha
 m De (DeepMind\, Google)\, Enayat Ullah (JHU)\, Jun Yang (UToronto\, V
 ector Institute) and Anup Rao (Adobe).\n\nBio: Anirbit Mukherjee finis
 hed his Ph.D. in applied mathematics at Johns Hopkins University\, adv
 ised by Prof. Amitabh Basu. He is now a post-doc in Statistics at Whar
 ton (UPenn) with Prof. Weijie Su. He specializes in deep-learning theo
 ry and has been awarded two fellowships from JHU for this research: t
 he Walter L. Robb Fellowship and the inaugural Mathematical Institute f
 or Data Science Fellowship. Earlier\, he was a researcher in Quantum F
 ield Theory\, while doing his undergrad in physics at the Chennai Math
 ematical Institute (CMI) and his masters in theoretical physics at th
 e Tata Institute of Fundamental Research (TIFR).\n
URL:https://www.tcs.tifr.res.in/web/events/1084
DTSTART;TZID=Asia/Kolkata:20200922T160000
DTEND;TZID=Asia/Kolkata:20200922T170000
LOCATION:Zoom meeting: https://zoom.us/j/97246120231?pwd=OGhsUTY4Unpyblkr
 cUxHMnlvbGxmdz09
END:VEVENT
END:VCALENDAR