Of the data, by the data, for the data: an algorithmic viewpoint

Organiser:
Vinod M. Prabhakaran
Date:
Tuesday, 9 Jan 2024, 16:00 to 17:00
Venue:
A-201 (STCS Seminar Room)
Category:
Abstract

In disciplines ranging from the arts to the sciences, long considered bastions of human ingenuity, data-driven algorithms have led to remarkable breakthroughs in recent years through ChatGPT (natural language), AlphaGo (game playing), and AlphaFold (biology). Given their ever-increasing prominence and ubiquity, there is a growing consensus that the need of the hour is a fundamental understanding of the successes and pitfalls of these algorithms. To this end, my research adopts an interdisciplinary approach to design mathematical foundations and principled algorithms for problems of great practical relevance. While advancing the frontiers of data-driven methods, this approach also offers a unique mathematical lens through which to study and understand them.

In this talk, I will present my contributions along these themes in the fields of information theory, machine learning, and optimization. Through our work on KO codes, I will demonstrate how data-driven algorithms can discover state-of-the-art codes for wireless communication, a fundamental problem at the heart of information and coding theory. This research highlights the great potential these methods hold for the design of next-generation communication systems. Next, I will present our algorithmic contribution to optimal transport, where we design an efficient and reliable algorithm to learn the optimal transport map between two distributions. These ideas have broad applications in biology and medicine, including cell perturbation analysis and drug discovery. Finally, I will present our ongoing work on designing a solid set of theoretical and algorithmic tools to study large language models (LLMs) and transformers. Despite their impressive performance, our understanding of these models is still in its infancy, and my main goal here is to develop new insights into their inner workings as well as faster, more efficient training algorithms. I will conclude with my broader research vision in the realm of data science.

Ashok is a postdoctoral researcher at EPFL, working with Michael Gastpar. He obtained his PhD in ECE from the University of Illinois at Urbana-Champaign (UIUC) in August 2022, advised by Pramod Viswanath and Sewoong Oh. He obtained his Master's in ECE, also from UIUC, in 2017, working with Yihong Wu. Earlier, he graduated from IIT Bombay with a B.Tech. in EE and a minor in Mathematics, working with Vivek Borkar. His research interests are in the foundations of data science, in topics including machine learning, information theory, optimization, and statistics. He is a recipient of the Best Paper Award at ACM MobiHoc 2019. He has also received several graduate student awards and fellowships, including the Joan and Lalit Bahl Fellowship (twice) and the Sundaram Seshu International Student Fellowship, and was a finalist for the Qualcomm Innovation Fellowship 2018.