Toward Large Kernel Models

Speaker:
Parthe Pandit
Organiser:
Vinod M. Prabhakaran
Date:
Tuesday, 5 Sep 2023, 16:00 to 17:00
Venue:
A201
Abstract
Recent studies indicate that kernel machines can often perform as well as or better than deep neural networks (DNNs) on small datasets. Interest in kernel machines has been further bolstered by the discovery of their equivalence to wide neural networks in certain regimes. However, a key feature of DNNs is their ability to scale model size and training-data size independently, whereas in traditional kernel machines model size is tied to data size. Because of this coupling, scaling kernel machines to large data has been computationally challenging. In this work, we provide a way forward for constructing large-scale general kernel models, a generalization of kernel machines that decouples the model from the data, allowing training on large datasets. Specifically, we introduce EigenPro 3.0, an algorithm based on projected dual preconditioned SGD, and demonstrate scaling to model and data sizes that have not been possible with existing kernel methods.
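To make the decoupling concrete, here is a minimal sketch (plain NumPy, toy data, and hyperparameters all chosen for illustration; this is not the EigenPro 3.0 implementation) of a general kernel model whose M centers are picked independently of the N training points, trained with ordinary mini-batch SGD on the coefficients:

import numpy as np

def gaussian_kernel(X, Z, bandwidth=2.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    sq_dists = (np.sum(X**2, axis=1)[:, None]
                + np.sum(Z**2, axis=1)[None, :]
                - 2.0 * X @ Z.T)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

rng = np.random.default_rng(0)
N, M, d = 10_000, 256, 5            # data size N, model size M, N >> M
X = rng.normal(size=(N, d))
y = np.sin(X @ rng.normal(size=d))  # toy regression target
Z = rng.normal(size=(M, d))         # model centers, NOT tied to the data

# f(x) = sum_j alpha_j * K(x, z_j): M coefficients, regardless of N.
alpha = np.zeros(M)
batch_size, lr = 128, 0.05
for step in range(3000):
    idx = rng.integers(0, N, size=batch_size)
    K_batch = gaussian_kernel(X[idx], Z)      # (batch_size, M) block
    residual = K_batch @ alpha - y[idx]
    # Plain SGD on the squared loss; EigenPro 3.0 instead takes a
    # preconditioned step and projects it back onto the model's centers.
    alpha -= lr * (K_batch.T @ residual) / batch_size

train_mse = np.mean((gaussian_kernel(X, Z) @ alpha - y) ** 2)
print(f"train MSE after SGD: {train_mse:.4f}")

The point of the sketch is only the shape of the computation: alpha has M entries no matter how large N grows, which is the decoupling of model size from data size that the talk's algorithm exploits.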
Bio
Parthe is a Simons postdoctoral fellow with the Halıcıoğlu Data Science Institute at UCSD. He obtained his Ph.D. in ECE from UCLA and his undergraduate degree in EE from IIT Bombay. He received the Jack K. Wolf Student Paper Award at ISIT 2019.