Identifying Low-dimensional Data in High-dimensional Spaces


Anindya De


Northwestern University
Technological Institute
2145 Sheridan Road
Evanston, IL 60208


Tuesday, 11 December 2018, 14:00 to 15:00


  • A-201 (STCS Seminar Room)


Abstract: Motivated by the problem of feature selection in machine learning, the problem of testing juntas, i.e., checking if a Boolean function on the n-dimensional hypercube only depends on k<<n coordinates, has attracted a lot of attention in theoretical computer science. However, in many settings, there is no obvious choice of  a basis and a more  meaningful question is to ask if a function only depends a k-dimensional subspace. We show that while such "linear juntas" are not testable with any finite number of queries, assuming an upper bound of s on their surface area, such functions can tested with poly(k,s) queries, i.e., independent of the ambient dimension n. We also show a poly(s) lower bound on the query complexity of any non-adaptive tester for linear-juntas showing that the dependence on s is tight up to polynomial factors. As a consequence of our upper bound, we show that intersections of a constant number of halfspaces (as well as several related concepts) are testable with constant query complexity [joint work with Elchanan Mossel (MIT) and Joe Neeman (UT Austin)].