A sparsification of a structure, with respect to a class of queries, is a compressed representation of the structure that answers every query in the class approximately correctly. The seminal example of sparsification is "graph sparsification with respect to cut queries", due to works of Karger and of Benczur and Karger from the 1990s, showing that every graph can be compressed to near-linear size (in the number of nodes in the graph) while approximately preserving the size of every cut in the graph. In 2015, Kogan and Krauthgamer generalized the notion of sparsification to all Constraint Satisfaction Problems (CSPs): here the structure to be compressed is an instance of the CSP on n variables, a query is an assignment to the n variables, and the goal is to approximately determine the number of constraints satisfied by the queried assignment. Several follow-up works exhibited non-trivial CSPs that admit near-linear (in n) sparsifications, including a classification of all binary CSPs (those in which each constraint applies to two variables) that admit such sparsification, but the general picture remained wide open.
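To make the query model above concrete, here is a minimal Python sketch (all names are hypothetical illustrations, not from the works discussed) of cut queries and of the CSP queries that generalize them; a sparsifier would keep a reweighted subset of edges or constraints whose weighted count approximates these values on every query.

```python
# Illustration of the sparsification query model (hypothetical names).
# An instance is a list of constraints; a query is an assignment; the
# answer is the number of satisfied constraints. Cut queries are the
# special case where every constraint is an inequality on an edge.

def cut_value(edges, side):
    """Cut query: count edges crossing the cut; `side` maps nodes to 0/1."""
    return sum(1 for u, v in edges if side[u] != side[v])

def csp_value(constraints, assignment):
    """CSP query: number of constraints satisfied by `assignment`.
    Each constraint is a pair (predicate, variable_indices)."""
    return sum(1 for pred, vars_ in constraints
               if pred(*(assignment[i] for i in vars_)))

# Example: a 4-cycle graph, cut separating {0, 1} from {2, 3}.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(cut_value(edges, {0: 0, 1: 0, 2: 1, 3: 1}))  # 2 crossing edges

# The same cut expressed as a binary (2-XOR) CSP instance.
xor = lambda a, b: a != b
constraints = [(xor, e) for e in edges]
print(csp_value(constraints, [0, 0, 1, 1]))  # also 2
```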
In our works we introduce a new class of sparsification problems, namely code sparsification, where the structure to be compressed is a linear error-correcting code, a query is a message, and the goal is to approximate the weight of the encoding of the message. We show that this class of problems gives the right language for abstracting the techniques of Karger and of Benczur and Karger: indeed, every code can be sparsified to length nearly linear in the number of message bits. This generalization already resolves some basic questions in CSP sparsification. A further generalization to additive codes over finite abelian groups gives even more powerful results, and in particular completely classifies the symmetric Boolean CSPs that admit nearly-linear-size sparsification. (Prior to our work, even the case of sparsifying 3-XOR constraints was open.) A key question left open by our work is the algorithmic challenge of computing the sparsification efficiently: our work proves only the existence of these sparsifiers.
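The code-sparsification query can likewise be sketched in a few lines (again with hypothetical names): over GF(2), the answer to a query is the Hamming weight of the encoded message, and cut queries are recovered by taking the generator matrix to be the edge incidence matrix of a graph.

```python
# Sketch of the code-sparsification query over GF(2) (hypothetical names).
# A query is a message x; the answer is the Hamming weight of the
# codeword G x. A sparsifier keeps a reweighted subset of the rows of G
# that approximates this weight for every message x.

def encode_weight(G, x):
    """Hamming weight of G @ x over GF(2); G is a list of rows."""
    return sum((sum(g * xi for g, xi in zip(row, x)) % 2) for row in G)

# Cut queries as a special case: each row of G is the incidence vector
# of an edge (two 1s), so wt(G x) is the value of the cut defined by x.
G = [[1, 1, 0, 0],   # edge (0, 1)
     [0, 1, 1, 0],   # edge (1, 2)
     [0, 0, 1, 1],   # edge (2, 3)
     [1, 0, 0, 1]]   # edge (3, 0)
print(encode_weight(G, [0, 0, 1, 1]))  # cut {0,1} vs {2,3}: weight 2
```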
Based on joint works with Sanjeev Khanna (U. Penn.) and Aaron (Louie) Putterman (Harvard).