Source author record

Parikshit Gopalan

Parikshit Gopalan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computational Complexity Information Theory Machine Learning math.IT Computer Science and Game Theory Cryptography and Security Data Structures and Algorithms Databases Discrete Mathematics math.CO

Catalog footprint

What is connected

18works

10topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Flexible Routing via Uncertainty Decomposition

A key strategy for balancing performance and cost in modern machine learning systems is to dynamically route queries to either a low-cost model or a more expensive oracle (such as a large pretrained model or human expert), an approach known as model routing. In this work we present a new uncertainty-aware router that (1) avoids unnecessary oracle calls on inherently ambiguous queries, and (2) adapts dynamically to different loss functions and cost parameters through simple hyperparameter changes, without retraining. Our method, applicable to any classification setting where multiple independent annotations per input are available, is based on decomposing total uncertainty into irreducible and reducible components using higher-order predictors [Ahdritz et al., 2025]. This enables a unified approach to both routing and abstention: predict with the weak model when uncertainty is low, route to the oracle when reducible uncertainty is high, and abstain when irreducible uncertainty is high. Our router comes with strong theoretical guarantees bounding regret relative to optimal task-specific routers. We conduct experiments on both synthetic and real-world datasets that demonstrate the benefits of our approach in suitable regimes -- in particular, whenever reducible and irreducible uncertainty are not too correlated.

preprint2022arXiv

KL Divergence Estimation with Multi-group Attribution

Estimating the Kullback-Leibler (KL) divergence between two distributions given samples from them is well-studied in machine learning and information theory. Motivated by considerations of multi-group fairness, we seek KL divergence estimates that accurately reflect the contributions of sub-populations to the overall divergence. We model the sub-populations coming from a rich (possibly infinite) family $\mathcal{C}$ of overlapping subsets of the domain. We propose the notion of multi-group attribution for $\mathcal{C}$, which requires that the estimated divergence conditioned on every sub-population in $\mathcal{C}$ satisfies some natural accuracy and fairness desiderata, such as ensuring that sub-populations where the model predicts significant divergence do diverge significantly in the two distributions. Our main technical contribution is to show that multi-group attribution can be derived from the recently introduced notion of multi-calibration for importance weights [HKRR18, GRSW21]. We provide experimental evidence to support our theoretical results, and show that multi-group attribution provides better KL divergence estimates when conditioned on sub-populations than other popular algorithms.

preprint2022arXiv

Low-Degree Multicalibration

Introduced as a notion of algorithmic fairness, multicalibration has proved to be a powerful and versatile concept with implications far beyond its original intent. This stringent notion -- that predictions be well-calibrated across a rich class of intersecting subpopulations -- provides its strong guarantees at a cost: the computational and sample complexity of learning multicalibrated predictors are high, and grow exponentially with the number of class labels. In contrast, the relaxed notion of multiaccuracy can be achieved more efficiently, yet many of the most desirable properties of multicalibration cannot be guaranteed assuming multiaccuracy alone. This tension raises a key question: Can we learn predictors with multicalibration-style guarantees at a cost commensurate with multiaccuracy? In this work, we define and initiate the study of Low-Degree Multicalibration. Low-Degree Multicalibration defines a hierarchy of increasingly-powerful multi-group fairness notions that spans multiaccuracy and the original formulation of multicalibration at the extremes. Our main technical contribution demonstrates that key properties of multicalibration, related to fairness and accuracy, actually manifest as low-degree properties. Importantly, we show that low-degree multicalibration can be significantly more efficient than full multicalibration. In the multi-class setting, the sample complexity to achieve low-degree multicalibration improves exponentially (in the number of classes) over full multicalibration. Our work presents compelling evidence that low-degree multicalibration represents a sweet spot, pairing computational and sample efficiency with strong fairness and accuracy guarantees.

preprint2020arXiv

Overlook: Differentially Private Exploratory Visualization for Big Data

Data exploration systems that provide differential privacy must manage a privacy budget that measures the amount of privacy lost across multiple queries. One effective strategy to manage the privacy budget is to compute a one-time private synopsis of the data, to which users can make an unlimited number of queries. However, existing systems using synopses are built for offline use cases, where a set of queries is known ahead of time and the system carefully optimizes a synopsis for it. The synopses that these systems build are costly to compute and may also be costly to store. We introduce Overlook, a system that enables private data exploration at interactive latencies for both data analysts and data curators. The key idea in Overlook is a virtual synopsis that can be evaluated incrementally, without extra space storage or expensive precomputation. Overlook simply executes queries using an existing engine, such as a SQL DBMS, and adds noise to their results. Because Overlook's synopses do not require costly precomputation or storage, data curators can also use Overlook to explore the impact of privacy parameters interactively. Overlook offers a rich visual query interface based on the open source Hillview system. Overlook achieves accuracy comparable to existing synopsis-based systems, while offering better performance and removing the need for extra storage.

preprint2016arXiv

Degree and Sensitivity: tails of two distributions

The sensitivity of a Boolean function f is the maximum over all inputs x, of the number of sensitive coordinates of x. The well-known sensitivity conjecture of Nisan (see also Nisan and Szegedy) states that every sensitivity-s Boolean function can be computed by a polynomial over the reals of degree poly(s). The best known upper bounds on degree, however, are exponential rather than polynomial in s. Our main result is an approximate version of the conjecture: every Boolean function with sensitivity s can be epsilon-approximated (in L_2) by a polynomial whose degree is O(s log(1/epsilon)). This is the first improvement on the folklore bound of s/epsilon. Further, we show that improving the bound to O(s^c log(1/epsilon)^d)$ for any d < 1 and any c > 0 will imply the sensitivity conjecture. Thus our result is essentially the best one can hope for without proving the conjecture. We postulate a robust analogue of the sensitivity conjecture: if most inputs to a Boolean function f have low sensitivity, then most of the Fourier mass of f is concentrated on small subsets, and present an approach towards proving this conjecture.

preprint2016arXiv

Maximally Recoverable Codes for Grid-like Topologies

The explosion in the volumes of data being stored online has resulted in distributed storage systems transitioning to erasure coding based schemes. Yet, the codes being deployed in practice are fairly short. In this work, we address what we view as the main coding theoretic barrier to deploying longer codes in storage: at large lengths, failures are not independent and correlated failures are inevitable. This motivates designing codes that allow quick data recovery even after large correlated failures, and which have efficient encoding and decoding. We propose that code design for distributed storage be viewed as a two-step process. The first step is choose a topology of the code, which incorporates knowledge about the correlated failures that need to be handled, and ensures local recovery from such failures. In the second step one specifies a code with the chosen topology by choosing coefficients from a finite field. In this step, one tries to balance reliability (which is better over larger fields) with encoding and decoding efficiency (which is better over smaller fields). This work initiates an in-depth study of this reliability/efficiency tradeoff. We consider the field-size needed for achieving maximal recoverability: the strongest reliability possible with a given topology. We propose a family of topologies called grid-like topologies which unify a number of topologies considered both in theory and practice, and prove a collection of results about maximally recoverable codes in such topologies including the first super-polynomial lower bound on the field size.

preprint2015arXiv

Inequalities and tail bounds for elementary symmetric polynomial with applications

We study the extent of independence needed to approximate the product of bounded random variables in expectation, a natural question that has applications in pseudorandomness and min-wise independent hashing. For random variables whose absolute value is bounded by $1$, we give an error bound of the form $σ^{Ω(k)}$ where $k$ is the amount of independence and $σ^2$ is the total variance of the sum. Previously known bounds only applied in more restricted settings, and were quanitively weaker. We use this to give a simpler and more modular analysis of a construction of min-wise independent hash functions and pseudorandom generators for combinatorial rectangles due to Gopalan et al., which also slightly improves their seed-length. Our proof relies on a new analytic inequality for the elementary symmetric polynomials $S_k(x)$ for $x \in \mathbb{R}^n$ which we believe to be of independent interest. We show that if $|S_k(x)|,|S_{k+1}(x)|$ are small relative to $|S_{k-1}(x)|$ for some $k>0$ then $|S_\ell(x)|$ is also small for all $\ell > k$. From these, we derive tail bounds for the elementary symmetric polynomials when the inputs are only $k$-wise independent.

preprint2015arXiv

Pseudorandomness via the discrete Fourier transform

We present a new approach to constructing unconditional pseudorandom generators against classes of functions that involve computing a linear function of the inputs. We give an explicit construction of a pseudorandom generator that fools the discrete Fourier transforms of linear functions with seed-length that is nearly logarithmic (up to polyloglog factors) in the input size and the desired error parameter. Our result gives a single pseudorandom generator that fools several important classes of tests computable in logspace that have been considered in the literature, including halfspaces (over general domains), modular tests and combinatorial shapes. For all these classes, our generator is the first that achieves near logarithmic seed-length in both the input length and the error parameter. Getting such a seed-length is a natural challenge in its own right, which needs to be overcome in order to derandomize RL - a central question in complexity theory. Our construction combines ideas from a large body of prior work, ranging from a classical construction of [NN93] to the recent gradually increasing independence paradigm of [KMN11, CRSW13, GMRTV12], while also introducing some novel analytic machinery which might find other applications.

preprint2015arXiv

Public projects, Boolean functions and the borders of Border's theorem

Border's theorem gives an intuitive linear characterization of the feasible interim allocation rules of a Bayesian single-item environment, and it has several applications in economic and algorithmic mechanism design. All known generalizations of Border's theorem either restrict attention to relatively simple settings, or resort to approximation. This paper identifies a complexity-theoretic barrier that indicates, assuming standard complexity class separations, that Border's theorem cannot be extended significantly beyond the state-of-the-art. We also identify a surprisingly tight connection between Myerson's optimal auction theory, when applied to public project settings, and some fundamental results in the analysis of Boolean functions.

preprint2015arXiv

Smooth Boolean functions are easy: efficient algorithms for low-sensitivity functions

A natural measure of smoothness of a Boolean function is its sensitivity (the largest number of Hamming neighbors of a point which differ from it in function value). The structure of smooth or equivalently low-sensitivity functions is still a mystery. A well-known conjecture states that every such Boolean function can be computed by a shallow decision tree. While this conjecture implies that smooth functions are easy to compute in the simplest computational model, to date no non-trivial upper bounds were known for such functions in any computational model, including unrestricted Boolean circuits. Even a bound on the description length of such functions better than the trivial $2^n$ does not seem to have been known. In this work, we establish the first computational upper bounds on smooth Boolean functions: 1) We show that every sensitivity s function is uniquely specified by its values on a Hamming ball of radius 2s. We use this to show that such functions can be computed by circuits of size $n^{O(s)}$. 2) We show that sensitivity s functions satisfy a strong pointwise noise-stability guarantee for random noise of rate O(1/s). We use this to show that these functions have formulas of depth O(s log n). 3) We show that sensitivity s functions can be (locally) self-corrected from worst-case noise of rate $\exp(-O(s))$. All our results are simple, and follow rather directly from (variants of) the basic fact that that the function value at few points in small neighborhoods of a given point determine its function value via a majority vote. Our results confirm various consequences of the conjecture. They may be viewed as providing a new form of evidence towards its validity, as well as new directions towards attacking it.

preprint2014arXiv

Pseudorandomness for concentration bounds and signed majorities

The problem of constructing pseudorandom generators that fool halfspaces has been studied intensively in recent times. For fooling halfspaces over the hypercube with polynomially small error, the best construction known requires seed-length O(log^2 n) (MekaZ13). Getting the seed-length down to O(log(n)) is a natural challenge in its own right, which needs to be overcome in order to derandomize RL. In this work we make progress towards this goal by obtaining near-optimal generators for two important special cases: 1) We give a near optimal derandomization of the Chernoff bound for independent, uniformly random bits. Specifically, we show how to generate a x in {1,-1}^n using $\tilde{O}(\log (n/ε))$ random bits such that for any unit vector u, <u,x> matches the sub-Gaussian tail behaviour predicted by the Chernoff bound up to error eps. 2) We construct a generator which fools halfspaces with {0,1,-1} coefficients with error eps with a seed-length of $\tilde{O}(\log(n/ε))$. This includes the important special case of majorities. In both cases, the best previous results required seed-length of $O(\log n + \log^2(1/ε))$. Technically, our work combines new Fourier-analytic tools with the iterative dimension reduction techniques and the gradually increasing independence paradigm of previous works (KaneMN11, CelisRSW13, GopalanMRTV12).

preprint2013arXiv

Explicit Maximally Recoverable Codes with Locality

Consider a systematic linear code where some (local) parity symbols depend on few prescribed symbols, while other (heavy) parity symbols may depend on all data symbols. Local parities allow to quickly recover any single symbol when it is erased, while heavy parities provide tolerance to a large number of simultaneous erasures. A code as above is maximally-recoverable if it corrects all erasure patterns which are information theoretically recoverable given the code topology. In this paper we present explicit families of maximally-recoverable codes with locality. We also initiate the study of the trade-off between maximal recoverability and alphabet size.

preprint2013arXiv

Locally Testable Codes and Cayley Graphs

We give two new characterizations of ($\F_2$-linear) locally testable error-correcting codes in terms of Cayley graphs over $\F_2^h$: \begin{enumerate} \item A locally testable code is equivalent to a Cayley graph over $\F_2^h$ whose set of generators is significantly larger than $h$ and has no short linear dependencies, but yields a shortest-path metric that embeds into $\ell_1$ with constant distortion. This extends and gives a converse to a result of Khot and Naor (2006), which showed that codes with large dual distance imply Cayley graphs that have no low-distortion embeddings into $\ell_1$. \item A locally testable code is equivalent to a Cayley graph over $\F_2^h$ that has significantly more than $h$ eigenvalues near 1, which have no short linear dependencies among them and which "explain" all of the large eigenvalues. This extends and gives a converse to a recent construction of Barak et al. (2012), which showed that locally testable codes imply Cayley graphs that are small-set expanders but have many large eigenvalues. \end{enumerate}

preprint2012arXiv

Better Pseudorandom Generators from Milder Pseudorandom Restrictions

We present an iterative approach to constructing pseudorandom generators, based on the repeated application of mild pseudorandom restrictions. We use this template to construct pseudorandom generators for combinatorial rectangles and read-once CNFs and a hitting set generator for width-3 branching programs, all of which achieve near-optimal seed-length even in the low-error regime: We get seed-length O(log (n/epsilon)) for error epsilon. Previously, only constructions with seed-length O(\log^{3/2} n) or O(\log^2 n) were known for these classes with polynomially small error. The (pseudo)random restrictions we use are milder than those typically used for proving circuit lower bounds in that we only set a constant fraction of the bits at a time. While such restrictions do not simplify the functions drastically, we show that they can be derandomized using small-bias spaces.

preprint2011arXiv

Making the long code shorter, with applications to the Unique Games Conjecture

The long code is a central tool in hardness of approximation, especially in questions related to the unique games conjecture. We construct a new code that is exponentially more efficient, but can still be used in many of these applications. Using the new code we obtain exponential improvements over several known results, including the following: 1. For any eps > 0, we show the existence of an n vertex graph G where every set of o(n) vertices has expansion 1 - eps, but G's adjacency matrix has more than exp(log^delta n) eigenvalues larger than 1 - eps, where delta depends only on eps. This answers an open question of Arora, Barak and Steurer (FOCS 2010) who asked whether one can improve over the noise graph on the Boolean hypercube that has poly(log n) such eigenvalues. 2. A gadget that reduces unique games instances with linear constraints modulo K into instances with alphabet k with a blowup of K^polylog(K), improving over the previously known gadget with blowup of 2^K. 3. An n variable integrality gap for Unique Games that that survives exp(poly(log log n)) rounds of the SDP + Sherali Adams hierarchy, improving on the previously known bound of poly(log log n). We show a connection between the local testability of linear codes and small set expansion in certain related Cayley graphs, and use this connection to derandomize the noise graph on the Boolean hypercube.

preprint2011arXiv

On the Locality of Codeword Symbols

Consider a linear [n,k,d]_q code C. We say that that i-th coordinate of C has locality r, if the value at this coordinate can be recovered from accessing some other r coordinates of C. Data storage applications require codes with small redundancy, low locality for information coordinates, large distance, and low locality for parity coordinates. In this paper we carry out an in-depth study of the relations between these parameters. We establish a tight bound for the redundancy n-k in terms of the message length, the distance, and the locality of information coordinates. We refer to codes attaining the bound as optimal. We prove some structure theorems about optimal codes, which are particularly strong for small distances. This gives a fairly complete picture of the tradeoffs between codewords length, worst-case distance and locality of information symbols. We then consider the locality of parity check symbols and erasure correction beyond worst case distance for optimal codes. Using our structure theorem, we obtain a tight bound for the locality of parity symbols possible in such codes for a broad class of parameter settings. We prove that there is a tradeoff between having good locality for parity checks and the ability to correct erasures beyond the minimum distance.

preprint2010arXiv

Polynomial-Time Approximation Schemes for Knapsack and Related Counting Problems using Branching Programs

We give a deterministic, polynomial-time algorithm for approximately counting the number of {0,1}-solutions to any instance of the knapsack problem. On an instance of length n with total weight W and accuracy parameter eps, our algorithm produces a (1 + eps)-multiplicative approximation in time poly(n,log W,1/eps). We also give algorithms with identical guarantees for general integer knapsack, the multidimensional knapsack problem (with a constant number of constraints) and for contingency tables (with a constant number of rows). Previously, only randomized approximation schemes were known for these problems due to work by Morris and Sinclair and work by Dyer. Our algorithms work by constructing small-width, read-once branching programs for approximating the underlying solution space under a carefully chosen distribution. As a byproduct of this approach, we obtain new query algorithms for learning functions of k halfspaces with respect to the uniform distribution on {0,1}^n. The running time of our algorithm is polynomial in the accuracy parameter eps. Previously even for the case of k=2, only algorithms with an exponential dependence on eps were known.

preprint2007arXiv

Polynomials that Sign Represent Parity and Descartes' Rule of Signs

A real polynomial $P(X_1,..., X_n)$ sign represents $f: A^n \to \{0,1\}$ if for every $(a_1, ..., a_n) \in A^n$, the sign of $P(a_1,...,a_n)$ equals $(-1)^{f(a_1,...,a_n)}$. Such sign representations are well-studied in computer science and have applications to computational complexity and computational learning theory. In this work, we present a systematic study of tradeoffs between degree and sparsity of sign representations through the lens of the parity function. We attempt to prove bounds that hold for any choice of set $A$. We show that sign representing parity over $\{0,...,m-1\}^n$ with the degree in each variable at most $m-1$ requires sparsity at least $m^n$. We show that a tradeoff exists between sparsity and degree, by exhibiting a sign representation that has higher degree but lower sparsity. We show a lower bound of $n(m -2) + 1$ on the sparsity of polynomials of any degree representing parity over $\{0,..., m-1\}^n$. We prove exact bounds on the sparsity of such polynomials for any two element subset $A$. The main tool used is Descartes' Rule of Signs, a classical result in algebra, relating the sparsity of a polynomial to its number of real roots. As an application, we use bounds on sparsity to derive circuit lower bounds for depth-two AND-OR-NOT circuits with a Threshold Gate at the top. We use this to give a simple proof that such circuits need size $1.5^n$ to compute parity, which improves the previous bound of ${4/3}^{n/2}$ due to Goldmann (1997). We show a tight lower bound of $2^n$ for the inner product function over $\{0,1\}^n \times \{0, 1\}^n$.

Parikshit Gopalan

What is connected

Connect this record

See the researcher in context

Building this map preview

18 published item(s)

Flexible Routing via Uncertainty Decomposition

KL Divergence Estimation with Multi-group Attribution

Low-Degree Multicalibration

Overlook: Differentially Private Exploratory Visualization for Big Data

Degree and Sensitivity: tails of two distributions

Maximally Recoverable Codes for Grid-like Topologies

Inequalities and tail bounds for elementary symmetric polynomial with applications

Pseudorandomness via the discrete Fourier transform

Public projects, Boolean functions and the borders of Border's theorem

Smooth Boolean functions are easy: efficient algorithms for low-sensitivity functions

Pseudorandomness for concentration bounds and signed majorities

Explicit Maximally Recoverable Codes with Locality

Locally Testable Codes and Cayley Graphs

Better Pseudorandom Generators from Milder Pseudorandom Restrictions

Making the long code shorter, with applications to the Unique Games Conjecture

On the Locality of Codeword Symbols

Polynomial-Time Approximation Schemes for Knapsack and Related Counting Problems using Branching Programs

Polynomials that Sign Represent Parity and Descartes' Rule of Signs