Source author record

Jieming Mao

Jieming Mao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms Machine Learning Computational Complexity Computer Science and Game Theory Cryptography and Security Information Theory math.IT math.OC quant-ph

Catalog footprint

What is connected

14works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2023arXiv

Differentially Private Continual Releases of Streaming Frequency Moment Estimations

The streaming model of computation is a popular approach for working with large-scale data. In this setting, there is a stream of items and the goal is to compute the desired quantities (usually data statistics) while making a single pass through the stream and using as little space as possible. Motivated by the importance of data privacy, we develop differentially private streaming algorithms under the continual release setting, where the union of outputs of the algorithm at every timestamp must be differentially private. Specifically, we study the fundamental $\ell_p$ $(p\in [0,+\infty))$ frequency moment estimation problem under this setting, and give an $\varepsilon$-DP algorithm that achieves $(1+η)$-relative approximation $(\forall η\in(0,1))$ with $\mathrm{poly}\log(Tn)$ additive error and uses $\mathrm{poly}\log(Tn)\cdot \max(1, n^{1-2/p})$ space, where $T$ is the length of the stream and $n$ is the size of the universe of elements. Our space is near optimal up to poly-logarithmic factors even in the non-private setting. To obtain our results, we first reduce several primitives under the differentially private continual release model, such as counting distinct elements, heavy hitters and counting low frequency elements, to the simpler, counting/summing problems in the same setting. Based on these primitives, we develop a differentially private continual release level set estimation approach to address the $\ell_p$ frequency moment estimation problem. We also provide a simple extension of our results to the harder sliding window model, where the statistics must be maintained over the past $W$ data items.

preprint2022arXiv

Efficiency of the First-Price Auction in the Autobidding World

We study the price of anarchy of the first-price auction in the autobidding world, where bidders can be either utility maximizers (i.e., traditional bidders) or value maximizers (i.e., autobidders). We show that with autobidders only, the price of anarchy of the first-price auction is $1/2$, and with both kinds of bidders, the price of anarchy degrades to about $0.457$ (the precise number is given by an optimization). These results complement the recent result by Jin and Lu [2022] showing that the price of anarchy of the first-price auction with traditional bidders only is $1 - 1/e^2$. We further investigate a setting where the seller can utilize machine-learned advice to improve the efficiency of the auctions. There, we show that as the accuracy of the advice increases, the price of anarchy improves smoothly from about $0.457$ to $1$.

preprint2022arXiv

Regret Minimization with Noisy Observations

In a typical optimization problem, the task is to pick one of a number of options with the lowest cost or the highest value. In practice, these cost/value quantities often come through processes such as measurement or machine learning, which are noisy, with quantifiable noise distributions. To take these noise distributions into account, one approach is to assume a prior for the values, use it to build a posterior, and then apply standard stochastic optimization to pick a solution. However, in many practical applications, such prior distributions may not be available. In this paper, we study such scenarios using a regret minimization model. In our model, the task is to pick the highest one out of $n$ values. The values are unknown and chosen by an adversary, but can be observed through noisy channels, where additive noises are stochastically drawn from known distributions. The goal is to minimize the regret of our selection, defined as the expected difference between the highest and the selected value on the worst-case choices of values. We show that the naïve algorithm of picking the highest observed value has regret arbitrarily worse than the optimum, even when $n = 2$ and the noises are unbiased in expectation. On the other hand, we propose an algorithm which gives a constant-approximation to the optimal regret for any $n$. Our algorithm is conceptually simple, computationally efficient, and requires only minimal knowledge of the noise distributions.

preprint2022arXiv

Shuffle Private Stochastic Convex Optimization

In shuffle privacy, each user sends a collection of randomized messages to a trusted shuffler, the shuffler randomly permutes these messages, and the resulting shuffled collection of messages must satisfy differential privacy. Prior work in this model has largely focused on protocols that use a single round of communication to compute algorithmic primitives like means, histograms, and counts. We present interactive shuffle protocols for stochastic convex optimization. Our protocols rely on a new noninteractive protocol for summing vectors of bounded $\ell_2$ norm. By combining this sum subroutine with mini-batch stochastic gradient descent, accelerated gradient descent, and Nesterov's smoothing method, we obtain loss guarantees for a variety of convex loss functions that significantly improve on those of the local model and sometimes match those of the central model.

preprint2020arXiv

Connecting Robust Shuffle Privacy and Pan-Privacy

In the \emph{shuffle model} of differential privacy, data-holding users send randomized messages to a secure shuffler, the shuffler permutes the messages, and the resulting collection of messages must be differentially private with regard to user data. In the \emph{pan-private} model, an algorithm processes a stream of data while maintaining an internal state that is differentially private with regard to the stream data. We give evidence connecting these two apparently different models. Our results focus on \emph{robustly} shuffle private protocols, whose privacy guarantees are not greatly affected by malicious users. First, we give robustly shuffle private protocols and upper bounds for counting distinct elements and uniformity testing. Second, we use pan-private lower bounds to prove robustly shuffle private lower bounds for both problems. Focusing on the dependence on the domain size $k$, we find that robust approximate shuffle privacy and approximate pan-privacy have additive error $Θ(\sqrt{k})$ for counting distinct elements. For uniformity testing, we give a robust approximate shuffle private protocol with sample complexity $\tilde O(k^{2/3})$ and show that an $Ω(k^{2/3})$ dependence is necessary for any robust pure shuffle private tester. Finally, we show that this connection is useful in both directions: we give a pan-private adaptation of recent work on shuffle private histograms and use it to recover further separations between pan-privacy and interactive local privacy.

preprint2020arXiv

Pan-Private Uniformity Testing

A centrally differentially private algorithm maps raw data to differentially private outputs. In contrast, a locally differentially private algorithm may only access data through public interaction with data holders, and this interaction must be a differentially private function of the data. We study the intermediate model of pan-privacy. Unlike a locally private algorithm, a pan-private algorithm receives data in the clear. Unlike a centrally private algorithm, the algorithm receives data one element at a time and must maintain a differentially private internal state while processing this stream. First, we show that pure pan-privacy against multiple intrusions on the internal state is equivalent to sequentially interactive local privacy. Next, we contextualize pan-privacy against a single intrusion by analyzing the sample complexity of uniformity testing over domain $[k]$. Focusing on the dependence on $k$, centrally private uniformity testing has sample complexity $Θ(\sqrt{k})$, while noninteractive locally private uniformity testing has sample complexity $Θ(k)$. We show that the sample complexity of pure pan-private uniformity testing is $Θ(k^{2/3})$. By a new $Ω(k)$ lower bound for the sequentially interactive setting, we also separate pan-private from sequentially interactive locally private and multi-intrusion pan-private uniformity testing.

preprint2016arXiv

Coding for interactive communication correcting insertions and deletions

We consider the question of interactive communication, in which two remote parties perform a computation while their communication channel is (adversarially) noisy. We extend here the discussion into a more general and stronger class of noise, namely, we allow the channel to perform insertions and deletions of symbols. These types of errors may bring the parties "out of sync", so that there is no consensus regarding the current round of the protocol. In this more general noise model, we obtain the first interactive coding scheme that has a constant rate and resists noise rates of up to $1/18-\varepsilon$. To this end we develop a novel primitive we name edit distance tree code. The edit distance tree code is designed to replace the Hamming distance constraints in Schulman's tree codes (STOC 93), with a stronger edit distance requirement. However, the straightforward generalization of tree codes to edit distance does not seem to yield a primitive that suffices for communication in the presence of synchronization problems. Giving the "right" definition of edit distance tree codes is a main conceptual contribution of this work.

preprint2016arXiv

Competitive analysis of the top-K ranking problem

Motivated by applications in recommender systems, web search, social choice and crowdsourcing, we consider the problem of identifying the set of top $K$ items from noisy pairwise comparisons. In our setting, we are non-actively given $r$ pairwise comparisons between each pair of $n$ items, where each comparison has noise constrained by a very general noise model called the strong stochastic transitivity (SST) model. We analyze the competitive ratio of algorithms for the top-$K$ problem. In particular, we present a linear time algorithm for the top-$K$ problem which has a competitive ratio of $\tilde{O}(\sqrt{n})$; i.e. to solve any instance of top-$K$, our algorithm needs at most $\tilde{O}(\sqrt{n})$ times as many samples needed as the best possible algorithm for that instance (in contrast, all previous known algorithms for the top-$K$ problem have competitive ratios of $\tildeΩ(n)$ or worse). We further show that this is tight: any algorithm for the top-$K$ problem has competitive ratio at least $\tildeΩ(\sqrt{n})$.

preprint2016arXiv

Parallel Algorithms for Select and Partition with Noisy Comparisons

We consider the problem of finding the $k^{th}$ highest element in a totally ordered set of $n$ elements (select), and partitioning a totally ordered set into the top $k$ and bottom $n-k$ elements (partition) using pairwise comparisons. Motivated by settings like peer grading or crowdsourcing, where multiple rounds of interaction are costly and queried comparisons may be inconsistent with the ground truth, we evaluate algorithms based both on their total runtime and the number of interactive rounds in three comparison models: noiseless (where the comparisons are correct), erasure (where comparisons are erased with probability $1-γ$), and noisy (where comparisons are correct with probability $1/2+γ/2$ and incorrect otherwise). We provide numerous matching upper and lower bounds in all three models. Even our results in the noiseless model, which is quite well-studied in the TCS literature on parallel algorithms, are novel.

preprint2015arXiv

Interpolating Between Truthful and non-Truthful Mechanisms for Combinatorial Auctions

We study the communication complexity of combinatorial auctions via interpolation mechanisms that interpolate between non-truthful and truthful protocols. Specifically, an interpolation mechanism has two phases. In the first phase, the bidders participate in some non-truthful protocol whose output is itself a truthful protocol. In the second phase, the bidders participate in the truthful protocol selected during phase one. Note that virtually all existing auctions have either a non-existent first phase (and are therefore truthful mechanisms), or a non-existent second phase (and are therefore just traditional protocols, analyzed via the Price of Anarchy/Stability). The goal of this paper is to understand the benefits of interpolation mechanisms versus truthful mechanisms or traditional protocols, and develop the necessary tools to formally study them. Interestingly, we exhibit settings where interpolation mechanisms greatly outperform the optimal traditional and truthful protocols. Yet, we also exhibit settings where interpolation mechanisms are provably no better than truthful ones. Finally, we apply our new machinery to prove that the recent single-bid mechanism of Devanur et. al.~\cite{DevanurMSW15} (the only pre-existing interpolation mechanism in the literature) achieves the optimal price of anarchy among a wide class of protocols, a claim that simply can't be addressed by appealing just to machinery from communication complexity or the study of truthful mechanisms.

preprint2015arXiv

Near-optimal bounds on bounded-round quantum communication complexity of disjointness

We prove a near optimal round-communication tradeoff for the two-party quantum communication complexity of disjointness. For protocols with $r$ rounds, we prove a lower bound of $\tildeΩ(n/r + r)$ on the communication required for computing disjointness of input size $n$, which is optimal up to logarithmic factors. The previous best lower bound was $Ω(n/r^2 + r)$ due to Jain, Radhakrishnan and Sen [JRS03]. Along the way, we develop several tools for quantum information complexity, one of which is a lower bound for quantum information complexity in terms of the generalized discrepancy method. As a corollary, we get that the quantum communication complexity of any boolean function $f$ is at most $2^{O(QIC(f))}$, where $QIC(f)$ is the prior-free quantum information complexity of $f$ (with error $1/3$).

preprint2014arXiv

Simulating Noisy Channel Interaction

We show that $T$ rounds of interaction over the binary symmetric channel $BSC_{1/2-ε}$ with feedback can be simulated with $O(ε^2 T)$ rounds of interaction over a noiseless channel. We also introduce a more general "energy cost" model of interaction over a noisy channel. We show energy cost to be equivalent to external information complexity, which implies that our simulation results are unlikely to carry over to energy complexity. Our main technical innovation is a self-reduction from simulating a noisy channel to simulating a slightly-less-noisy channel, which may have other applications in the area of interactive compression.

preprint2014arXiv

Tighter Relations Between Sensitivity and Other Complexity Measures

Sensitivity conjecture is a longstanding and fundamental open problem in the area of complexity measures of Boolean functions and decision tree complexity. The conjecture postulates that the maximum sensitivity of a Boolean function is polynomially related to other major complexity measures. Despite much attention to the problem and major advances in analysis of Boolean functions in the past decade, the problem remains wide open with no positive result toward the conjecture since the work of Kenyon and Kutin from 2004. In this work, we present new upper bounds for various complexity measures in terms of sensitivity improving the bounds provided by Kenyon and Kutin. Specifically, we show that deg(f)^{1-o(1)}=O(2^{s(f)}) and C(f) < 2^{s(f)-1} s(f); these in turn imply various corollaries regarding the relation between sensitivity and other complexity measures, such as block sensitivity, via known results. The gap between sensitivity and other complexity measures remains exponential but these results are the first improvement for this difficult problem that has been achieved in a decade.

preprint2013arXiv

New upper bound on block sensitivity and certificate complexity in terms of sensitivity

Sensitivity \cite{CD82,CDR86} and block sensitivity \cite{Nisan91} are two important complexity measures of Boolean functions. A longstanding open problem in decision tree complexity, the "Sensitivity versus Block Sensitivity" question, proposed by Nisan and Szegedy \cite{Nisan94} in 1992, is whether these two complexity measures are polynomially related, i.e., whether $bs(f)=O(s(f)^{O(1)})$. We prove an new upper bound on block sensitivity in terms of sensitivity: $bs(f) \leq 2^{s(f)-1} s(f)$. Previously, the best upper bound on block sensitivity was $bs(f) \leq (\frac{e}{\sqrt{2π}}) e^{s(f)} \sqrt{s(f)}$ by Kenyon and Kutin \cite{KK}. We also prove that if $\min\{s_0(f),s_1(f)\}$ is a constant, then sensitivity and block sensitivity are linearly related, i.e. $bs(f)=O(s(f))$.

Jieming Mao

What is connected

Connect this record

See the researcher in context

Building this map preview

14 published item(s)

Differentially Private Continual Releases of Streaming Frequency Moment Estimations

Efficiency of the First-Price Auction in the Autobidding World

Regret Minimization with Noisy Observations

Shuffle Private Stochastic Convex Optimization

Connecting Robust Shuffle Privacy and Pan-Privacy

Pan-Private Uniformity Testing

Coding for interactive communication correcting insertions and deletions

Competitive analysis of the top-K ranking problem

Parallel Algorithms for Select and Partition with Noisy Comparisons

Interpolating Between Truthful and non-Truthful Mechanisms for Combinatorial Auctions

Near-optimal bounds on bounded-round quantum communication complexity of disjointness

Simulating Noisy Channel Interaction

Tighter Relations Between Sensitivity and Other Complexity Measures

New upper bound on block sensitivity and certificate complexity in terms of sensitivity