Source author record

Mohammad Bavarian

Mohammad Bavarian appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computational Complexity, quant-ph, Computation and Language, Information Theory, math.IT

Catalog footprint

What is connected

8 works

5 topics

4 close collaborators

Preprint, 2022, arXiv

Efficient Training of Language Models to Fill in the Middle

We show that autoregressive language models can learn to infill text after we apply a straightforward transformation to the dataset, which simply moves a span of text from the middle of a document to its end. While this data augmentation has garnered much interest in recent years, we provide extensive evidence that training models with a large fraction of data transformed in this way does not harm the original left-to-right generative capability, as measured by perplexity and sampling evaluations across a wide range of scales. Given the usefulness, simplicity, and efficiency of training models to fill-in-the-middle (FIM), we suggest that future autoregressive language models be trained with FIM by default. To this end, we run a series of ablations on key hyperparameters, such as the data transformation frequency, the structure of the transformation, and the method of selecting the infill span. We use these ablations to prescribe strong default settings and best practices to train FIM models. We have released our best infilling model trained with best practices in our API, and release our infilling benchmarks to aid future research.
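
The transformation described in this abstract can be sketched as follows. This is a minimal illustration, assuming character-level spans with cut points chosen uniformly at random. The sentinel strings `<PRE>`, `<SUF>`, `<MID>`, the function names, and the `fim_rate` parameter are hypothetical placeholders, not the paper's tokens or recommended settings.

```python
import random

# Hypothetical sentinel strings; a real tokenizer would reserve special tokens.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"

def fim_transform(doc: str, rng: random.Random, fim_rate: float = 0.5) -> str:
    """Move a random middle span of `doc` to the end, with probability `fim_rate`."""
    if len(doc) < 2 or rng.random() >= fim_rate:
        return doc  # keep as ordinary left-to-right training data
    # Two distinct cut points split the document into prefix, middle, suffix.
    i, j = sorted(rng.sample(range(len(doc) + 1), 2))
    prefix, middle, suffix = doc[:i], doc[i:j], doc[j:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"

def fim_inverse(example: str) -> str:
    """Recover the original document (assumes it contains no sentinel strings)."""
    if not example.startswith(PRE):
        return example
    body = example[len(PRE):]
    prefix, rest = body.split(SUF, 1)
    suffix, middle = rest.split(MID, 1)
    return prefix + middle + suffix
```

Because the middle span comes last, an ordinary left-to-right model trained on such examples learns to generate the missing span conditioned on both the prefix and the suffix.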

Preprint, 2021, arXiv

Anchored parallel repetition for nonlocal games

We introduce a simple transformation on two-player nonlocal games, called "anchoring", and prove an exponential-decay parallel repetition theorem for all anchored games in the setting of quantum entangled players. This transformation is inspired in part by the Feige-Kilian transformation (SICOMP 2000), and has the property that if the quantum value of the original game $G$ is $v$ then the quantum value of the anchored game $G_\bot$ is $1 - (1 - \alpha)^2 \cdot (1 - v)$ where $\alpha$ is a parameter of the transformation. In particular the anchored game has quantum value $1$ if and only if the original game $G$ has quantum value $1$. This provides the first gap amplification technique for general two-player nonlocal games that achieves exponential decay of the quantum value.
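
As a worked instance of the value formula quoted in this abstract, take the illustrative numbers $v = 3/4$ and $\alpha = 1/2$ (arbitrary choices for the example, not values from the paper):

```latex
\[
  1 - (1-\alpha)^2 (1 - v)
  = 1 - \left(\tfrac{1}{2}\right)^2 \cdot \tfrac{1}{4}
  = 1 - \tfrac{1}{16}
  = \tfrac{15}{16}.
\]
```

For any $\alpha < 1$ the factor $(1-\alpha)^2$ is positive, so the expression equals $1$ exactly when $v = 1$, which matches the "if and only if" statement in the abstract.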

Preprint, 2016, arXiv

Parallel repetition via fortification: analytic view and the quantum case

In a recent work, Moshkovitz [FOCS '14] presented a transformation on two-player games called "fortification", and gave an elementary proof of an (exponential decay) parallel repetition theorem for fortified two-player projection games. In this paper, we give an analytic reformulation of Moshkovitz's fortification framework, which was originally cast in combinatorial terms. This reformulation allows us to expand the scope of the fortification method to new settings. First, we show any game (not just projection games) can be fortified, and give a simple proof of parallel repetition for general fortified games. Then, we prove parallel repetition and fortification theorems for games with players sharing quantum entanglement, as well as games with more than two players. This gives a new gap amplification method for general games in the quantum and multiplayer settings, which has recently received much interest. An important component of our work is a variant of the fortification transformation, called "ordered fortification", that preserves the entangled value of a game. The original fortification of Moshkovitz does not in general preserve the entangled value of a game, and this was a barrier to extending the fortification framework to the quantum setting.

Preprint, 2015, arXiv

On the Role of Shared Randomness in Simultaneous Communication

Two parties wish to carry out certain distributed computational tasks, and they are given access to a source of correlated random bits. It allows the parties to act in a correlated manner, which can be quite useful. But what happens if the shared randomness is not perfect? In this work, we initiate the study of the power of different sources of shared randomness in communication complexity. This is done in the setting of the simultaneous message passing (SMP) model of communication complexity, which is one of the most suitable models for studying the resource of shared randomness. Toward characterising the power of various sources of shared randomness, we introduce a measure for the quality of a source, which we call collision complexity. Our results show that the collision complexity tightly characterises the power of a (shared) randomness resource in the SMP model. Of independent interest is our demonstration that even the weakest sources of shared randomness can in some cases increase the power of SMP substantially: the equality function can be solved very efficiently with virtually any nontrivial shared randomness.
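
For context, the textbook SMP protocol for equality with perfect shared randomness can be sketched as below. This is the standard baseline, not the paper's construction for imperfect sources; the function names and the integer encoding of inputs are assumptions made for the example.

```python
import random

def parity_of_and(x: int, r: int) -> int:
    """Inner product over GF(2) of the bit strings encoded by x and r."""
    return bin(x & r).count("1") % 2

def smp_equality(x: int, y: int, n: int, k: int, rng: random.Random) -> bool:
    """Referee's verdict on whether the n-bit inputs x and y are equal."""
    # Shared randomness: k uniformly random n-bit strings seen by both players.
    shared = [rng.getrandbits(n) for _ in range(k)]
    alice_msg = [parity_of_and(x, r) for r in shared]  # k bits sent to the referee
    bob_msg = [parity_of_and(y, r) for r in shared]    # k bits sent to the referee
    # The referee accepts only if every pair of bits agrees.
    return alice_msg == bob_msg
```

Equal inputs are always accepted. Unequal inputs disagree on each random inner product with probability 1/2, so they are wrongly accepted with probability 2^(-k), using only k bits of communication per player.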

Preprint, 2014, arXiv

Information Causality, Szemerédi-Trotter and Algebraic Variants of CHSH

In this work, we consider the following family of two-prover one-round games. In the CHSH_q game, two parties are given x,y in F_q uniformly at random, and each must produce an output a,b in F_q without communicating with the other. The players' objective is to maximize the probability that their outputs satisfy a+b=xy in F_q. This game was introduced by Buhrman and Massar (PRA 2005) as a large alphabet generalization of the celebrated CHSH game, which is one of the most well-studied two-prover games in quantum information theory and has a large number of applications to quantum cryptography and quantum complexity. Our main contributions in this paper are the first asymptotic and explicit bounds on the entangled and classical values of CHSH_q, and the realization of a rather surprising connection between CHSH_q and geometric incidence theory. On the way to these results, we also resolve a problem of Pawlowski and Winter about pairwise independent Information Causality, which, besides being interesting on its own, gives as an application a short proof of our upper bound for the entangled value of CHSH_q.
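
To make the game concrete, the classical value of CHSH_q can be brute-forced for very small q. The sketch below assumes q is prime, so that F_q is arithmetic modulo q, and the function name is hypothetical. It enumerates deterministic strategies only, which suffices because shared randomness cannot beat the best deterministic strategy.

```python
from itertools import product

def chsh_classical_value(q: int) -> float:
    """Brute-force the best deterministic strategy for CHSH_q (q prime and small)."""
    best = 0
    # A deterministic strategy is a pair of answer tables a[x], b[y] with entries in Z_q.
    for a in product(range(q), repeat=q):
        for b in product(range(q), repeat=q):
            wins = sum((a[x] + b[y]) % q == (x * y) % q
                       for x in range(q) for y in range(q))
            best = max(best, wins)
    return best / (q * q)
```

For q = 2 this recovers the familiar classical CHSH value of 3/4. The search enumerates q^(2q) strategy pairs, so it is only practical for the smallest fields.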

Preprint, 2014, arXiv

On the sum of $L_1$ influences

For a function $f$ over the discrete cube, the total $L_1$ influence of $f$ is defined as $\sum_{i=1}^n \|\partial_i f\|_1$, where $\partial_i f$ denotes the discrete derivative of $f$ in the direction $i$. In this work, we show that the total $L_1$ influence of a $[-1,1]$-valued function $f$ can be upper bounded by a polynomial in the degree of $f$, resolving affirmatively an open problem of Aaronson and Ambainis (ITCS 2011). The main challenge here is that the $L_1$ influences do not admit an easy Fourier analytic representation. In our proof, we overcome this problem by introducing a new analytic quantity $\mathcal I_p(f)$, relating this new quantity to the total $L_1$ influence of $f$. This new quantity, which roughly corresponds to an average of the total $L_1$ influences of some ensemble of functions related to $f$, has the benefit of being much easier to analyze, allowing us to resolve the problem of Aaronson and Ambainis. We also give an application of the theorem to graph theory, and discuss the connection between the study of bounded functions over the cube and the quantum query complexity of partial functions where Aaronson and Ambainis encountered this question.
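
The quantity defined in this abstract can be computed by brute force for small $n$. The sketch below assumes the cube is $\{-1,1\}^n$, the norm is an expectation over uniform inputs, and the discrete derivative is normalized as half the difference across a flipped coordinate. The abstract does not fix these conventions, so treat them as assumptions.

```python
from itertools import product

def total_l1_influence(f, n: int) -> float:
    """Sum over i of E_x |d_i f(x)|, with d_i f(x) = (f(x) - f(x with bit i flipped)) / 2."""
    points = list(product((-1, 1), repeat=n))
    total = 0.0
    for i in range(n):
        for x in points:
            flipped = x[:i] + (-x[i],) + x[i + 1:]
            total += abs(f(x) - f(flipped)) / 2
    return total / len(points)
```

Under these conventions, parity on three bits has total $L_1$ influence 3, and majority on three bits has total $L_1$ influence 1.5.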

Preprint, 2014, arXiv

Tighter Relations Between Sensitivity and Other Complexity Measures

The sensitivity conjecture is a longstanding and fundamental open problem in the area of complexity measures of Boolean functions and decision tree complexity. The conjecture postulates that the maximum sensitivity of a Boolean function is polynomially related to other major complexity measures. Despite much attention to the problem and major advances in the analysis of Boolean functions in the past decade, the problem remains wide open with no positive result toward the conjecture since the work of Kenyon and Kutin from 2004. In this work, we present new upper bounds for various complexity measures in terms of sensitivity, improving the bounds provided by Kenyon and Kutin. Specifically, we show that $\deg(f)^{1-o(1)} = O(2^{s(f)})$ and $C(f) < 2^{s(f)-1} s(f)$; these in turn imply various corollaries regarding the relation between sensitivity and other complexity measures, such as block sensitivity, via known results. The gap between sensitivity and other complexity measures remains exponential, but these results are the first improvement for this difficult problem that has been achieved in a decade.
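
For readers unfamiliar with the measure, the sensitivity s(f) of a Boolean function can be computed by brute force for small n. This is a sketch of the standard definition only; the function name and the tuple encoding of inputs are assumptions made for the example.

```python
from itertools import product

def sensitivity(f, n: int) -> int:
    """Maximum over inputs x of the number of single-bit flips that change f(x)."""
    best = 0
    for x in product((0, 1), repeat=n):
        # Count the coordinates i whose flip changes the function value at x.
        count = sum(f(x) != f(x[:i] + (1 - x[i],) + x[i + 1:]) for i in range(n))
        best = max(best, count)
    return best
```

For example, OR on three bits has sensitivity 3 (at the all-zeros input), while majority on three bits has sensitivity 2.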

Preprint, 2013, arXiv

Weak Parity

We study the query complexity of Weak Parity: the problem of computing the parity of an n-bit input string, where one only has to succeed on a 1/2+eps fraction of input strings, but must do so with high probability on those inputs where one does succeed. It is well-known that n randomized queries and n/2 quantum queries are needed to compute parity on all inputs. But surprisingly, we give a randomized algorithm for Weak Parity that makes only O(n/log^0.246(1/eps)) queries, as well as a quantum algorithm that makes only O(n/sqrt(log(1/eps))) queries. We also prove a lower bound of Omega(n/log(1/eps)) in both cases; and using extremal combinatorics, prove lower bounds of Omega(log n) in the randomized case and Omega(sqrt(log n)) in the quantum case for any eps>0. We show that improving our lower bounds is intimately related to two longstanding open problems about Boolean functions: the Sensitivity Conjecture, and the relationships between query complexity and polynomial degree.
