Researcher profile

Afonso S. Bandeira

Afonso S. Bandeira contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
20works
0followers
19topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

20 published item(s)

preprint2022arXiv

A remark on Kashin's discrepancy argument and partial coloring in the Komlós conjecture

In this expository note, we discuss an early partial coloring result of B. Kashin [C. R. Acad. Bulgare Sci., 1985]. Although this result only implies Spencer's six standard deviations [Trans. Amer. Math. Soc., 1985] up to a $\log\log n$ factor, Kashin's argument gives a simple proof of the existence of a constant discrepancy partial coloring in the setup of Komlós conjecture.

preprint2022arXiv

Community Detection with a Subsampled Semidefinite Program

Semidefinite programming is an important tool to tackle several problems in data science and signal processing, including clustering and community detection. However, semidefinite programs are often slow in practice, so speed up techniques such as sketching are often considered. In the context of community detection in the stochastic block model, Mixon and Xie \cite{mixon2020sketching} have recently proposed a sketching framework in which a semidefinite program is solved only on a subsampled subgraph of the network, giving rise to significant computational savings. In this short paper, we provide a positive answer to a conjecture of Mixon and Xie about the statistical limits of this technique for the stochastic block model with two balanced communities.

preprint2022arXiv

Dual bounds for the positive definite functions approach to mutually unbiased bases

A long-standing open problem asks if there can exist 7 mutually unbiased bases (MUBs) in $\mathbb{C}^6$, or, more generally, $d + 1$ MUBs in $\mathbb{C}^d$ for any $d$ that is not a prime power. The recent work of Kolountzakis, Matolcsi, and Weiner (2016) proposed an application of the method of positive definite functions (a relative of Delsarte's method in coding theory and Lovász's semidefinite programming relaxation of the independent set problem) as a means of answering this question in the negative. Namely, they ask whether there exists a polynomial of a unitary matrix input satisfying various properties which, through the method of positive definite functions, would show the non-existence of 7 MUBs in $\mathbb{C}^6$. Using a convex duality argument, we prove that such a polynomial of degree at most 6 cannot exist. We also propose a general dual certificate which we conjecture to certify that this method can never show that there exist strictly fewer than $d + 1$ MUBs in $\mathbb{C}^d$.

preprint2022arXiv

Subexponential-Time Algorithms for Sparse PCA

We study the computational cost of recovering a unit-norm sparse principal component $x \in \mathbb{R}^n$ planted in a random matrix, in either the Wigner or Wishart spiked model (observing either $W + λxx^\top$ with $W$ drawn from the Gaussian orthogonal ensemble, or $N$ independent samples from $\mathcal{N}(0, I_n + βxx^\top)$, respectively). Prior work has shown that when the signal-to-noise ratio ($λ$ or $β\sqrt{N/n}$, respectively) is a small constant and the fraction of nonzero entries in the planted vector is $\|x\|_0 / n = ρ$, it is possible to recover $x$ in polynomial time if $ρ\lesssim 1/\sqrt{n}$. While it is possible to recover $x$ in exponential time under the weaker condition $ρ\ll 1$, it is believed that polynomial-time recovery is impossible unless $ρ\lesssim 1/\sqrt{n}$. We investigate the precise amount of time required for recovery in the "possible but hard" regime $1/\sqrt{n} \ll ρ\ll 1$ by exploring the power of subexponential-time algorithms, i.e., algorithms running in time $\exp(n^δ)$ for some constant $δ\in (0,1)$. For any $1/\sqrt{n} \ll ρ\ll 1$, we give a recovery algorithm with runtime roughly $\exp(ρ^2 n)$, demonstrating a smooth tradeoff between sparsity and runtime. Our family of algorithms interpolates smoothly between two existing algorithms: the polynomial-time diagonal thresholding algorithm and the $\exp(ρn)$-time exhaustive search algorithm. Furthermore, by analyzing the low-degree likelihood ratio, we give rigorous evidence suggesting that the tradeoff achieved by our algorithms is optimal.

preprint2020arXiv

Spectral Planting and the Hardness of Refuting Cuts, Colorability, and Communities in Random Graphs

We study the problem of efficiently refuting the k-colorability of a graph, or equivalently certifying a lower bound on its chromatic number. We give formal evidence of average-case computational hardness for this problem in sparse random regular graphs, showing optimality of a simple spectral certificate. This evidence takes the form of a computationally-quiet planting: we construct a distribution of d-regular graphs that has significantly smaller chromatic number than a typical regular graph drawn uniformly at random, while providing evidence that these two distributions are indistinguishable by a large class of algorithms. We generalize our results to the more general problem of certifying an upper bound on the maximum k-cut. This quiet planting is achieved by minimizing the effect of the planted structure (e.g. colorings or cuts) on the graph spectrum. Specifically, the planted structure corresponds exactly to eigenvectors of the adjacency matrix. This avoids the pushout effect of random matrix theory, and delays the point at which the planting becomes visible in the spectrum or local statistics. To illustrate this further, we give similar results for a Gaussian analogue of this problem: a quiet version of the spiked model, where we plant an eigenspace rather than adding a generic low-rank perturbation. Our evidence for computational hardness of distinguishing two distributions is based on three different heuristics: stability of belief propagation, the local statistics hierarchy, and the low-degree likelihood ratio. Of independent interest, our results include general-purpose bounds on the low-degree likelihood ratio for multi-spiked matrix models, and an improved low-degree analysis of the stochastic block model.

preprint2020arXiv

Spurious Valleys in Two-layer Neural Network Optimization Landscapes

Neural networks provide a rich class of high-dimensional, non-convex optimization problems. Despite their non-convexity, gradient-descent methods often successfully optimize these models. This has motivated a recent spur in research attempting to characterize properties of their loss surface that may explain such success. In this paper, we address this phenomenon by studying a key topological property of the loss: the presence or absence of spurious valleys, defined as connected components of sub-level sets that do not include a global minimum. Focusing on a class of two-layer neural networks defined by smooth (but generally non-linear) activation functions, we identify a notion of intrinsic dimension and show that it provides necessary and sufficient conditions for the absence of spurious valleys. More concretely, finite intrinsic dimension guarantees that for sufficiently overparametrised models no spurious valleys exist, independently of the data distribution. Conversely, infinite intrinsic dimension implies that spurious valleys do exist for certain data distributions, independently of model overparametrisation. Besides these positive and negative results, we show that, although spurious valleys may exist in general, they are confined to low risk levels and avoided with high probability on overparametrised models.

preprint2014arXiv

Compressive classification and the rare eclipse problem

This paper addresses the fundamental question of when convex sets remain disjoint after random projection. We provide an analysis using ideas from high-dimensional convex geometry. For ellipsoids, we provide a bound in terms of the distance between these ellipsoids and simple functions of their polynomial coefficients. As an application, this theorem provides bounds for compressive classification of convex sets. Rather than assuming that the data to be classified is sparse, our results show that the data can be acquired via very few measurements yet will remain linearly separable. We demonstrate the feasibility of this approach in the context of hyperspectral imaging.

preprint2014arXiv

Decoding binary node labels from censored edge measurements: Phase transition and efficient recovery

We consider the problem of clustering a graph $G$ into two communities by observing a subset of the vertex correlations. Specifically, we consider the inverse problem with observed variables $Y=B_G x \oplus Z$, where $B_G$ is the incidence matrix of a graph $G$, $x$ is the vector of unknown vertex variables (with a uniform prior) and $Z$ is a noise vector with Bernoulli$(\varepsilon)$ i.i.d. entries. All variables and operations are Boolean. This model is motivated by coding, synchronization, and community detection problems. In particular, it corresponds to a stochastic block model or a correlation clustering problem with two communities and censored edges. Without noise, exact recovery (up to global flip) of $x$ is possible if and only the graph $G$ is connected, with a sharp threshold at the edge probability $\log(n)/n$ for Erdős-Rényi random graphs. The first goal of this paper is to determine how the edge probability $p$ needs to scale to allow exact recovery in the presence of noise. Defining the degree (oversampling) rate of the graph by $α=np/\log(n)$, it is shown that exact recovery is possible if and only if $α>2/(1-2\varepsilon)^2+ o(1/(1-2\varepsilon)^2)$. In other words, $2/(1-2\varepsilon)^2$ is the information theoretic threshold for exact recovery at low-SNR. In addition, an efficient recovery algorithm based on semidefinite programming is proposed and shown to succeed in the threshold regime up to twice the optimal rate. For a deterministic graph $G$, defining the degree rate as $α=d/\log(n)$, where $d$ is the minimum degree of the graph, it is shown that the proposed method achieves the rate $α> 4((1+λ)/(1-λ)^2)/(1-2\varepsilon)^2+ o(1/(1-2\varepsilon)^2)$, where $1-λ$ is the spectral gap of the graph $G$.

preprint2014arXiv

Derandomizing restricted isometries via the Legendre symbol

The restricted isometry property (RIP) is an important matrix condition in compressed sensing, but the best matrix constructions to date use randomness. This paper leverages pseudorandom properties of the Legendre symbol to reduce the number of random bits in an RIP matrix with Bernoulli entries. In this regard, the Legendre symbol is not special---our main result naturally generalizes to any small-bias sample space. We also conjecture that no random bits are necessary for our Legendre symbol--based construction.

preprint2014arXiv

Open problem: Tightness of maximum likelihood semidefinite relaxations

We have observed an interesting, yet unexplained, phenomenon: Semidefinite programming (SDP) based relaxations of maximum likelihood estimators (MLE) tend to be tight in recovery problems with noisy data, even when MLE cannot exactly recover the ground truth. Several results establish tightness of SDP based relaxations in the regime where exact recovery from MLE is possible. However, to the best of our knowledge, their tightness is not understood beyond this regime. As an illustrative example, we focus on the generalized Procrustes problem.

preprint2013arXiv

A Cheeger Inequality for the Graph Connection Laplacian

The O(d) Synchronization problem consists of estimating a set of unknown orthogonal transformations O_i from noisy measurements of a subset of the pairwise ratios O_iO_j^{-1}. We formulate and prove a Cheeger-type inequality that relates a measure of how well it is possible to solve the O(d) synchronization problem with the spectra of an operator, the graph Connection Laplacian. We also show how this inequality provides a worst case performance guarantee for a spectral method to solve this problem.

preprint2013arXiv

Computation of sparse low degree interpolating polynomials and their application to derivative-free optimization

Interpolation-based trust-region methods are an important class of algorithms for Derivative-Free Optimization which rely on locally approximating an objective function by quadratic polynomial interpolation models, frequently built from less points than there are basis components. Often, in practical applications, the contribution of the problem variables to the objective function is such that many pairwise correlations between variables are negligible, implying, in the smooth case, a sparse structure in the Hessian matrix. To be able to exploit Hessian sparsity, existing optimization approaches require the knowledge of the sparsity structure. The goal of this paper is to develop and analyze a method where the sparse models are constructed automatically. The sparse recovery theory developed recently in the field of compressed sensing characterizes conditions under which a sparse vector can be accurately recovered from few random measurements. Such a recovery is achieved by minimizing the l1-norm of a vector subject to the measurements constraints. We suggest an approach for building sparse quadratic polynomial interpolation models by minimizing the l1-norm of the entries of the model Hessian subject to the interpolation conditions. We show that this procedure recovers accurate models when the function Hessian is sparse, using relatively few randomly selected sample points. Motivated by this result, we developed a practical interpolation-based trust-region method using deterministic sample sets and minimum l1-norm quadratic models. Our computational results show that the new approach exhibits a promising numerical performance both in the general case and in the sparse one.

preprint2013arXiv

Convergence of trust-region methods based on probabilistic models

In this paper we consider the use of probabilistic or random models within a classical trust-region framework for optimization of deterministic smooth general nonlinear functions. Our method and setting differs from many stochastic optimization approaches in two principal ways. Firstly, we assume that the value of the function itself can be computed without noise, in other words, that the function is deterministic. Secondly, we use random models of higher quality than those produced by usual stochastic gradient methods. In particular, a first order model based on random approximation of the gradient is required to provide sufficient quality of approximation with probability greater than or equal to 1/2. This is in contrast with stochastic gradient approaches, where the model is assumed to be "correct" only in expectation. As a result of this particular setting, we are able to prove convergence, with probability one, of a trust-region method which is almost identical to the classical method. Hence we show that a standard optimization framework can be used in cases when models are random and may or may not provide good approximations, as long as "good" models are more likely than "bad" models. Our results are based on the use of properties of martingales. Our motivation comes from using random sample sets and interpolation models in derivative-free optimization. However, our framework is general and can be applied with any source of uncertainty in the model. We discuss various applications for our methods in the paper.

preprint2013arXiv

Multireference Alignment using Semidefinite Programming

The multireference alignment problem consists of estimating a signal from multiple noisy shifted observations. Inspired by existing Unique-Games approximation algorithms, we provide a semidefinite program (SDP) based relaxation which approximates the maximum likelihood estimator (MLE) for the multireference alignment problem. Although we show that the MLE problem is Unique-Games hard to approximate within any constant, we observe that our poly-time approximation algorithm for the MLE appears to perform quite well in typical instances, outperforming existing methods. In an attempt to explain this behavior we provide stability guarantees for our SDP under a random noise model on the observations. This case is more challenging to analyze than traditional semi-random instances of Unique-Games: the noise model is on vertices of a graph and translates into dependent noise on the edges. Interestingly, we show that if certain positivity constraints in the SDP are dropped, its solution becomes equivalent to performing phase correlation, a popular method used for pairwise alignment in imaging applications. Finally, we show how symmetry reduction techniques from matrix representation theory can simplify the analysis and computation of the SDP, greatly decreasing its computational cost.

preprint2013arXiv

On partial sparse recovery

We consider the problem of recovering a partially sparse solution of an underdetermined system of linear equations by minimizing the $\ell_1$-norm of the part of the solution vector which is known to be sparse. Such a problem is closely related to a classical problem in Compressed Sensing where the $\ell_1$-norm of the whole solution vector is minimized. We introduce analogues of restricted isometry and null space properties for the recovery of partially sparse vectors and show that these new properties are implied by their original counterparts. We show also how to extend recovery under noisy measurements to the partially sparse case.

preprint2013arXiv

Phase retrieval from power spectra of masked signals

In diffraction imaging, one is tasked with reconstructing a signal from its power spectrum. To resolve the ambiguity in this inverse problem, one might invoke prior knowledge about the signal, but phase retrieval algorithms in this vein have found limited success. One alternative is to create redundancy in the measurement process by illuminating the signal multiple times, distorting the signal each time with a different mask. Despite several recent advances in phase retrieval, the community has yet to construct an ensemble of masks which uniquely determines all signals and admits an efficient reconstruction algorithm. In this paper, we leverage the recently proposed polarization method to construct such an ensemble. We also present numerical simulations to illustrate the stability of the polarization method in this setting. In comparison to a state-of-the-art phase retrieval algorithm known as PhaseLift, we find that polarization is much faster with comparable stability.

preprint2013arXiv

Phase retrieval with polarization

In many areas of imaging science, it is difficult to measure the phase of linear measurements. As such, one often wishes to reconstruct a signal from intensity measurements, that is, perform phase retrieval. In this paper, we provide a novel measurement design which is inspired by interferometry and exploits certain properties of expander graphs. We also give an efficient phase retrieval procedure, and use recent results in spectral graph theory to produce a stable performance guarantee which rivals the guarantee for PhaseLift in [Candes et al. 2011]. We use numerical simulations to illustrate the performance of our phase retrieval procedure, and we compare reconstruction error and runtime with a common alternating-projections-type procedure.

preprint2013arXiv

Saving phase: Injectivity and stability for phase retrieval

Recent advances in convex optimization have led to new strides in the phase retrieval problem over finite-dimensional vector spaces. However, certain fundamental questions remain: What sorts of measurement vectors uniquely determine every signal up to a global phase factor, and how many are needed to do so? Furthermore, which measurement ensembles lend stability? This paper presents several results that address each of these questions. We begin by characterizing injectivity, and we identify that the complement property is indeed a necessary condition in the complex case. We then pose a conjecture that 4M-4 generic measurement vectors are both necessary and sufficient for injectivity in M dimensions, and we prove this conjecture in the special cases where M=2,3. Next, we shift our attention to stability, both in the worst and average cases. Here, we characterize worst-case stability in the real case by introducing a numerical version of the complement property. This new property bears some resemblance to the restricted isometry property of compressed sensing and can be used to derive a sharp lower Lipschitz bound on the intensity measurement mapping. Localized frames are shown to lack this property (suggesting instability), whereas Gaussian random measurements are shown to satisfy this property with high probability. We conclude by presenting results that use a stochastic noise model in both the real and complex cases, and we leverage Cramer-Rao lower bounds to identify stability with stronger versions of the injectivity characterizations.

preprint2012arXiv

The road to deterministic matrices with the restricted isometry property

The restricted isometry property (RIP) is a well-known matrix condition that provides state-of-the-art reconstruction guarantees for compressed sensing. While random matrices are known to satisfy this property with high probability, deterministic constructions have found less success. In this paper, we consider various techniques for demonstrating RIP deterministically, some popular and some novel, and we evaluate their performance. In evaluating some techniques, we apply random matrix theory and inadvertently find a simple alternative proof that certain random matrices are RIP. Later, we propose a particular class of matrices as candidates for being RIP, namely, equiangular tight frames (ETFs). Using the known correspondence between real ETFs and strongly regular graphs, we investigate certain combinatorial implications of a real ETF being RIP. Specifically, we give probabilistic intuition for a new bound on the clique number of Paley graphs of prime order, and we conjecture that the corresponding ETFs are RIP in a manner similar to random matrices.

preprint2011arXiv

Landau's necessary density conditions for the Hankel transform

We will prove an analogue of Landau's necessary conditions [Necessary density conditions for sampling and interpolation of certain entire functions, Acta Math. 117 (1967).] for spaces of functions whose Hankel transform is supported in a measurable subset S of the positive semi-axis. As a special case, necessary density conditions for the existence of Fourier-Bessel frames are obtained.