Source author record

Alexandre d'Aspremont

Alexandre d'Aspremont appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.OC Machine Learning math.NA Artificial Intelligence math.ST Numerical Analysis Statistics Theory Applications Data Structures and Algorithms math.PR Methodology physics.ao-ph physics.optics q-fin.ST

Catalog footprint

What is connected

33 works

14 topics

4 close collaborators


preprint, 2021, arXiv

A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention

We address the problem of learning on sets of features, motivated by the need of performing pooling operations in long biological sequences of varying sizes, with long-range dependencies, and possibly few labeled data. To address this challenging task, we introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference. Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost. Our aggregation technique admits two useful interpretations: it may be seen as a mechanism related to attention layers in neural networks, or it may be seen as a scalable surrogate of a classical optimal transport-based kernel. We experimentally demonstrate the effectiveness of our approach on biological sequences, achieving state-of-the-art results for protein fold recognition and detection of chromatin profiles tasks, and, as a proof of concept, we show promising results for processing natural language sequences. We provide an open-source implementation of our embedding that can be used alone or as a module in larger learning models at https://github.com/claying/OTK.

preprint, 2021, arXiv

Approximation Bounds for Sparse Programs

We show that sparsity constrained optimization problems over low dimensional spaces tend to have a small duality gap. We use the Shapley-Folkman theorem to derive both data-driven bounds on the duality gap, and an efficient primalization procedure to recover feasible points satisfying these bounds. These error bounds are proportional to the rate of growth of the objective with the target cardinality, which means in particular that the relaxation is nearly tight as soon as the target cardinality is large enough so that only uninformative features are added.

preprint, 2021, arXiv

Global Assessment of Oil and Gas Methane Ultra-Emitters

Methane emissions from oil and gas (O&G) production and transmission represent a significant contribution to climate change. These emissions comprise sporadic releases of large amounts of methane during maintenance operations or equipment failures not accounted for in current inventory estimates. We collected and analyzed hundreds of very large releases from atmospheric methane images sampled by the TROPOspheric Monitoring Instrument (TROPOMI) over 2019 and 2020 to quantify emissions from O&G ultra-emitters. Ultra-emitters are primarily detected over the largest O&G basins of the world, following a power-law relationship with noticeable variations across countries but similar regression slopes. With a total contribution equivalent to 8-12% of the global O&G production methane emissions, mitigation of ultra-emitters is largely achievable at low costs and would lead to robust net benefits in billions of US dollars for the six major producing countries when incorporating recent estimates of societal costs of methane.

preprint, 2021, arXiv

Local and Global Uniform Convexity Conditions

We review various characterizations of uniform convexity and smoothness on norm balls in finite-dimensional spaces and connect results stemming from the geometry of Banach spaces with scaling inequalities used in analysing the convergence of optimization methods. In particular, we establish local versions of these conditions to provide sharper insights on a recent body of complexity results in learning theory, online learning, or offline optimization, which rely on the strong convexity of the feasible set. While they have a significant impact on complexity, these strong convexity or uniform convexity properties of feasible sets are not exploited as thoroughly as their functional counterparts, and this work is an effort to correct this imbalance. We conclude with some practical examples in optimization and machine learning where leveraging these conditions and localized assumptions leads to new complexity results.

preprint, 2021, arXiv

Optimal Complexity and Certification of Bregman First-Order Methods

We provide a lower bound showing that the $O(1/k)$ convergence rate of the NoLips method (a.k.a. Bregman Gradient) is optimal for the class of functions satisfying the $h$-smoothness assumption. This assumption, also known as relative smoothness, appeared in the recent developments around the Bregman Gradient method, where acceleration remained an open issue. On the way, we show how to constructively obtain the corresponding worst-case functions by extending the computer-assisted performance estimation framework of Drori and Teboulle (Mathematical Programming, 2014) to Bregman first-order methods, and to handle the classes of differentiable and strictly convex functions.

preprint, 2021, arXiv

Quartic First-Order Methods for Low-Rank Minimization

We study a generalized nonconvex Burer-Monteiro formulation for low-rank minimization problems. We use recent results on non-Euclidean first order methods to provide efficient and scalable algorithms. Our approach uses geometries induced by quartic kernels on matrix spaces; for unconstrained cases we introduce a novel family of Gram kernels that considerably improves numerical performances. Numerical experiments for Euclidean distance matrix completion and symmetric nonnegative matrix factorization show that our algorithms scale well and reach state of the art performance when compared to specialized methods.

preprint, 2020, arXiv

Complexity Guarantees for Polyak Steps with Momentum

In smooth strongly convex optimization, knowledge of the strong convexity parameter is critical for obtaining simple methods with accelerated rates. In this work, we study a class of methods, based on Polyak steps, where this knowledge is substituted by that of the optimal value, $f_*$. We first show convergence bounds that slightly improve on those previously known for the classical case of simple gradient descent with Polyak steps. We then derive an accelerated gradient method with Polyak steps and momentum, along with convergence guarantees.
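As a rough illustration of the classical case mentioned in this abstract, the sketch below runs plain gradient descent with the textbook Polyak step, $\gamma_k = (f(x_k) - f_*) / \|\nabla f(x_k)\|^2$. It is not the accelerated method from the preprint, and the exact step scaling used there may differ. The function names, the quadratic test problem and the stopping tolerance are all assumptions made for this example.

```python
def polyak_gradient_descent(f, grad, x0, f_star, max_iter=2000, tol=1e-14):
    """Gradient descent with the classical Polyak step size.

    The step uses the known optimal value f_star instead of a
    smoothness or strong convexity constant.
    """
    x = list(x0)
    for _ in range(max_iter):
        gap = f(x) - f_star
        if gap <= tol:
            break  # close enough to the optimal value
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq == 0.0:
            break  # stationary point, nothing left to do
        step = gap / g_norm_sq  # classical Polyak step
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical test problem: an ill-conditioned quadratic with f_* = 0.
def quad(x):
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def quad_grad(x):
    return [x[0], 10.0 * x[1]]


x_final = polyak_gradient_descent(quad, quad_grad, [1.0, 1.0], f_star=0.0)
```

The only problem-specific input is `f_star`; no Lipschitz or strong convexity constant is passed in, which is the trade described in the abstract.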

preprint, 2020, arXiv

FANOK: Knockoffs in Linear Time

We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems. Identifying the knockoff distribution requires solving a large scale semidefinite program for which we derive several efficient methods. One handles generic covariance matrices, has a complexity scaling as $O(p^3)$ where $p$ is the ambient dimension, while another assumes a rank $k$ factor model on the covariance matrix to reduce this complexity bound to $O(pk^2)$. We also derive efficient procedures to both estimate factor models and sample knockoff covariates with complexity linear in the dimension. We test our methods on problems with $p$ as large as $500,000$.

preprint, 2020, arXiv

Global Convergence of Frank Wolfe on One Hidden Layer Networks

We derive global convergence bounds for the Frank Wolfe algorithm when training one hidden layer neural networks. When using the ReLU activation function, and under tractable preconditioning assumptions on the sample data set, the linear minimization oracle used to incrementally form the solution can be solved explicitly as a second order cone program. The classical Frank Wolfe algorithm then converges with rate $O(1/T)$ where $T$ is both the number of neurons and the number of calls to the oracle.

preprint, 2020, arXiv

Projection-Free Optimization on Uniformly Convex Sets

The Frank-Wolfe method solves smooth constrained convex optimization problems at a generic sublinear rate of $\mathcal{O}(1/T)$, and it (or its variants) enjoys accelerated convergence rates for two fundamental classes of constraints: polytopes and strongly-convex sets. Uniformly convex sets non-trivially subsume strongly convex sets and form a large variety of curved convex sets commonly encountered in machine learning and signal processing. For instance, the $\ell_p$-balls are uniformly convex for all $p > 1$, but strongly convex for $p \in (1,2]$ only. We show that these sets systematically induce accelerated convergence rates for the original Frank-Wolfe algorithm, which continuously interpolate between known rates. Our accelerated convergence rates emphasize that it is the curvature of the constraint sets -- not just their strong convexity -- that leads to accelerated convergence rates. These results also importantly highlight that the Frank-Wolfe algorithm is adaptive to much more generic constraint set structures, thus explaining faster empirical convergence. Finally, we also show accelerated convergence rates when the set is only locally uniformly convex and provide similar results in online linear optimization.
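For readers unfamiliar with the method, here is a minimal sketch of the original Frank-Wolfe iteration on the simplest curved set named above, an $\ell_2$ ball. It uses the standard open-loop step $2/(k+2)$ and the closed-form linear minimization oracle for the ball. The objective, the vector `b` and all function names are assumptions for this example, and the code makes no attempt to reproduce the accelerated rates analysed in the preprint.

```python
import math


def frank_wolfe_l2_ball(grad, x0, radius=1.0, num_iters=1000):
    """Vanilla Frank-Wolfe over the Euclidean ball of the given radius."""
    x = list(x0)
    for k in range(num_iters):
        g = grad(x)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm == 0.0:
            break  # unconstrained minimizer reached
        # Linear minimization oracle on the ball: argmin_{||s|| <= r} <g, s>.
        s = [-radius * gi / g_norm for gi in g]
        gamma = 2.0 / (k + 2.0)  # standard open-loop step size
        # Convex combination keeps the iterate inside the ball (no projection).
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


# Hypothetical problem: project b onto the unit ball, b lies outside the ball.
b = [2.0, 1.0]


def objective(x):
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))


def objective_grad(x):
    return [xi - bi for xi, bi in zip(x, b)]


x_fw = frank_wolfe_l2_ball(objective_grad, [0.0, 0.0])
f_opt = 0.5 * (math.sqrt(5.0) - 1.0) ** 2  # value at b / ||b||
```

Because `b` lies outside the ball, the gradient never vanishes on the feasible set, which is the regime where the curvature of the set matters for the convergence rate.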

preprint, 2020, arXiv

Ranking and synchronization from pairwise measurements via SVD

Given a measurement graph $G= (V,E)$ and an unknown signal $r \in \mathbb{R}^n$, we investigate algorithms for recovering $r$ from pairwise measurements of the form $r_i - r_j$; $\{i,j\} \in E$. This problem arises in a variety of applications, such as ranking teams in sports data and time synchronization of distributed networks. Framed in the context of ranking, the task is to recover the ranking of $n$ teams (induced by $r$) given a small subset of noisy pairwise rank offsets. We propose a simple SVD-based algorithmic pipeline for both the problem of time synchronization and ranking. We provide a detailed theoretical analysis in terms of robustness against both sampling sparsity and noise perturbations with outliers, using results from matrix perturbation and random matrix theory. Our theoretical findings are complemented by a detailed set of numerical experiments on both synthetic and real data, showcasing the competitiveness of our proposed algorithms with other state-of-the-art methods.

preprint, 2020, arXiv

Regularity as Regularization: Smooth and Strongly Convex Brenier Potentials in Optimal Transport

Estimating Wasserstein distances between two high-dimensional densities suffers from the curse of dimensionality: one needs an exponential (with respect to dimension) number of samples to ensure that the distance between two empirical measures is comparable to the distance between the original densities. Therefore, optimal transport (OT) can only be used in machine learning if it is substantially regularized. On the other hand, one of the greatest achievements of the OT literature in recent years lies in regularity theory: Caffarelli showed that the OT map between two well-behaved measures is Lipschitz, or equivalently when considering 2-Wasserstein distances, that Brenier convex potentials (whose gradient yields an optimal map) are smooth. We propose in this work to draw inspiration from this theory and use regularity as a regularization tool. We give algorithms operating on two discrete measures that can recover nearly optimal transport maps with small distortion, or equivalently, nearly optimal Brenier potentials that are strongly convex and smooth. The problem boils down to solving alternately a convex QCQP and a discrete OT problem, granting access to the values and gradients of the Brenier potential not only on sampled points, but also out of sample at the cost of solving a simpler QCQP for each evaluation. We propose algorithms to estimate and evaluate transport maps with desired regularity properties, benchmark their statistical performance, apply them to domain adaptation and visualize their action on a color transfer task.

preprint, 2020, arXiv

Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions

We design simple screening tests to automatically discard data samples in empirical risk minimization without losing optimization guarantees. We derive loss functions that produce dual objectives with a sparse solution. We also show how to regularize convex losses to ensure such a dual sparsity-inducing property, and propose a general method to design screening tests for classification or regression based on ellipsoidal approximations of the optimal set. In addition to producing computational gains, our approach also allows us to compress a dataset into a subset of representative points.

preprint, 2016, arXiv

An Optimal Affine Invariant Smooth Minimization Algorithm

We formulate an affine invariant implementation of the accelerated first-order algorithm in Nesterov (1983). Its complexity bound is proportional to an affine invariant regularity constant defined with respect to the Minkowski gauge of the feasible set. We extend these results to more general problems, optimizing Hölder smooth functions using $p$-uniformly convex prox terms, and derive an algorithm whose complexity better fits the geometry of the feasible set and adapts to both the best Hölder smoothness parameter and the best gradient Lipschitz constant. Finally, we detail matching complexity lower bounds when the feasible set is an $\ell_p$ ball. In this setting, our upper bounds on iteration complexity for the algorithm in Nesterov (1983) are thus optimal in terms of target precision, smoothness and problem dimension.

preprint, 2016, arXiv

Learning with Clustering Structure

We study supervised learning problems using clustering constraints to impose structure on either features or samples, seeking to help both prediction and interpretation. The problem of clustering features arises naturally in text classification for instance, to reduce dimensionality by grouping words together and identify synonyms. The sample clustering problem on the other hand, applies to multiclass problems where we are allowed to make multiple predictions and the performance of the best answer is recorded. We derive a unified optimization formulation highlighting the common structure of these problems and produce algorithms whose core iteration complexity amounts to a k-means clustering step, which can be approximated efficiently. We extend these results to combine sparsity and clustering constraints, and develop a new projection algorithm on the set of clustered sparse vectors. We prove convergence of our algorithms on random instances, based on a union of subspaces interpretation of the clustering structure. Finally, we test the robustness of our methods on artificial data sets as well as real data extracted from movie reviews.

preprint, 2016, arXiv

Spectral Ranking using Seriation

We describe a seriation algorithm for ranking a set of items given pairwise comparisons between these items. Intuitively, the algorithm assigns similar rankings to items that compare similarly with all others. It does so by constructing a similarity matrix from pairwise comparisons, using seriation methods to reorder this matrix and construct a ranking. We first show that this spectral seriation algorithm recovers the true ranking when all pairwise comparisons are observed and consistent with a total order. We then show that ranking reconstruction is still exact when some pairwise comparisons are corrupted or missing, and that seriation based spectral ranking is more robust to noise than classical scoring methods. Finally, we bound the ranking error when only a random subset of the comparisons are observed. An additional benefit of the seriation formulation is that it allows us to solve semi-supervised ranking problems. Experiments on both synthetic and real datasets demonstrate that seriation based spectral ranking achieves competitive and in some cases superior performance compared to classical ranking methods.

preprint, 2015, arXiv

Coherent Diffractive Imaging Using Randomly Coded Masks

Coherent diffractive imaging (CDI) provides new opportunities for high resolution X-ray imaging with simultaneous amplitude and phase contrast. Extensions to CDI broaden the scope of the technique for use in a wide variety of experimental geometries and physical systems. Here, we experimentally demonstrate a new extension to CDI that encodes additional information through the use of a series of randomly coded masks. The information gained from the few additional diffraction measurements removes the need for typical object-domain constraints; the algorithm uses prior information about the masks instead. The experiment is performed using a laser diode at 532.2 nm, enabling rapid prototyping for future X-ray synchrotron and even free electron laser experiments. Diffraction patterns are collected with up to 15 different masks placed between a CCD detector and a single sample. Phase retrieval is performed using a convex relaxation routine known as "PhaseCut" followed by a variation on Fienup's input-output algorithm. The reconstruction quality is judged via calculation of phase retrieval transfer functions as well as by an object-space comparison between reconstructions and a lens-based image of the sample. The results of this analysis indicate that with enough masks (in this case 3 or 4) the diffraction phases converge reliably, implying stability and uniqueness of the retrieved solution.

preprint, 2015, arXiv

Convex Relaxations for Permutation Problems

Seriation seeks to reconstruct a linear order between variables using unsorted, pairwise similarity information. It has direct applications in archeology and shotgun gene sequencing for example. We write seriation as an optimization problem by proving the equivalence between the seriation and combinatorial 2-SUM problems on similarity matrices (2-SUM is a quadratic minimization problem over permutations). The seriation problem can be solved exactly by a spectral algorithm in the noiseless case and we derive several convex relaxations for 2-SUM to improve the robustness of seriation solutions in noisy settings. These convex relaxations also allow us to impose structural constraints on the solution, hence solve semi-supervised seriation problems. We derive new approximation bounds for some of these relaxations and present numerical experiments on archeological data, Markov chains and DNA assembly from shotgun gene sequencing data.

preprint, 2015, arXiv

Mean-Reverting Portfolios: Tradeoffs Between Sparsity and Volatility

Mean-reverting assets are one of the holy grails of financial markets: if such assets existed, they would provide trivially profitable investment strategies for any investor able to trade them, thanks to the knowledge that such assets oscillate predictably around their long term mean. The modus operandi of cointegration-based trading strategies [Tsay, 2005, §8] is to first create a portfolio of assets whose aggregate value mean-reverts, and then to exploit that knowledge by selling short or buying that portfolio when its value deviates from its long-term mean. Such portfolios are typically selected using tools from cointegration theory [Engle and Granger, 1987, Johansen, 1991], whose aim is to detect combinations of assets that are stationary, and therefore mean-reverting. We argue in this work that focusing on stationarity only may not suffice to ensure profitability of cointegration-based strategies. While it might be possible to create synthetically, using a large array of financial assets, a portfolio whose aggregate value is stationary and therefore mean-reverting, trading such a large portfolio incurs significant trading or borrowing costs in practice. Looking for stationary portfolios formed by many assets may also result in portfolios that have a very small volatility and which require significant leverage to be profitable. We study in this work algorithmic approaches that can mitigate these effects by searching for maximally mean-reverting portfolios which are sufficiently sparse and/or volatile.

preprint, 2014, arXiv

A Stochastic Smoothing Algorithm for Semidefinite Programming

We use a rank one Gaussian perturbation to derive a smooth stochastic approximation of the maximum eigenvalue function. We then combine this smoothing result with an optimal smooth stochastic optimization algorithm to produce an efficient method for solving maximum eigenvalue minimization problems. We show that the complexity of this new method is lower than that of deterministic smoothing algorithms in certain precision/dimension regimes.

preprint, 2014, arXiv

Phase retrieval for imaging problems

We study convex relaxation algorithms for phase retrieval on imaging problems. We show that structural assumptions on the signal and the observations, such as sparsity, smoothness or positivity, can be exploited to both speed-up convergence and improve recovery performance. We detail experimental results in molecular imaging problems simulated from PDB data.

preprint, 2013, arXiv

Phase Recovery, MaxCut and Complex Semidefinite Programming

Phase retrieval seeks to recover a signal $x$ from the amplitude $|Ax|$ of linear measurements. We cast the phase retrieval problem as a non-convex quadratic program over a complex phase vector and formulate a tractable relaxation (called PhaseCut) similar to the classical MaxCut semidefinite program. We solve this problem using a provably convergent block coordinate descent algorithm whose structure is similar to that of the original greedy algorithm in Gerchberg-Saxton, where each iteration is a matrix vector product. Numerical results show the performance of this approach over three different phase retrieval problems, in comparison with greedy phase retrieval algorithms and matrix completion formulations.

preprint, 2012, arXiv

Approximation Bounds for Sparse Principal Component Analysis

We produce approximation bounds on a semidefinite programming relaxation for sparse principal component analysis. These bounds control approximation ratios for tractable statistics in hypothesis testing problems where data points are sampled from Gaussian models with a single sparse leading component.

preprint, 2012, arXiv

Convex Algorithms for Nonnegative Matrix Factorization

We derive approximation algorithms for the nonnegative matrix factorization problem, i.e. the problem of factorizing a matrix as the product of two matrices with nonnegative coefficients. We form convex approximations of this problem which can be solved efficiently and test our algorithms on some classic numerical examples.

preprint, 2012, arXiv

Weak Recovery Conditions from Graph Partitioning Bounds and Order Statistics

We study a weaker formulation of the nullspace property which guarantees recovery of sparse signals from linear measurements by $\ell_1$ minimization. We require this condition to hold only with high probability, given a distribution on the nullspace of the coding matrix $A$. Under some assumptions on the distribution of the reconstruction error, we show that testing these weak conditions means bounding the optimal value of two classical graph partitioning problems: the k-Dense-Subgraph and MaxCut problems. Both problems admit efficient, relatively tight relaxations and we use a randomization argument to produce new approximation bounds for k-Dense-Subgraph. We test the performance of our results on several families of coding matrices.

preprint, 2011, arXiv

Sparse Recovery, Kashin Decomposition and Conic Programming

We produce relaxation bounds on the diameter of arbitrary sections of the $\ell_1$ ball in $\mathbb{R}^n$. We use these results to test conditions for sparse recovery.

preprint, 2011, arXiv

Subsampling Algorithms for Semidefinite Programming

We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration and the subsampling ratio explicitly controls granularity, i.e. the tradeoff between cost per iteration and total number of iterations. Furthermore, the total computational cost is directly proportional to the complexity (i.e. rank) of the solution. We study numerical performance on some large-scale problems arising in statistical learning.

preprint, 2010, arXiv

A Pathwise Algorithm for Covariance Selection

Covariance selection seeks to estimate a covariance matrix by maximum likelihood while restricting the number of nonzero inverse covariance matrix coefficients. A single penalty parameter usually controls the tradeoff between log likelihood and sparsity in the inverse matrix. We describe an efficient algorithm for computing a full regularization path of solutions to this problem.
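The abstract does not state the optimization problem explicitly. One common way to write a penalized covariance selection problem of this kind, given here as an assumption rather than as the preprint's exact formulation, replaces the count of nonzero coefficients with an $\ell_1$ penalty. The symbols are chosen for this illustration: $S$ is the sample covariance matrix, $X$ the estimated inverse covariance matrix, and $\rho \ge 0$ the single penalty parameter.

```latex
\[
\max_{X \succ 0} \;\; \log\det X \;-\; \operatorname{Tr}(SX) \;-\; \rho \, \|X\|_1,
\qquad \text{where } \|X\|_1 = \sum_{i,j} |X_{ij}| .
\]
```

In this form, a regularization path is the family of solutions $X(\rho)$ obtained as $\rho$ varies, with larger values of $\rho$ favouring sparser inverse covariance estimates.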

preprint, 2010, arXiv

Convex Relaxations for Subset Selection

We use convex relaxation techniques to produce lower bounds on the optimal value of subset selection problems and generate good approximate solutions. We then explicitly bound the quality of these relaxations by studying the approximation ratio of sparse eigenvalue relaxations. Our results are used to improve the performance of branch-and-bound algorithms to produce exact solutions to subset selection problems.

preprint, 2010, arXiv

Second order accurate distributed eigenvector computation for extremely large matrices

We propose a second-order accurate method to estimate the eigenvectors of extremely large matrices thereby addressing a problem of relevance to statisticians working in the analysis of very large datasets. More specifically, we show that averaging eigenvectors of randomly subsampled matrices efficiently approximates the true eigenvectors of the original matrix under certain conditions on the incoherence of the spectral decomposition. This incoherence assumption is typically milder than those made in matrix completion and allows eigenvectors to be sparse. We discuss applications to spectral methods in dimensionality reduction and information retrieval.

preprint, 2010, arXiv

Sparse PCA: Convex Relaxations, Algorithms and Applications

Given a sample covariance matrix, we examine the problem of maximizing the variance explained by a linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This is known as sparse principal component analysis and has a wide array of applications in machine learning and engineering. Unfortunately, this problem is also combinatorially hard and we discuss convex relaxation techniques that efficiently produce good approximate solutions. We then describe several algorithms solving these relaxations as well as greedy algorithms that iteratively improve the solution quality. Finally, we illustrate sparse PCA in several applications, ranging from senate voting and finance to news data.

preprint, 2010, arXiv

Testing the Nullspace Property using Semidefinite Programming

Recent results in compressed sensing show that, under certain conditions, the sparsest solution to an underdetermined set of linear equations can be recovered by solving a linear program. These results either rely on computing sparse eigenvalues of the design matrix or on properties of its nullspace. So far, no tractable algorithm is known to test these conditions and most current results rely on asymptotic properties of random matrices. Given a matrix A, we use semidefinite relaxation techniques to test the nullspace property on A and show on some numerical examples that these relaxation bounds can prove perfect recovery of sparse solutions with relatively high cardinality.

preprint, 2007, arXiv

Optimal Solutions for Sparse Principal Component Analysis

Given a sample covariance matrix, we examine the problem of maximizing the variance explained by a linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This is known as sparse principal component analysis and has a wide array of applications in machine learning and engineering. We formulate a new semidefinite relaxation to this problem and derive a greedy algorithm that computes a full set of good solutions for all target numbers of nonzero coefficients, with total complexity $O(n^3)$, where $n$ is the number of variables. We then use the same relaxation to derive sufficient conditions for global optimality of a solution, which can be tested in $O(n^3)$ per pattern. We discuss applications in subset selection and sparse recovery and show on artificial examples and biological data that our algorithm does provide globally optimal solutions in many cases.
