Source author record

Coralia Cartis

Coralia Cartis appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.OC Information Theory Machine Learning math.IT math.NA

Catalog footprint

What is connected

15works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

A Parameter-Free First-Order Algorithm for Non-Convex Optimization with $\tilde{\mkern1mu O}(ε^{-5/3})$ Global Rate

We introduce PF-AGD, the first parameter-free, deterministic, accelerated first-order method to achieve $O(ε^{-5/3}\log(1/ε))$ oracle complexity bound when minimizing sufficiently smooth, non-convex functions; this is the best-known bound for first-order methods on smooth non-convex objectives. Unlike existing methods possessing this rate that require a priori knowledge of smoothness constants, we use an adaptive backtracking scheme and a gradient-based restart mechanism to estimate local curvature. This yields a practical algorithm that matches best-known theoretical rates. Empirically, PF-AGD outperforms the practical variant of AGD-Until-Guilty (Carmon et al., 2017), as well as other parameter-free variants, and is a viable alternative to nonlinear conjugate gradient methods.

preprint2026arXiv

Efficient Techniques for Data Reconstruction, with Finite-Width Recovery Guarantees

Data reconstruction attacks on trained neural networks aim to recover the data on which the network has been trained and pose a significant threat to privacy, especially if the training dataset contains sensitive information. Here, we propose a unified optimization formulation of the data reconstruction problem based on initial and trained parameter values, incorporating state-of-the-art proposals. We show that in the random feature model, this formulation provably leads to training data reconstruction with high probability, provided the network width is sufficiently large; this unprecedented finite-width result uses PAC-style bounds. Furthermore, when the data lies in a low-dimensional subspace, we show that the network width requirement for successful reconstruction can be relaxed, with bounds depending on the subspace dimension rather than the ambient dimension. For general neural network models and unknown data orientations, we propose an efficient reconstruction algorithm that approximates the low-dimensional data subspace through the change in the first-layer weights during training and uses only the last-layer weights for reconstruction, thus reducing the search space dimension and the required network width for high-quality reconstructions. Our numerical experiments on synthetic datasets and CIFAR-10 confirm that our subspace-aware reconstruction approach outperforms standard full-space techniques.

preprint2026arXiv

Quadratic Objective Perturbation: Curvature-Based Differential Privacy

Objective perturbation is a standard mechanism in differentially private empirical risk minimization. In particular, Linear Objective Perturbation (LOP) enforces privacy by adding a random linear term, while strong convexity and stability are ensured by an additional deterministic quadratic term. However, this approach requires the strong assumption of bounded gradients of the loss function, which excludes many modern machine learning models. In this work, we introduce Quadratic Objective Perturbation (QOP), which perturbs the objective with a random quadratic form. This perturbation induces strong convexity and enforces stability of the problem through curvature, thereby enabling privacy and allowing sensitivity to be controlled through spectral properties of the perturbation rather than assumptions on the gradients. As a result, we obtain $(\varepsilon, δ)$-differential privacy under weaker assumptions, in the interpolation regime. Furthermore, we extend the analysis to account for approximate solutions, showing that privacy guarantees are preserved under inexact solves. Additionally, we derive utility guarantees in terms of empirical excess risk, and provide a theoretical and numerical comparison to LOP, highlighting the advantages of curvature-based perturbations. Finally, we discuss algorithmic aspects and show that the resulting problems can be solved efficiently using modern splitting schemes.

preprint2021arXiv

Scalable Subspace Methods for Derivative-Free Nonlinear Least-Squares Optimization

We introduce a general framework for large-scale model-based derivative-free optimization based on iterative minimization within random subspaces. We present a probabilistic worst-case complexity analysis for our method, where in particular we prove high-probability bounds on the number of iterations before a given optimality is achieved. This framework is specialized to nonlinear least-squares problems, with a model-based framework based on the Gauss-Newton method. This method achieves scalability by constructing local linear interpolation models to approximate the Jacobian, and computes new steps at each iteration in a subspace with user-determined dimension. We then describe a practical implementation of this framework, which we call DFBGN. We outline efficient techniques for selecting the interpolation points and search subspace, yielding an implementation that has a low per-iteration linear algebra cost (linear in the problem dimension) while also achieving fast objective decrease as measured by evaluations. Extensive numerical results demonstrate that DFBGN has improved scalability, yielding strong performance on large-scale nonlinear least-squares problems.

preprint2020arXiv

A dimensionality reduction technique for unconstrained global optimization of functions with low effective dimensionality

We investigate the unconstrained global optimization of functions with low effective dimensionality, that are constant along certain (unknown) linear subspaces. Extending the technique of random subspace embeddings in [Wang et al., Bayesian optimization in a billion dimensions via random embeddings. JAIR, 55(1): 361--387, 2016], we study a generic Random Embeddings for Global Optimization (REGO) framework that is compatible with any global minimization algorithm. Instead of the original, potentially large-scale optimization problem, within REGO, a Gaussian random, low-dimensional problem with bound constraints is formulated and solved in a reduced space. We provide novel probabilistic bounds for the success of REGO in solving the original, low effective-dimensionality problem, which show its independence of the (potentially large) ambient dimension and its precise dependence on the dimensions of the effective and randomly embedding subspaces. These results significantly improve existing theoretical analyses by providing the exact distribution of a reduced minimizer and its Euclidean norm and by the general assumptions required on the problem. We validate our theoretical findings by extensive numerical testing of REGO with three types of global optimization solvers, illustrating the improved scalability of REGO compared to the full-dimensional application of the respective solvers.

preprint2020arXiv

Adaptive regularization with cubics on manifolds

Adaptive regularization with cubics (ARC) is an algorithm for unconstrained, non-convex optimization. Akin to the popular trust-region method, its iterations can be thought of as approximate, safe-guarded Newton steps. For cost functions with Lipschitz continuous Hessian, ARC has optimal iteration complexity, in the sense that it produces an iterate with gradient smaller than $\varepsilon$ in $O(1/\varepsilon^{1.5})$ iterations. For the same price, it can also guarantee a Hessian with smallest eigenvalue larger than $-\varepsilon^{1/2}$. In this paper, we study a generalization of ARC to optimization on Riemannian manifolds. In particular, we generalize the iteration complexity results to this richer framework. Our central contribution lies in the identification of appropriate manifold-specific assumptions that allow us to secure these complexity guarantees both when using the exponential map and when using a general retraction. A substantial part of the paper is devoted to studying these assumptions---relevant beyond ARC---and providing user-friendly sufficient conditions for them. Numerical experiments are encouraging.

preprint2020arXiv

Constrained global optimization of functions with low effective dimensionality using multiple random embeddings

We consider the bound-constrained global optimization of functions with low effective dimensionality, that are constant along an (unknown) linear subspace and only vary over the effective (complement) subspace. We aim to implicitly explore the intrinsic low dimensionality of the constrained landscape using feasible random embeddings, in order to understand and improve the scalability of algorithms for the global optimization of these special-structure problems. A reduced subproblem formulation is investigated that solves the original problem over a random low-dimensional subspace subject to affine constraints, so as to preserve feasibility with respect to the given domain. Under reasonable assumptions, we show that the probability that the reduced problem is successful in solving the original, full-dimensional problem is positive. Furthermore, in the case when the objective's effective subspace is aligned with the coordinate axes, we provide an asymptotic bound on this success probability that captures its algebraic dependence on the effective and, surprisingly, ambient dimensions. We then propose X-REGO, a generic algorithmic framework that uses multiple random embeddings, solving the above reduced problem repeatedly, approximately and possibly, adaptively. Using the success probability of the reduced subproblems, we prove that X-REGO converges globally, with probability one, and linearly in the number of embeddings, to an $ε$-neighbourhood of a constrained global minimizer. Our numerical experiments on special structure functions illustrate our theoretical findings and the improved scalability of X-REGO variants when coupled with state-of-the-art global - and even local - optimization solvers for the subproblems.

preprint2020arXiv

Scalable Derivative-Free Optimization for Nonlinear Least-Squares Problems

Derivative-free - or zeroth-order - optimization (DFO) has gained recent attention for its ability to solve problems in a variety of application areas, including machine learning, particularly involving objectives which are stochastic and/or expensive to compute. In this work, we develop a novel model-based DFO method for solving nonlinear least-squares problems. We improve on state-of-the-art DFO by performing dimensionality reduction in the observational space using sketching methods, avoiding the construction of a full local model. Our approach has a per-iteration computational cost which is linear in problem dimension in a big data regime, and numerical evidence demonstrates that, compared to existing software, it has dramatically improved runtime performance on overdetermined least-squares problems.

preprint2020arXiv

Strong Evaluation Complexity Bounds for Arbitrary-Order Optimization of Nonconvex Nonsmooth Composite Functions

We introduce the concept of strong high-order approximate minimizers for nonconvex optimization problems. These apply in both standard smooth and composite non-smooth settings, and additionally allow convex or inexpensive constraints. An adaptive regularization algorithm is then proposed to find such approximate minimizers. Under suitable Lipschitz continuity assumptions, whenever the feasible set is convex, it is shown that using a model of degree $p$, this algorithm will find a strong approximate q-th-order minimizer in at most ${\cal O}\left(\max_{1\leq j\leq q}ε_j^{-(p+1)/(p-j+1)}\right)$ evaluations of the problem's functions and their derivatives, where $ε_j$ is the $j$-th order accuracy tolerance; this bound applies when either $q=1$ or the problem is not composite with $q \leq 2$. For general non-composite problems, even when the feasible set is nonconvex, the bound becomes ${\cal O}\left(\max_{1\leq j\leq q}ε_j^{-q(p+1)/p}\right)$ evaluations. If the problem is composite, and either $q > 1$ or the feasible set is not convex, the bound is then ${\cal O}\left(\max_{1\leq j\leq q}ε_j^{-(q+1)}\right)$ evaluations. These results not only provide, to our knowledge, the first known bound for (unconstrained or inexpensively-constrained) composite problems for optimality orders exceeding one, but also give the first sharp bounds for high-order strong approximate $q$-th order minimizers of standard (unconstrained and inexpensively constrained) smooth problems, thereby complementing known results for weak minimizers.

preprint2016arXiv

Quantitative recovery conditions for tree-based compressed sensing

As shown in [Blumensath and Davies 2009, Baraniuk et al. 2010], signals whose wavelet coefficients exhibit a rooted tree structure can be recovered using specially-adapted compressed sensing algorithms from just n=O(k) measurements, where k is the sparsity of the signal. Motivated by these results, we introduce a simplified proportional-dimensional asymptotic framework which enables the quantitative evaluation of recovery guarantees for tree-based compressed sensing. In the context of Gaussian matrices, we apply this framework to existing worst-case analysis of the Iterative Tree Projection (ITP) algorithm which makes use of the tree-based Restricted Isometry Property (RIP). Within the same framework, we then obtain quantitative results based on a new method of analysis, recently introduced in [Cartis and Thompson, 2015], which considers the fixed points of the algorithm. By exploiting the realistic average-case assumption that the measurements are statistically independent of the signal, we obtain significant quantitative improvements when compared to the tree-based RIP analysis. Our results have a refreshingly simple interpretation, explicitly determining a bound on the number of measurements that are required as a multiple of the sparsity. For example we prove that exact recovery of binary tree-based signals from noiseless Gaussian measurements is asymptotically guaranteed for ITP with constant stepsize provided n>50k. All our results extend to the more realistic case in which measurements are corrupted by noise.

preprint2015arXiv

Active-set prediction for interior point methods using controlled perturbations

We propose the use of controlled perturbations to address the challenging question of optimal active-set prediction for interior point methods. Namely, in the context of linear programming, we consider perturbing the inequality constraints/bounds so as to enlarge the feasible set. We show that if the perturbations are chosen appropriately, the solution of the original problem lies on or close to the central path of the perturbed problem. We also find that a primal-dual path-following algorithm applied to the perturbed problem is able to accurately predict the optimal active set of the original problem when the duality gap for the perturbed problem is not too small; furthermore, depending on problem conditioning, this prediction can happen sooner than predicting the active set for the perturbed problem or when the original one is solved. Encouraging preliminary numerical experience is reported when comparing activity prediction for the perturbed and unperturbed problem formulations.

preprint2014arXiv

A new and improved quantitative recovery analysis for iterative hard thresholding algorithms in compressed sensing

We present a new recovery analysis for a standard compressed sensing algorithm, Iterative Hard Thresholding (IHT) (Blumensath and Davies, 2008), which considers the fixed points of the algorithm. In the context of arbitrary measurement matrices, we derive a sufficient condition for convergence of IHT to a fixed point and a necessary condition for the existence of fixed points. These conditions allow us to perform a sparse signal recovery analysis in the deterministic noiseless case by implying that the original sparse signal is the unique fixed point and limit point of IHT, and in the case of Gaussian measurement matrices and noise by generating a bound on the approximation error of the IHT limit as a multiple of the noise level. By generalizing the notion of fixed points, we extend our analysis to the variable stepsize Normalised IHT (N-IHT) (Blumensath and Davies, 2010). For both stepsize schemes, we obtain lower bounds on asymptotic phase transitions in a proportional-dimensional framework, quantifying the sparsity/undersampling trade-off for which recovery is guaranteed. Exploiting the reasonable average-case assumption that the underlying signal and measurement matrix are independent, comparison with previous results within this framework shows a substantial quantitative improvement.

preprint2013arXiv

An exact tree projection algorithm for wavelets

We propose a dynamic programming algorithm for projection onto wavelet tree structures. In contrast to other recently proposed algorithms which only give approximate tree projections for a given sparsity, our algorithm is guaranteed to calculate the projection exactly. We also prove that our algorithm has O(Nk) complexity, where N is the signal dimension and k is the sparsity of the tree approximation.

preprint2010arXiv

Compressed Sensing: How sharp is the Restricted Isometry Property

Compressed Sensing (CS) seeks to recover an unknown vector with $N$ entries by making far fewer than $N$ measurements; it posits that the number of compressed sensing measurements should be comparable to the information content of the vector, not simply $N$. CS combines the important task of compression directly with the measurement task. Since its introduction in 2004 there have been hundreds of manuscripts on CS, a large fraction of which develop algorithms to recover a signal from its compressed measurements. Because of the paradoxical nature of CS -- exact reconstruction from seemingly undersampled measurements -- it is crucial for acceptance of an algorithm that rigorous analyses verify the degree of undersampling the algorithm permits. The Restricted Isometry Property (RIP) has become the dominant tool used for the analysis in such cases. We present here an asymmetric form of RIP which gives tighter bounds than the usual symmetric one. We give the best known bounds on the RIP constants for matrices from the Gaussian ensemble. Our derivations illustrate the way in which the combinatorial nature of CS is controlled. Our quantitative bounds on the RIP allow precise statements as to how aggressively a signal can be undersampled, the essential question for practitioners. We also document the extent to which RIP gives precise information about the true performance limits of CS, by comparing with approaches from high-dimensional geometry.

preprint2010arXiv

Phase Transitions for Greedy Sparse Approximation Algorithms

A major enterprise in compressed sensing and sparse approximation is the design and analysis of computationally tractable algorithms for recovering sparse, exact or approximate, solutions of underdetermined linear systems of equations. Many such algorithms have now been proven to have optimal-order uniform recovery guarantees using the ubiquitous Restricted Isometry Property (RIP). However, it is unclear when the RIP-based sufficient conditions on the algorithm are satisfied. We present a framework in which this task can be achieved; translating these conditions for Gaussian measurement matrices into requirements on the signal's sparsity level, length, and number of measurements. We illustrate this approach on three of the state-of-the-art greedy algorithms: CoSaMP, Subspace Pursuit (SP), and Iterative Hard Thresholding (IHT). Designed to allow a direct comparison of existing theory, our framework implies that, according to the best known bounds, IHT requires the fewest number of compressed sensing measurements and has the lowest per iteration computational cost of the three algorithms compared here.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint