Source author record

P. -A. Absil

P. -A. Absil appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Catalog footprint

What is connected

24 works

22 topics

4 close collaborators


preprint, 2026, arXiv

A second-order method landing on the Stiefel manifold via Newton–Schulz iteration

Retraction-free approaches offer attractive low-cost alternatives to Riemannian methods on the Stiefel manifold, but they are often first-order, which may limit the efficiency under high-accuracy requirements. To this end, we propose a second-order method landing on the Stiefel manifold without invoking retractions, which is proved to enjoy local quadratic (or superlinear for its inexact variant) convergence. The update consists of the sum of (i) a component tangent to the level set of the constraint-defining function that aims to reduce the objective and (ii) a component normal to the same level set that reduces the infeasibility. Specifically, we construct the normal component via Newton–Schulz, a fixed-point iteration for orthogonalization. Moreover, we establish a geometric connection between the Newton–Schulz iteration and Stiefel manifolds, in which Newton–Schulz moves along the normal space. For the tangent component, we formulate a modified Newton equation that incorporates Newton–Schulz. Numerical experiments on the orthogonal Procrustes problem, principal component analysis, and real-data independent component analysis illustrate that the proposed method performs better than the existing methods.
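For readers unfamiliar with the fixed-point iteration named in this abstract, the following is a minimal sketch of the classical Newton–Schulz orthogonalization step, X ← 1.5 X − 0.5 X XᵀX. It is not the method proposed in the preprint: it only illustrates how the iteration reduces infeasibility XᵀX − I. The function name `newton_schulz`, the step count, and the test matrix are assumptions made for illustration; the iteration is assumed to start from a matrix whose singular values lie in (0, √3).

```python
import numpy as np


def newton_schulz(X, steps=20):
    """Classical Newton-Schulz iteration toward a matrix with orthonormal columns.

    Assumes the singular values of the starting point lie in (0, sqrt(3)).
    """
    X = np.asarray(X, dtype=float)
    for _ in range(steps):
        # Each singular value s is mapped to 1.5*s - 0.5*s**3, whose fixed point is 1.
        X = 1.5 * X - 0.5 * X @ (X.T @ X)
    return X


# Illustrative data (hypothetical): an orthonormal Q with rescaled columns,
# so the starting singular values are exactly 0.6, 1.0 and 1.4.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 3)))
X0 = Q * np.array([0.6, 1.0, 1.4])

X = newton_schulz(X0)
feasibility_error = np.linalg.norm(X.T @ X - np.eye(3))
```

Because only the column scaling of `X0` differs from `Q`, the iterates keep the same column directions and `X` returns to `Q` up to rounding.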

preprint, 2023, arXiv

A Grassmann Manifold Handbook: Basic Geometry and Computational Aspects

The Grassmann manifold of linear subspaces is important for the mathematical modelling of a multitude of applications, ranging from problems in machine learning, computer vision and image processing to low-rank matrix optimization problems, dynamic low-rank decompositions and model reduction. With this mostly expository work, we aim to provide a collection of the essential facts and formulae on the geometry of the Grassmann manifold in a fashion that is fit for tackling the aforementioned problems with matrix-based algorithms. Moreover, we expose the Grassmann geometry both from the approach of representing subspaces with orthogonal projectors and when viewed as a quotient space of the orthogonal group, where subspaces are identified as equivalence classes of (orthogonal) bases. This bridges the associated research tracks and allows for an easy transition between these two approaches. Original contributions include a modified algorithm for computing the Riemannian logarithm map on the Grassmannian that is advantageous numerically but also allows for a more elementary, yet more complete description of the cut locus and the conjugate points. We also derive a formula for parallel transport along geodesics in the orthogonal projector perspective, formulae for the derivative of the exponential map, as well as a formula for Jacobi fields vanishing at one point.

preprint, 2022, arXiv

Comparison of an Apocalypse-Free and an Apocalypse-Prone First-Order Low-Rank Optimization Algorithm

We compare two first-order low-rank optimization algorithms, namely $\text{P}^2\text{GD}$ (Schneider and Uschmajew, 2015), which has been proven to be apocalypse-prone (Levin et al., 2021), and its apocalypse-free version $\text{P}^2\text{GDR}$ obtained by equipping $\text{P}^2\text{GD}$ with a suitable rank reduction mechanism (Olikier et al., 2022). Here an apocalypse refers to the situation where the stationarity measure goes to zero along a convergent sequence whereas it is nonzero at the limit. The comparison is conducted on two simple examples of apocalypses, the original one (Levin et al., 2021) and a new one. We also present a potential side effect of the rank reduction mechanism of $\text{P}^2\text{GDR}$ and discuss the choice of the rank reduction parameter.

preprint, 2022, arXiv

Equivalent Polyadic Decompositions of Matrix Multiplication Tensors

Invariance transformations of polyadic decompositions of matrix multiplication tensors define an equivalence relation on the set of such decompositions. In this paper, we present an algorithm to efficiently decide whether two polyadic decompositions of a given matrix multiplication tensor are equivalent. With this algorithm, we analyze the equivalence classes of decompositions of several matrix multiplication tensors. This analysis is relevant for the study of fast matrix multiplication as it relates to the question of how many essentially different fast matrix multiplication algorithms exist. This question was first studied by de Groote, who showed that for the multiplication of $2\times2$ matrices with $7$ active multiplications, all algorithms are essentially equivalent to Strassen's algorithm. In contrast, the results of our analysis show that for the multiplication of larger matrices (e.g., $2\times3$ by $3\times2$ or $3\times3$ by $3\times3$ matrices), two decompositions are very likely to be essentially different. We further provide a necessary criterion for a polyadic decomposition to be equivalent to a polyadic decomposition with integer entries. Decompositions with specific integer entries, e.g., powers of two, provide fast matrix multiplication algorithms with better efficiency and stability properties. This condition can be tested algorithmically and we present the conclusions obtained for the decompositions of small/medium matrix multiplication tensors.
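As background for the $2\times2$ case mentioned in this abstract, here is a minimal sketch of Strassen's algorithm, which multiplies two $2\times2$ matrices with 7 scalar multiplications; each product `m1` to `m7` corresponds to one rank-one term of a polyadic decomposition of the matrix multiplication tensor. The function name `strassen_2x2` and the nested-list representation are assumptions for illustration, and the equivalence-testing algorithm of the preprint is not shown.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices (nested lists) using Strassen's 7 products."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # The seven "active" multiplications.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Entries of the product are signed sums of the seven products.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


C = strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])
# C == [[19, 22], [43, 50]], the same as the ordinary 8-multiplication product.
```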

preprint, 2022, arXiv

Optimization flows landing on the Stiefel manifold

We study a continuous-time system that solves optimization problems over the set of orthonormal matrices, which is also known as the Stiefel manifold. The resulting optimization flow follows a path that is not always on the manifold but asymptotically lands on the manifold. We introduce a generalized Stiefel manifold to which we extend the canonical metric of the Stiefel manifold. We show that the vector field of the proposed flow can be interpreted as the sum of a Riemannian gradient on a generalized Stiefel manifold and a normal vector. Moreover, we prove that the proposed flow globally converges to the set of critical points, and any local minimum and isolated critical point is asymptotically stable.

preprint, 2022, arXiv

Projection onto quadratic hypersurfaces

We address the problem of projecting a point onto a quadratic hypersurface, more specifically a central quadric. We show how this problem reduces to finding a given root of a scalar-valued nonlinear function. We completely characterize one of the optimal solutions of the projection as either the unique root of this nonlinear function on a given interval, or as a point that belongs to a finite set of computable solutions. We then leverage this projection and the recent advancements in splitting methods to compute the projection onto the intersection of a box and a quadratic hypersurface with alternating projections and Douglas-Rachford splitting methods. We test these methods on a practical problem from the power systems literature, and show that they outperform IPOPT and Gurobi in terms of objective, execution time and feasibility of the solution.

preprint, 2021, arXiv

A Riemannian rank-adaptive method for low-rank matrix completion

The low-rank matrix completion problem can be solved by Riemannian optimization on a fixed-rank manifold. However, a drawback of the known approaches is that the rank parameter has to be fixed a priori. In this paper, we consider the optimization problem on the set of bounded-rank matrices. We propose a Riemannian rank-adaptive method, which consists of fixed-rank optimization, rank increase step and rank reduction step. We explore its performance applied to the low-rank matrix completion problem. Numerical experiments on synthetic and real-world datasets illustrate that the proposed rank-adaptive method compares favorably with state-of-the-art algorithms. In addition, it shows that one can incorporate each aspect of this rank-adaptive framework separately into existing algorithms for the purpose of improving performance.

preprint, 2021, arXiv

Symplectic eigenvalue problem via trace minimization and Riemannian optimization

We address the problem of computing the smallest symplectic eigenvalues and the corresponding eigenvectors of symmetric positive-definite matrices in the sense of Williamson's theorem. It is formulated as minimizing a trace cost function over the symplectic Stiefel manifold. We first investigate various theoretical aspects of this optimization problem such as characterizing the sets of critical points, saddle points, and global minimizers as well as proving that non-global local minimizers do not exist. Based on our recent results on constructing Riemannian structures on the symplectic Stiefel manifold and the associated optimization algorithms, we then propose solving the symplectic eigenvalue problem in the framework of Riemannian optimization. Moreover, a connection of the sought solution with the eigenvalues of a special class of Hamiltonian matrices is discussed. Numerical examples are presented.

preprint, 2020, arXiv

Low-rank multi-parametric covariance identification

We propose a differential geometric construction for families of low-rank covariance matrices, via interpolation on low-rank matrix manifolds. In contrast with standard parametric covariance classes, these families offer significant flexibility for problem-specific tailoring via the choice of "anchor" matrices for the interpolation. Moreover, their low-rank facilitates computational tractability in high dimensions and with limited data. We employ these covariance families for both interpolation and identification, where the latter problem comprises selecting the most representative member of the covariance family given a data set. In this setting, standard procedures such as maximum likelihood estimation are nontrivial because the covariance family is rank-deficient; we resolve this issue by casting the identification problem as distance minimization. We demonstrate the power of these differential geometric families for interpolation and identification in a practical application: wind field covariance approximation for unmanned aerial vehicle navigation.

preprint, 2020, arXiv

On a minimum enclosing ball of a collection of linear subspaces

This paper concerns the minimax center of a collection of linear subspaces. When the subspaces are $k$-dimensional subspaces of $\mathbb{R}^n$, this can be cast as finding the center of a minimum enclosing ball on a Grassmann manifold, Gr$(k,n)$. For subspaces of different dimension, the setting becomes a disjoint union of Grassmannians rather than a single manifold, and the problem is no longer well-defined. However, natural geometric maps exist between these manifolds with a well-defined notion of distance for the images of the subspaces under the mappings. Solving the initial problem in this context leads to a candidate minimax center on each of the constituent manifolds, but does not inherently provide intuition about which candidate is the best representation of the data. Additionally, the solutions of different rank are generally not nested so a deflationary approach will not suffice, and the problem must be solved independently on each manifold. We propose and solve an optimization problem parametrized by the rank of the minimax center. The solution is computed using a subgradient algorithm on the dual. By scaling the objective and penalizing the information lost by the rank-$k$ minimax center, we jointly recover an optimal dimension, $k^*$, and a central subspace, $U^* \in$ Gr$(k^*,n)$ at the center of the minimum enclosing ball, that best represents the data.

preprint, 2020, arXiv

On the Quality of First-Order Approximation of Functions with Hölder Continuous Gradient

We show that Hölder continuity of the gradient is not only a sufficient condition, but also a necessary condition for the existence of a global upper bound on the error of the first-order Taylor approximation. We also relate this global upper bound to the Hölder constant of the gradient. This relation is expressed as an interval, depending on the Hölder constant, in which the error of the first-order Taylor approximation is guaranteed to be. We show that, for the Lipschitz continuous case, the interval cannot be reduced. An application to the norms of quadratic forms is proposed, which allows us to derive a novel characterization of Euclidean norms.
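The sufficiency direction mentioned in this abstract follows from a standard argument, sketched here under assumptions that are not taken from the preprint: the Euclidean norm, a differentiable $f$, and the notation $H$ (Hölder constant) and $\nu \in (0,1]$ (Hölder exponent). Suppose $\|\nabla f(x) - \nabla f(y)\| \le H \|x - y\|^{\nu}$ for all $x, y$. Then

$$
f(y) - f(x) - \langle \nabla f(x), y - x \rangle = \int_0^1 \langle \nabla f(x + t(y - x)) - \nabla f(x),\, y - x \rangle \, dt,
$$

and bounding the integrand with the Cauchy-Schwarz inequality and the Hölder condition gives

$$
\left| f(y) - f(x) - \langle \nabla f(x), y - x \rangle \right| \le \int_0^1 H\, t^{\nu} \|y - x\|^{1+\nu} \, dt = \frac{H}{1+\nu} \|y - x\|^{1+\nu}.
$$

For $\nu = 1$ this reduces to the familiar bound $\tfrac{H}{2}\|y - x\|^2$ for functions with Lipschitz continuous gradient.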

preprint, 2016, arXiv

Low-rank plus sparse decomposition for exoplanet detection in direct-imaging ADI sequences. The LLSG algorithm

Data processing constitutes a critical component of high-contrast exoplanet imaging. Its role is almost as important as the choice of a coronagraph or a wavefront control system, and it is intertwined with the chosen observing strategy. Among the data processing techniques for angular differential imaging (ADI), the most recent is the family of principal component analysis (PCA) based algorithms. PCA serves, in this case, as a subspace projection technique for constructing a reference point spread function (PSF) that can be subtracted from the science data for boosting the detectability of potential companions present in the data. Unfortunately, when building this reference PSF from the science data itself, PCA comes with certain limitations such as the sensitivity of the lower dimensional orthogonal subspace to non-Gaussian noise. Inspired by recent advances in machine learning algorithms such as robust PCA, we aim to propose a localized subspace projection technique that surpasses current PCA-based post-processing algorithms in terms of the detectability of companions at near real-time speed, a quality that will be useful for future direct imaging surveys. We used randomized low-rank approximation methods recently proposed in the machine learning literature, coupled with entry-wise thresholding to decompose an ADI image sequence locally into low-rank, sparse, and Gaussian noise components (LLSG). This local three-term decomposition separates the starlight and the associated speckle noise from the planetary signal, which mostly remains in the sparse term. We tested the performance of our new algorithm on a long ADI sequence obtained on beta Pictoris with VLT/NACO. Compared to a standard PCA approach, LLSG decomposition reaches a higher signal-to-noise ratio and has an overall better performance in the receiver operating characteristic space. (abridged).

preprint, 2016, arXiv

Proceedings of the third "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'16)

The third edition of the "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) took place in Aalborg, the 4th largest city in Denmark situated beautifully in the northern part of the country, from the 24th to 26th of August 2016. The workshop venue was at the Aalborg University campus. One implicit objective of this biennial workshop is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For this third edition, iTWIST'16 gathered about 50 international participants and features 8 invited talks, 12 oral presentations, and 12 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing (e.g., optics, computer vision, genomics, biomedical, digital communication, channel estimation, astronomy); Application of sparse models in non-convex/non-linear inverse problems (e.g., phase retrieval, blind deconvolution, self calibration); Approximate probabilistic inference for sparse problems; Sparse machine learning and inference; "Blind" inverse problems and dictionary learning; Optimization for sparse modelling; Information theory, geometry and randomness; Sparsity? What's next? (Discrete-valued signals; Union of low-dimensional spaces, Cosparsity, mixed/group norm, model-based, low-complexity models, ...); Matrix/manifold sensing/processing (graph, low-rank approximation, ...); Complexity/accuracy tradeoffs in numerical methods/optimization; Electronic/optical compressive sensors (hardware).

preprint, 2014, arXiv

Mixed Integer Programming to Globally Minimize the Economic Load Dispatch Problem With Valve-Point Effect

Optimal distribution of power among generating units to meet a specific demand subject to system constraints is an ongoing research topic in the power system community. The problem, even in a static setting, turns out to be hard to solve with conventional optimization methods owing to the consideration of valve-point effects which make the cost function nonsmooth and nonconvex. This difficulty gave rise to the proliferation of population-based global heuristics in order to address the multi-extremal and nonsmooth problem. In this paper, we address the economic load dispatch problem (ELDP) with valve-point effect in its classic formulation where the cost function for each generator is expressed as the sum of a quadratic term and a rectified sine term. We propose two methods that resort to piecewise-quadratic surrogate cost functions, yielding surrogate problems that can be handled by mixed-integer quadratic programming (MIQP) solvers. The first method shows that the global solution of the ELDP can often be found by using a fixed and very limited number of quadratic pieces in the surrogate cost function. The second method adaptively builds piecewise-quadratic surrogate under-estimations of the ELDP cost function, yielding a sequence of surrogate MIQP problems. It is shown that any limit point of the sequence of MIQP solutions is a global solution of the ELDP. Moreover, numerical experiments indicate that the proposed methods outclass the state-of-the-art algorithms in terms of minimization value and computation time on practical instances.
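The cost model described in this abstract, a quadratic term plus a rectified sine term per generator, can be sketched as follows. The parameter names (`a`, `b`, `c`, `e`, `f`, `p_min`) follow a common convention in the economic dispatch literature and are assumptions here, as are the numerical coefficients, which are illustrative only and not taken from the preprint.

```python
import math


def valve_point_cost(p, a, b, c, e, f, p_min):
    """Generator cost at output p: quadratic term plus rectified sine term.

    All parameter names are hypothetical labels for illustration.
    """
    quadratic = a * p * p + b * p + c
    # The absolute value makes the cost nonsmooth wherever the sine crosses zero.
    rectified_sine = abs(e * math.sin(f * (p_min - p)))
    return quadratic + rectified_sine


# Illustrative coefficients (made up for this sketch).
params = dict(a=0.002, b=10.0, c=500.0, e=300.0, f=0.03, p_min=100.0)

cost_at_min = valve_point_cost(100.0, **params)   # sine term vanishes at p_min
cost_inside = valve_point_cost(150.0, **params)   # sine term is active here
```

The kinks introduced by the absolute value are what make the problem nonsmooth and nonconvex, which motivates the piecewise-quadratic surrogates mentioned in the abstract.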

preprint, 2014, arXiv

Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

The implicit objective of the biennial "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building within walking distance of both hotels and town center. iTWIST'14 has gathered about 70 international participants and has featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low dimensional subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference.

preprint, 2014, arXiv

Two Algorithms for Orthogonal Nonnegative Matrix Factorization with Application to Clustering

Approximate matrix factorization techniques with both nonnegativity and orthogonality constraints, referred to as orthogonal nonnegative matrix factorization (ONMF), have been recently introduced and shown to work remarkably well for clustering tasks such as document classification. In this paper, we introduce two new methods to solve ONMF. First, we show a mathematical equivalence between ONMF and a weighted variant of spherical k-means, from which we derive our first method, a simple EM-like algorithm. This also allows us to determine when ONMF should be preferred to k-means and spherical k-means. Our second method is based on an augmented Lagrangian approach. Standard ONMF algorithms typically enforce nonnegativity for their iterates while trying to achieve orthogonality at the limit (e.g., using a proper penalization term or a suitably chosen search direction). Our method works the opposite way: orthogonality is strictly imposed at each step while nonnegativity is asymptotically obtained, using a quadratic penalty. Finally, we show that the two proposed approaches compare favorably with standard ONMF algorithms on synthetic, text and image data sets.

preprint, 2013, arXiv

Cramér-Rao bounds for synchronization of rotations

Synchronization of rotations is the problem of estimating a set of rotations $R_i \in \mathrm{SO}(n)$, $i = 1, \ldots, N$, based on noisy measurements of relative rotations $R_i R_j^T$. This fundamental problem has found many recent applications, most importantly in structural biology. We provide a framework to study synchronization as estimation on Riemannian manifolds for arbitrary $n$ under a large family of noise models. The noise models we address encompass zero-mean isotropic noise, and we develop tools for Gaussian-like as well as heavy-tail types of noise in particular. As a main contribution, we derive the Cramér-Rao bounds of synchronization, that is, lower-bounds on the variance of unbiased estimators. We find that these bounds are structured by the pseudoinverse of the measurement graph Laplacian, where edge weights are proportional to measurement quality. We leverage this to provide interpretation in terms of random walks and visualization tools for these bounds in both the anchored and anchor-free scenarios. Similar bounds previously established were limited to rotations in the plane and Gaussian-like noise.

preprint, 2013, arXiv

Fast community detection using local neighbourhood search

Communities play a crucial role to describe and analyse modern networks. However, the size of those networks has grown tremendously with the increase of computational power and data storage. While various methods have been developed to extract community structures, their computational cost or the difficulty to parallelize existing algorithms make partitioning real networks into communities a challenging problem. In this paper, we propose to alter an efficient algorithm, the Louvain method, such that communities are defined as the connected components of a tree-like assignment graph. Within this framework, we precisely describe the different steps of our algorithm and demonstrate its highly parallelizable nature. We then show that despite its simplicity, our algorithm has a partitioning quality similar to the original method on benchmark graphs and even outperforms other algorithms. We also show that, even on a single processor, our method is much faster and allows the analysis of very large networks.

preprint, 2013, arXiv

Manopt, a Matlab toolbox for optimization on manifolds

Optimization on manifolds is a rapidly developing branch of nonlinear optimization. Its focus is on problems where the smooth geometry of the search space can be leveraged to design efficient numerical algorithms. In particular, optimization on manifolds is well-suited to deal with rank and orthogonality constraints. Such structured constraints appear pervasively in machine learning applications, including low-rank matrix completion, sensor network localization, camera network registration, independent component analysis, metric learning, dimensionality reduction and so on. The Manopt toolbox, available at www.manopt.org, is a user-friendly, documented piece of software dedicated to simplify experimenting with state of the art Riemannian optimization algorithms. We aim particularly at reaching practitioners outside our field.

preprint, 2012, arXiv

A nuclear-norm based convex formulation for informed source separation

We study the problem of separating audio sources from a single linear mixture. The goal is to find a decomposition of the single channel spectrogram into a sum of individual contributions associated to a certain number of sources. In this paper, we consider an informed source separation problem in which the input spectrogram is partly annotated. We propose a convex formulation that relies on a nuclear norm penalty to induce low rank for the contributions. We show experimentally that solving this model with a simple subgradient method outperforms a previously introduced nonnegative matrix factorization (NMF) technique, both in terms of source separation quality and computation time.

preprint, 2012, arXiv

Two Newton methods on the manifold of fixed-rank matrices endowed with Riemannian quotient geometries

We consider two Riemannian geometries for the manifold $\mathcal{M}(p,m\times n)$ of all $m\times n$ matrices of rank $p$. The geometries are induced on $\mathcal{M}(p,m\times n)$ by viewing it as the base manifold of the submersion $\pi:(M,N)\mapsto MN^T$, selecting an adequate Riemannian metric on the total space, and turning $\pi$ into a Riemannian submersion. The theory of Riemannian submersions, an important tool in Riemannian geometry, makes it possible to obtain expressions for fundamental geometric objects on $\mathcal{M}(p,m\times n)$ and to formulate the Riemannian Newton methods on $\mathcal{M}(p,m\times n)$ induced by these two geometries. The Riemannian Newton methods admit a stronger and more streamlined convergence analysis than the Euclidean counterpart, and the computational overhead due to the Riemannian geometric machinery is shown to be mild. Potential applications include low-rank matrix completion and other low-rank matrix approximation problems.

preprint, 2008, arXiv

A geometric Newton method for Oja's vector field

Newton's method for solving the matrix equation $F(X)\equiv AX-XX^TAX=0$ runs up against the fact that its zeros are not isolated. This is due to a symmetry of $F$ by the action of the orthogonal group. We show how differential-geometric techniques can be exploited to remove this symmetry and obtain a "geometric" Newton algorithm that finds the zeros of $F$. The geometric Newton method does not suffer from the degeneracy issue that stands in the way of the original Newton method.
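The non-isolated zeros described in this abstract can be checked numerically. The sketch below evaluates $F(X) = AX - XX^TAX$ at an orthonormal basis $X$ of an invariant subspace of $A$ and at a rotated basis $XQ$, and both residuals vanish. The helper name `oja_field`, the choice of a random symmetric $A$ (so that `numpy.linalg.eigh` yields orthonormal eigenvectors), and the rotation angle are assumptions for illustration; the geometric Newton method itself is not implemented here.

```python
import numpy as np


def oja_field(A, X):
    """Evaluate F(X) = A X - X X^T A X."""
    return A @ X - X @ (X.T @ A @ X)


rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B + B.T                      # symmetric test matrix (assumption)

# Two orthonormal eigenvectors span an invariant subspace, so F(X) = 0.
_, V = np.linalg.eigh(A)
X = V[:, :2]

# Any 2x2 orthogonal Q gives another zero X Q, since F(X Q) = F(X) Q.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

residual_X = np.linalg.norm(oja_field(A, X))
residual_XQ = np.linalg.norm(oja_field(A, X @ Q))

# The same equivariance holds at a generic point Y, not only at zeros.
Y = rng.standard_normal((5, 2))
equivariance_gap = np.linalg.norm(oja_field(A, Y @ Q) - oja_field(A, Y) @ Q)
```

Since `theta` can be varied continuously, the zero `X` sits on a whole curve of zeros, which is the degeneracy that a plain Newton iteration runs into.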

preprint, 2008, arXiv

Low-rank optimization for semidefinite convex problems

We propose an algorithm for solving nonlinear convex programs defined in terms of a symmetric positive semidefinite matrix variable $X$. This algorithm rests on the factorization $X=Y Y^T$, where the number of columns of $Y$ fixes the rank of $X$. It is thus very effective for solving programs that have a low rank solution. The factorization $X=Y Y^T$ evokes a reformulation of the original problem as an optimization on a particular quotient manifold. The present paper discusses the geometry of that manifold and derives a second order optimization method. It furthermore provides some conditions on the rank of the factorization to ensure equivalence with the original problem. The efficiency of the proposed algorithm is illustrated on two applications: the maximal cut of a graph and the sparse principal component analysis problem.
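The role of the factorization $X = YY^T$ can be illustrated with a short numerical sketch: the product is symmetric positive semidefinite by construction, its rank is bounded by the number of columns of $Y$, and replacing $Y$ by $YQ$ with $Q$ orthogonal leaves $X$ unchanged, which is the redundancy that a quotient-manifold formulation factors out. The sizes `n` and `p` and the random data are assumptions for illustration, and the second-order method of the preprint is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2                       # illustrative sizes (assumption)

Y = rng.standard_normal((n, p))
X = Y @ Y.T                       # symmetric PSD, rank at most p

rank_X = np.linalg.matrix_rank(X)
min_eig = np.linalg.eigvalsh(X).min()

# A random p x p orthogonal matrix from a QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
X_rot = (Y @ Q) @ (Y @ Q).T       # same X from a different factor Y Q
```

Working with `Y` needs `n * p` numbers instead of the `n * n` entries of `X`, which is where the savings for low-rank solutions come from.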

preprint, 2008, arXiv

Two-sided Grassmann-Rayleigh quotient iteration

The two-sided Rayleigh quotient iteration proposed by Ostrowski computes a pair of corresponding left-right eigenvectors of a matrix $C$. We propose a Grassmannian version of this iteration, i.e., its iterates are pairs of $p$-dimensional subspaces instead of one-dimensional subspaces in the classical case. The new iteration generically converges locally cubically to the pairs of left-right $p$-dimensional invariant subspaces of $C$. Moreover, Grassmannian versions of the Rayleigh quotient iteration are given for the generalized Hermitian eigenproblem, the Hamiltonian eigenproblem and the skew-Hamiltonian eigenproblem.

