Researcher profile

Panagiotis Patrinos

Panagiotis Patrinos contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
18works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

18 published item(s)

preprint2026arXiv

Constrained Stochastic Spectral Preconditioning Converges for Nonconvex Objectives

In this work, we develop proximal preconditioned gradient methods with a focus on spectral gradient methods providing a proximal extension to the Muon and Scion optimizers. We introduce a family of stochastic algorithms that can handle a wide variety of convex and nonconvex constraints and study its convergence under heavy-tailed noise, through a novel analysis tailored to the geometry of the proposed methods. We further propose a variance-reduced version, which achieves faster convergence under standard noise assumptions. Finally, we show that the polynomial iterations used in Muon are more accurately captured by a nonlinear preconditioner than by the ideal matrix sign, leading to a convergence analysis that more faithfully reflects practical implementations.

preprint2026arXiv

Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems

We examine stability properties of primal-dual gradient flow dynamics for composite convex optimization problems with multiple, possibly nonsmooth, terms in the objective function under the generalized consensus constraint. The proposed dynamics are based on the proximal augmented Lagrangian and they provide a viable alternative to ADMM which faces significant challenges from both analysis and implementation viewpoints in large-scale multi-block scenarios. In contrast to customized algorithms with individualized convergence guarantees, we develop a systematic approach for solving a broad class of challenging composite optimization problems. We leverage various structural properties to establish global (exponential) convergence guarantees for the proposed dynamics. Our assumptions are much weaker than those required to prove (exponential) stability of primal-dual dynamics as well as (linear) convergence of discrete-time methods such as standard two-block and multi-block ADMM and EXTRA algorithms. Finally, we show necessity of some of our structural assumptions for exponential stability and provide computational experiments to demonstrate the convenience of the proposed approach for parallel and distributed computing applications.

preprint2023arXiv

A General Framework for Learning-Based Distributionally Robust MPC of Markov Jump Systems

We present a learning model predictive control (MPC) scheme for chance-constrained Markov jump systems with unknown switching probabilities. Using samples of the underlying Markov chain, ambiguity sets of transition probabilities are estimated which include the true conditional probability distributions with high probability. These sets are updated online and used to formulate a time-varying, risk-averse optimal control problem. We prove recursive feasibility of the resulting MPC scheme and show that the original chance constraints remain satisfied at every time step. Furthermore, we show that under sufficient decrease of the confidence levels, the resulting MPC scheme renders the closed-loop system mean-square stable with respect to the true-but-unknown distributions, while remaining less conservative than a fully robust approach. Finally, we show that the value function of the learning MPC converges from above to its nominal counterpart as the sample size grows to infinity. We illustrate our approach on a numerical example.

preprint2022arXiv

Tight convergence rates of the gradient method on smooth hypoconvex functions

We perform the first tight convergence analysis of the gradient method with varying step sizes when applied to smooth hypoconvex (weakly convex) functions. Hypoconvex functions are smooth nonconvex functions whose curvature is bounded and assumed to belong to the interval $[μ, L]$, with $μ<0$. Our convergence rates improve and extend the existing analysis for smooth nonconvex functions with $L$-Lipschitz gradient (which corresponds to the case $μ=-L$), and smoothly interpolates between that class and the class of smooth convex functions. We obtain our results using the performance estimation framework adapted to hypoconvex functions, for which new interpolation conditions are derived. We derive explicit upper bounds on the minimum gradient norm of the iterates for a large range of step sizes, explain why all such rates share a common structure, and prove that these rates are tight when step sizes are smaller or equal to $1/L$. Finally, we identify the optimal constant step size that minimizes the worst-case of the gradient method applied to hypoconvex functions.

preprint2021arXiv

Block Alternating Bregman Majorization Minimization with Extrapolation

In this paper, we consider a class of nonsmooth nonconvex optimization problems whose objective is the sum of a block relative smooth function and a proper and lower semicontinuous block separable function. Although the analysis of block proximal gradient (BPG) methods for the class of block $L$-smooth functions have been successfully extended to Bregman BPG methods that deal with the class of block relative smooth functions, accelerated Bregman BPG methods are scarce and challenging to design. Taking our inspiration from Nesterov-type acceleration and the majorization-minimization scheme, we propose a block alternating Bregman Majorization-Minimization framework with Extrapolation (BMME). We prove subsequential convergence of BMME to a first-order stationary point under mild assumptions, and study its global convergence under stronger conditions. We illustrate the effectiveness of BMME on the penalized orthogonal nonnegative matrix factorization problem.

preprint2021arXiv

Neural Network Training as an Optimal Control Problem: An Augmented Lagrangian Approach

Training of neural networks amounts to nonconvex optimization problems that are typically solved by using backpropagation and (variants of) stochastic gradient descent. In this work we propose an alternative approach by viewing the training task as a nonlinear optimal control problem. Under this lens, backpropagation amounts to the sequential approach (single shooting) to optimal control, where the states variables have been eliminated. It is well known that single shooting may lead to ill conditioning, and for this reason the simultaneous approach (multiple shooting) is typically preferred. Motivated by this hypothesis, an augmented Lagrangian algorithm is developed that only requires an approximate solution to the Lagrangian subproblems up to a user-defined accuracy. By applying this framework to the training of neural networks, it is shown that the inner Lagrangian subproblems are amenable to be solved using Gauss-Newton iterations. To fully exploit the structure of neural networks, the resulting linear least squares problems are addressed by employing an approach based on forward dynamic programming. Finally, the effectiveness of our method is showcased on regression datasets.

preprint2021arXiv

Unsupervised Energy-based Out-of-distribution Detection using Stiefel-Restricted Kernel Machine

Detecting out-of-distribution (OOD) samples is an essential requirement for the deployment of machine learning systems in the real world. Until now, research on energy-based OOD detectors has focused on the softmax confidence score from a pre-trained neural network classifier with access to class labels. In contrast, we propose an unsupervised energy-based OOD detector leveraging the Stiefel-Restricted Kernel Machine (St-RKM). Training requires minimizing an objective function with an autoencoder loss term and the RKM energy where the interconnection matrix lies on the Stiefel manifold. Further, we outline multiple energy function definitions based on the RKM framework and discuss their utility. In the experiments on standard datasets, the proposed method improves over the existing energy-based OOD detectors and deep generative models. Through several ablation studies, we further illustrate the merit of each proposed energy function on the OOD detection performance.

preprint2020arXiv

A block inertial Bregman proximal algorithm for nonsmooth nonconvex problems with application to symmetric nonnegative matrix tri-factorization

We propose BIBPA, a block inertial Bregman proximal algorithm for minimizing the sum of a block relatively smooth function (that is, relatively smooth concerning each block) and block separable nonsmooth nonconvex functions. We prove that the sequence generated by BIBPA subsequentially converges to critical points of the objective under standard assumptions, and globally converges when the objective function is additionally assumed to satisfy the Kurdyka-Łojasiewicz (KŁ) property. We also provide the convergence rate when the objective satisfies the Łojasiewicz inequality. We apply BIBPA to the symmetric nonnegative matrix tri-factorization (SymTriNMF) problem, where we propose kernel functions for SymTriNMF and provide closed-form solutions for subproblems of BIBPA.

preprint2020arXiv

Data-driven distributionally robust LQR with multiplicative noise

We present a data-driven method for solving the linear quadratic regulator problem for systems with multiplicative disturbances, the distribution of which is only known through sample estimates. We adopt a distributionally robust approach to cast the controller synthesis problem as semidefinite programs. Using results from high dimensional statistics, the proposed methodology ensures that their solution provides mean-square stabilizing controllers with high probability even for low sample sizes. As sample size increases the closed-loop cost approaches that of the optimal controller produced when the distribution is known. We demonstrate the practical applicability and performance of the method through a numerical experiment.

preprint2020arXiv

Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization

We propose inertial versions of block coordinate descent methods for solving non-convex non-smooth composite optimization problems. Our methods possess three main advantages compared to current state-of-the-art accelerated first-order methods: (1) they allow using two different extrapolation points to evaluate the gradients and to add the inertial force (we will empirically show that it is more efficient than using a single extrapolation point), (2) they allow to randomly picking the block of variables to update, and (3) they do not require a restarting step. We prove the subsequential convergence of the generated sequence under mild assumptions, prove the global convergence under some additional assumptions, and provide convergence rates. We deploy the proposed methods to solve non-negative matrix factorization (NMF) and show that they compete favorably with the state-of-the-art NMF algorithms. Additional experiments on non-negative approximate canonical polyadic decomposition, also known as non-negative tensor factorization, are also provided.

preprint2020arXiv

Learning-Based Risk-Averse Model Predictive Control for Adaptive Cruise Control with Stochastic Driver Models

We propose a learning-based, distributionally robust model predictive control approach towards the design of adaptive cruise control (ACC) systems. We model the preceding vehicle as an autonomous stochastic system, using a hybrid model with continuous dynamics and discrete, Markovian inputs. We estimate the (unknown) transition probabilities of this model empirically using observed mode transitions and simultaneously determine sets of probability vectors (ambiguity sets) around these estimates, that contain the true transition probabilities with high confidence. We then solve a risk-averse optimal control problem that assumes the worst-case distributions in these sets. We furthermore derive a robust terminal constraint set and use it to establish recursive feasibility of the resulting MPC scheme. We validate the theoretical results and demonstrate desirable properties of the scheme through closed-loop simulations.

preprint2020arXiv

OpEn: Code Generation for Embedded Nonconvex Optimization

We present Optimization Engine (OpEn): an open-source code generation tool for real-time embedded nonconvex optimization, which implements a novel numerical method. OpEn combines the proximal averaged Newton-type method for optimal control (PANOC) with the penalty and augmented Lagrangian methods to compute approximate stationary points of nonconvex problems. The proposed method involves very simple algebraic operations such as vector products, has a low memory footprint and exhibits very good convergence properties that allow the solution of nonconvex problems on embedded devices. OpEn&#39;s core solver is written is Rust - a modern, high-performance, memory-safe and thread-safe systems programming language - while users can call it from Python, MATLAB, C, C++ or over a TCP socket.

preprint2020arXiv

Primal-dual algorithms for multi-agent structured optimization over message-passing architectures with bounded communication delays

We consider algorithms for solving structured convex optimization problems over a network of agents with communication delays. It is assumed that each agent performs its local updates by using possibly outdated information from its neighbors under the assumption that the delay with respect to each neighbor is bounded but otherwise arbitrary. The private objective of each agent is represented by the sum of two possibly nonsmooth functions, one of which is composed with a linear mapping. The global optimization problem is the aggregate of the local cost functions and a common Lipschitz-differentiable term. When the coupling between the agents is represented only through the common function the primal-dual algorithm proposed by Vũ and Condat can be conveniently employed, while for more general structures a new algorithm is proposed. Moreover, a randomized variant is presented that allows the agents to wake up at random and independently from one another. The convergence of each of the proposed algorithms is established under different strong convexity assumptions.

preprint2020arXiv

Proximal Gradient Algorithms: Applications in Signal Processing

Advances in numerical optimization have supported breakthroughs in several areas of signal processing. This paper focuses on the recent enhanced variants of the proximal gradient numerical optimization algorithm, which combine quasi-Newton methods with forward-adjoint oracles to tackle large-scale problems and reduce the computational burden of many applications. These proximal gradient algorithms are here described in an easy-to-understand way, illustrating how they are able to address a wide variety of problems arising in signal processing. A new high-level modeling language is presented which is used to demonstrate the versatility of the presented algorithms in a series of signal processing application examples such as sparse deconvolution, total variation denoising, audio de-clipping and others.

preprint2020arXiv

Sample Complexity of Data-Driven Stochastic LQR with Multiplicative Uncertainty

This paper studies the sample complexity of the stochastic Linear Quadratic Regulator when applied to systems with multiplicative noise. We assume that the covariance of the noise is unknown and estimate it using the sample covariance, which results in suboptimal behaviour. The main contribution of this paper is then to bound the suboptimality of the methodology and prove that it decreases with 1/N, where N denotes the amount of samples. The methodology easily generalizes to the case where the mean is unknown and to the distributionally robust case studied in a previous work of the authors. The analysis is mostly based on results from matrix function perturbation analysis.

preprint2019arXiv

QPALM: A Newton-type Proximal Augmented Lagrangian Method for Quadratic Programs

We present a proximal augmented Lagrangian based solver for general convex quadratic programs (QPs), relying on semismooth Newton iterations with exact line search to solve the inner subproblems. The exact line search reduces in this case to finding the zero of a one-dimensional monotone, piecewise affine function and can be carried out very efficiently. Our algorithm requires the solution of a linear system at every iteration, but as the matrix to be factorized depends on the active constraints, efficient sparse factorization updates can be employed like in active-set methods. Both primal and dual residuals can be enforced down to strict tolerances and otherwise infeasibility can be detected from intermediate iterates. A C implementation of the proposed algorithm is tested and benchmarked against other state-of-the-art QP solvers for a large variety of problem data and shown to compare favorably against these solvers.

preprint2018arXiv

Douglas-Rachford splitting and ADMM for nonconvex optimization: tight convergence results

Although originally designed and analyzed for convex problems, the alternating direction method of multipliers (ADMM) and its close relatives, Douglas-Rachford splitting (DRS) and Peaceman-Rachford splitting (PRS), have been observed to perform remarkably well when applied to certain classes of structured nonconvex optimization problems. However, partial global convergence results in the nonconvex setting have only recently emerged. In this paper we show how the Douglas-Rachford envelope (DRE), introduced in 2014, can be employed to unify and considerably simplify the theory for devising global convergence guarantees for ADMM, DRS and PRS applied to nonconvex problems under less restrictive conditions, larger prox-stepsizes and over-relaxation parameters than previously known. In fact, our bounds are tight whenever the over-relaxation parameter ranges in $(0,2]$. The analysis of ADMM uses a universal primal equivalence with DRS that generalizes the known duality of the algorithms.

preprint2018arXiv

SuperMann: a superlinearly convergent algorithm for finding fixed points of nonexpansive operators

Operator splitting techniques have recently gained popularity in convex optimization problems arising in various control fields. Being fixed-point iterations of nonexpansive operators, such methods suffer many well known downsides, which include high sensitivity to ill conditioning and parameter selection, and consequent low accuracy and robustness. As universal solution we propose SuperMann, a Newton-type algorithm for finding fixed points of nonexpansive operators. It generalizes the classical Krasnosel&#39;skii-Mann scheme, enjoys its favorable global convergence properties and requires exactly the same oracle. It is based on a novel separating hyperplane projection tailored for nonexpansive mappings which makes it possible to include steps along any direction. In particular, when the directions satisfy a Dennis-Moré condition we show that SuperMann converges superlinearly under mild assumptions, which, surprisingly, do not entail nonsingularity of the Jacobian at the solution but merely metric subregularity. As a result, SuperMann enhances and robustifies all operator splitting schemes for structured convex optimization, overcoming their well known sensitivity to ill conditioning.