Researcher profile

Gesualdo Scutari

Gesualdo Scutari contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
13works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

13 published item(s)

preprint2026arXiv

A New Decomposition Paradigm for Graph-structured Nonlinear Programs via Message Passing

We study finite-sum nonlinear programs with localized variable coupling encoded by a (hyper)graph. We introduce a graph-compliant decomposition framework that brings message passing into continuous optimization in a rigorous, implementable, and provable way. The (hyper)graph is partitioned into tree clusters (hypertree factor graphs). At each iteration, agents update in parallel by solving local subproblems whose objective splits into an {\it intra}-cluster term summarized by cost-to-go messages from one min-sum sweep on the cluster tree, and an {\it inter}-cluster coupling term handled Jacobi-style using the latest out-of-cluster variables. To reduce computation/communication, the method supports graph-compliant surrogates that replace exact messages/local solves with compact low-dimensional parametrizations; in hypergraphs, the same principle enables surrogate hyperedge splitting, to tame heavy hyperedge overlaps while retaining finite-time intra-cluster message updates and efficient computation/communication. We establish convergence for (strongly) convex and nonconvex objectives, with topology- and partition-explicit rates that quantify curvature/coupling effects and guide clustering and scalability. To our knowledge, this is the first convergent message-passing method on loopy graphs.

preprint2022arXiv

Acceleration in Distributed Optimization under Similarity

We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes. The loss functions of the agents are assumed to be \textit{similar}, due to statistical data similarity or otherwise. In order to reduce the number of communications to reach a solution accuracy, we proposed a {\it preconditioned, accelerated} distributed method. An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\big(\sqrt{\frac{β/μ}{1-ρ}}\log1/\varepsilon\big)$ number of communications steps, where $β/μ$ is the relative condition number between the global and local loss functions, and $ρ$ characterizes the connectivity of the network. This rate matches (up to poly-log factors) lower complexity communication bounds of distributed gossip-algorithms applied to the class of problems of interest. Numerical results show significant communication savings with respect to existing accelerated distributed schemes, especially when solving ill-conditioned problems.

preprint2022arXiv

Distributed Saddle-Point Problems Under Similarity

We study solution methods for (strongly-)convex-(strongly)-concave Saddle-Point Problems (SPPs) over networks of two type - master/workers (thus centralized) architectures and meshed (thus decentralized) networks. The local functions at each node are assumed to be similar, due to statistical data similarity or otherwise. We establish lower complexity bounds for a fairly general class of algorithms solving the SPP. We show that a given suboptimality $ε>0$ is achieved over master/workers networks in $Ω\big(Δ\cdot δ/μ\cdot \log (1/\varepsilon)\big)$ rounds of communications, where $δ>0$ measures the degree of similarity of the local functions, $μ$ is their strong convexity constant, and $Δ$ is the diameter of the network. The lower communication complexity bound over meshed networks reads $Ω\big(1/{\sqrtρ} \cdot δ/μ\cdot\log (1/\varepsilon)\big)$, where $ρ$ is the (normalized) eigengap of the gossip matrix used for the communication between neighbouring nodes. We then propose algorithms matching the lower bounds over either types of networks (up to log-factors). We assess the effectiveness of the proposed algorithms on a robust logistic regression problem.

preprint2022arXiv

Finite-Bit Quantization For Distributed Algorithms With Linear Convergence

This paper studies distributed algorithms for (strongly convex) composite optimization problems over mesh networks, subject to quantized communications. Instead of focusing on a specific algorithmic design, a black-box model is proposed, casting linearly convergent distributed algorithms in the form of fixed-point iterates. The algorithmic model is equipped with a novel random or deterministic Biased Compression (BC) rule on the quantizer design, and a new Adaptive encoding Nonuniform Quantizer (ANQ) coupled with a communication-efficient encoding scheme, which implements the BC-rule using a finite number of bits (below machine precision). This fills a gap existing in most state-of-the-art quantization schemes, such as those based on the popular compression rule, which rely on communication of some scalar signals with negligible quantization error (in practice quantized at the machine precision). A unified communication complexity analysis is developed for the black-box model, determining the average number of bits required to reach a solution of the optimization problem within a target accuracy. It is shown that the proposed BC-rule preserves linear convergence of the unquantized algorithms, and a trade-off between convergence rate and communication cost under ANQ-based quantization is characterized. Numerical results validate our theoretical findings and show that distributed algorithms equipped with the proposed ANQ have more favorable communication cost than algorithms using state-of-the-art quantization rules.

preprint2022arXiv

Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity

We study structured convex optimization problems, with additive objective $r:=p + q$, where $r$ is ($μ$-strongly) convex, $q$ is $L_q$-smooth and convex, and $p$ is $L_p$-smooth, possibly nonconvex. For such a class of problems, we proposed an inexact accelerated gradient sliding method that can skip the gradient computation for one of these components while still achieving optimal complexity of gradient calls of $p$ and $q$, that is, $\mathcal{O}(\sqrt{L_p/μ})$ and $\mathcal{O}(\sqrt{L_q/μ})$, respectively. This result is much sharper than the classic black-box complexity $\mathcal{O}(\sqrt{(L_p+L_q)/μ})$, especially when the difference between $L_q$ and $L_q$ is large. We then apply the proposed method to solve distributed optimization problems over master-worker architectures, under agents' function similarity, due to statistical data similarity or otherwise. The distributed algorithm achieves for the first time lower complexity bounds on {\it both} communication and local gradient calls, with the former having being a long-standing open problem. Finally the method is extended to distributed saddle-problems (under function similarity) by means of solving a class of variational inequalities, achieving lower communication and computation complexity bounds.

preprint2020arXiv

Accelerated Primal-Dual Algorithms for Distributed Smooth Convex Optimization over Networks

This paper proposes a novel family of primal-dual-based distributed algorithms for smooth, convex, multi-agent optimization over networks that uses only gradient information and gossip communications. The algorithms can also employ acceleration on the computation and communications. We provide a unified analysis of their convergence rate, measured in terms of the Bregman distance associated to the saddle point reformation of the distributed optimization problem. When acceleration is employed, the rate is shown to be optimal, in the sense that it matches (under the proposed metric) existing complexity lower bounds of distributed algorithms applicable to such a class of problem and using only gradient information and gossip communications. Preliminary numerical results on distributed least-square regression problems show that the proposed algorithm compares favorably on existing distributed schemes.

preprint2020arXiv

Asynchronous Decentralized Successive Convex Approximation

We study decentralized asynchronous multiagent optimization over networks, modeled as static (possibly directed) graphs. The optimization problem consists of minimizing a (possibly nonconvex) smooth function--the sum of the agents' local costs--plus a convex (possibly nonsmooth) regularizer, subject to convex constraints. Agents can perform their local computations as well as communicate with their immediate neighbors at any time, without any form of coordination or centralized scheduling; furthermore, when solving their local subproblems, they can use outdated information from their neighbors. We propose the first distributed asynchronous algorithm, termed ASY-DSCA, that converges at an R-linear rate to the optimal solution of convex problems whose objective function satisfies a general error bound condition; this condition is weaker than the more frequently used strong convexity, and it is satisfied by several empirical risk functions that are not strongly convex; examples include LASSO and logistic regression problems. When the objective function is nonconvex, ASY-DSCA converges to a stationary solution of the problem at a sublinear rate.

preprint2020arXiv

Finite rate distributed weight-balancing and average consensus over digraphs

This paper proposes the first distributed algorithm that solves the weight-balancing problem using only finite rate and simplex communications among nodes, compliant with the directed nature of the graph edges. It is proved that the algorithm converges to a weight-balanced solution at sublinear rate. The analysis builds upon a new metric inspired by positional system representations, which characterizes the dynamics of information exchange over the network, and on a novel step-size rule. Building on this result, a novel distributed algorithm is proposed that solves the average consensus problem over digraphs, using, at each timeslot, finite rate simplex communications between adjacent nodes -- some bits for the weight-balancing problem and others for the average consensus. Convergence of the proposed quantized consensus algorithm to the average of the node's unquantized initial values is established, both almost surely and in the moment generating function of the error; and a sublinear convergence rate is proved for sufficiently large step-sizes. Numerical results validate our theoretical findings.

preprint2020arXiv

Ghost Penalties in Nonconvex Constrained Optimization: Diminishing Stepsizes and Iteration Complexity

We consider nonconvex constrained optimization problems and propose a new approach to the convergence analysis based on penalty functions. We make use of classical penalty functions in an unconventional way, in that penalty functions only enter in the theoretical analysis of convergence while the algorithm itself is penalty-free. Based on this idea, we are able to establish several new results, including the first general analysis for diminishing stepsize methods in nonconvex, constrained optimization, showing convergence to generalized stationary points, and a complexity study for SQP-type algorithms.

preprint2020arXiv

Grand Challenges in Resilience: Autonomous System Resilience through Design and Runtime Measures

A set of about 80 researchers, practitioners, and federal agency program managers participated in the NSF-sponsored Grand Challenges in Resilience Workshop held on Purdue campus on March 19-21, 2019. The workshop was divided into three themes: resilience in cyber, cyber-physical, and socio-technical systems. About 30 attendees in all participated in the discussions of cyber resilience. This article brings out the substantive parts of the challenges and solution approaches that were identified in the cyber resilience theme. In this article, we put forward the substantial challenges in cyber resilience in a few representative application domains and outline foundational solutions to address these challenges. These solutions fall into two broad themes: resilience-by-design and resilience-by-reaction. We use examples of autonomous systems as the application drivers motivating cyber resilience. We focus on some autonomous systems in the near horizon (autonomous ground and aerial vehicles) and also a little more distant (autonomous rescue and relief). For resilience-by-design, we focus on design methods in software that are needed for our cyber systems to be resilient. In contrast, for resilience-by-reaction, we discuss how to make systems resilient by responding, reconfiguring, or recovering at runtime when failures happen. We also discuss the notion of adaptive execution to improve resilience, execution transparently and adaptively among available execution platforms (mobile/embedded, edge, and cloud). For each of the two themes, we survey the current state, and the desired state and ways to get there. We conclude the paper by looking at the research challenges we will have to solve in the short and the mid-term to make the vision of resilient autonomous systems a reality.

preprint2020arXiv

Kernel Bi-Linear Modeling for Reconstructing Data on Manifolds: The Dynamic-MRI Case

This paper establishes a kernel-based framework for reconstructing data on manifolds, tailored to fit the dynamic-(d)MRI-data recovery problem. The proposed methodology exploits simple tangent-space geometries of manifolds in reproducing kernel Hilbert spaces and follows classical kernel-approximation arguments to form the data-recovery task as a bi-linear inverse problem. Departing from mainstream approaches, the proposed methodology uses no training data, employs no graph Laplacian matrix to penalize the optimization task, uses no costly (kernel) pre-imaging step to map feature points back to the input space, and utilizes complex-valued kernel functions to account for k-space data. The framework is validated on synthetically generated dMRI data, where comparisons against state-of-the-art schemes highlight the rich potential of the proposed approach in data-recovery problems.

preprint2020arXiv

Second-order Guarantees of Distributed Gradient Algorithms

We consider distributed smooth nonconvex unconstrained optimization over networks, modeled as a connected graph. We examine the behavior of distributed gradient-based algorithms near strict saddle points. Specifically, we establish that (i) the renowned Distributed Gradient Descent (DGD) algorithm likely converges to a neighborhood of a Second-order Stationary (SoS) solution; and (ii) the more recent class of distributed algorithms based on gradient tracking--implementable also over digraphs--likely converges to exact SoS solutions, thus avoiding (strict) saddle-points. Furthermore, new convergence rate results to first-order critical points is established for the latter class of algorithms.

preprint2019arXiv

Bi-Linear Modeling of Data Manifolds for Dynamic-MRI Recovery

This paper puts forth a novel bi-linear modeling framework for data recovery via manifold-learning and sparse-approximation arguments and considers its application to dynamic magnetic-resonance imaging (dMRI). Each temporal-domain MR image is viewed as a point that lies onto or close to a smooth manifold, and landmark points are identified to describe the point cloud concisely. To facilitate computations, a dimensionality reduction module generates low-dimensional/compressed renditions of the landmark points. Recovery of the high-fidelity MRI data is realized by solving a non-convex minimization task for the linear decompression operator and those affine combinations of landmark points which locally approximate the latent manifold geometry. An algorithm with guaranteed convergence to stationary solutions of the non-convex minimization task is also provided. The aforementioned framework exploits the underlying spatio-temporal patterns and geometry of the acquired data without any prior training on external data or information. Extensive numerical results on simulated as well as real cardiac-cine and perfusion MRI data illustrate noteworthy improvements of the advocated machine-learning framework over state-of-the-art reconstruction techniques.