Researcher profile

Kimon Antonakopoulos

Kimon Antonakopoulos contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
3 topics
4 close collaborators

Published work

3 published items

Preprint, 2026, arXiv

Constrained Stochastic Spectral Preconditioning Converges for Nonconvex Objectives

In this work, we develop proximal preconditioned gradient methods with a focus on spectral gradient methods, providing a proximal extension to the Muon and Scion optimizers. We introduce a family of stochastic algorithms that can handle a wide variety of convex and nonconvex constraints and study its convergence under heavy-tailed noise through a novel analysis tailored to the geometry of the proposed methods. We further propose a variance-reduced version, which achieves faster convergence under standard noise assumptions. Finally, we show that the polynomial iterations used in Muon are more accurately captured by a nonlinear preconditioner than by the ideal matrix sign, leading to a convergence analysis that more faithfully reflects practical implementations.
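The last point of the abstract can be illustrated with a small sketch. It assumes the classical cubic Newton-Schulz polynomial, which is not necessarily the polynomial or the coefficients used in Muon, and it does not reproduce the paper's preconditioner. For a matrix with singular value decomposition $X = U S V^\top$, the update $X \mapsto 1.5X - 0.5XX^\top X$ leaves $U$ and $V$ unchanged and maps each singular value $s$ to $1.5s - 0.5s^3$, so it is enough to track scalars. The function name `newton_schulz_scalar` and the sample singular values are hypothetical.

```python
def newton_schulz_scalar(s, steps):
    """Apply the cubic map s -> 1.5*s - 0.5*s**3 `steps` times.

    Assumption: this is the classical Newton-Schulz polynomial, used here
    only for illustration; it is not taken from the paper.
    """
    for _ in range(steps):
        s = 1.5 * s - 0.5 * s ** 3
    return s


# Hypothetical singular values of a matrix normalized to spectral norm 1.
singular_values = [1.0, 0.5, 0.1, 0.01]

# A handful of iterations, as a practical implementation might run.
after_5 = [newton_schulz_scalar(s, 5) for s in singular_values]

# Many iterations, approaching the ideal matrix sign (all values near 1).
after_40 = [newton_schulz_scalar(s, 40) for s in singular_values]

for s, few, many in zip(singular_values, after_5, after_40):
    print(f"s={s:<5} after 5 steps: {few:.4f}   after 40 steps: {many:.4f}")
```

With only a few steps, small singular values remain well below 1, so the resulting map is a nonlinear function of the spectrum and not the sign function, which would send every positive singular value to exactly 1.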

Preprint, 2022, arXiv

A universal black-box optimization method with almost dimension-free convergence rate guarantees

Universal methods for optimization are designed to achieve theoretically optimal convergence rates without any prior knowledge of the problem's regularity parameters or the accuracy of the gradient oracle employed by the optimizer. In this regard, existing state-of-the-art algorithms achieve an $\mathcal{O}(1/T^2)$ value convergence rate in Lipschitz smooth problems with a perfect gradient oracle, and an $\mathcal{O}(1/\sqrt{T})$ convergence rate when the underlying problem is non-smooth and/or the gradient oracle is stochastic. On the downside, these methods do not take into account the problem's dimensionality, and this can have a catastrophic impact on the achieved convergence rate, in both theory and practice. Our paper aims to bridge this gap by providing a scalable universal gradient method - dubbed UnderGrad - whose oracle complexity is almost dimension-free in problems with a favorable geometry (like the simplex, linearly constrained semidefinite programs and combinatorial bandits), while retaining the order-optimal dependence on $T$ described above. These "best-of-both-worlds" results are achieved via a primal-dual update scheme inspired by the dual exploration method for variational inequalities.

Preprint, 2022, arXiv

Routing in an Uncertain World: Adaptivity, Efficiency, and Equilibrium

We consider the traffic assignment problem in nonatomic routing games where the players' cost functions may be subject to random fluctuations (e.g., weather disturbances, perturbations in the underlying network, etc.). We tackle this problem from the viewpoint of a control interface that makes routing recommendations based solely on observed costs and without any further knowledge of the system's governing dynamics -- such as the network's cost functions, the distribution of any random events affecting the network, etc. In this online setting, learning methods based on the popular exponential weights algorithm converge to equilibrium at an $\mathcal{O}(1/\sqrt{T})$ rate: this rate is known to be order-optimal in stochastic networks, but it is otherwise suboptimal in static networks. In the latter case, it is possible to achieve an $\mathcal{O}(1/T^{2})$ equilibrium convergence rate via the use of finely tuned accelerated algorithms; on the other hand, these accelerated algorithms fail to converge altogether in the presence of persistent randomness, so it is not clear how to achieve the "best of both worlds" in terms of convergence speed. Our paper seeks to fill this gap by proposing an adaptive routing algorithm with the following desirable properties: $(i)$ it seamlessly interpolates between the $\mathcal{O}(1/T^{2})$ and $\mathcal{O}(1/\sqrt{T})$ rates for static and stochastic environments respectively; $(ii)$ its convergence speed is polylogarithmic in the number of paths in the network; $(iii)$ the method's per-iteration complexity and memory requirements are both linear in the number of nodes and edges in the network; and $(iv)$ it does not require any prior knowledge of the problem's parameters.
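As a point of reference for the baseline mentioned in the abstract, here is a minimal sketch of exponential weights on a toy static network. It is not the paper's adaptive algorithm. Everything specific is an assumption made for illustration: the two parallel links, their cost functions $c_1(x) = 2x$ and $c_2(x) = 1 + x$, the constant step size `eta`, the iteration count, and the function name `exponential_weights`. For this toy instance the equilibrium sends a fraction $2/3$ of the traffic over the first link, where both links cost $4/3$.

```python
import math


def exponential_weights(cost_fns, eta=0.1, steps=2000):
    """Route a unit of traffic over parallel links with exponential weights.

    cost_fns[i] maps the flow on link i to the cost of that link.
    Returns the flow played in the final iteration.
    All parameter values are illustrative assumptions.
    """
    n = len(cost_fns)
    scores = [0.0] * n          # cumulative negative costs per link
    flow = [1.0 / n] * n
    for _ in range(steps):
        # Softmax of the scores (shifted by the max for numerical stability).
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        flow = [w / total for w in weights]
        # Observe the cost of each link under the recommended flow.
        costs = [c(x) for c, x in zip(cost_fns, flow)]
        # Penalize each link in proportion to its observed cost.
        scores = [s - eta * c for s, c in zip(scores, costs)]
    return flow


# Hypothetical two-link network with linear costs.
links = [lambda x: 2.0 * x, lambda x: 1.0 + x]
flow = exponential_weights(links)
print("flow:", flow)
print("costs:", [c(x) for c, x in zip(links, flow)])
```

The sketch only uses observed costs, in line with the setting described above; handling random cost fluctuations or obtaining the faster static-network rate would require the adaptive machinery the paper proposes.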