Source author record

Yura Malitsky

Yura Malitsky appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.OC · Machine Learning · Distributed, Parallel, and Cluster Computing · math.NA · Numerical Analysis

Catalog footprint

What is connected

11 works

5 topics

4 close collaborators

preprint · 2026 · arXiv

A First-Order Algorithm for Decentralised Min-Max Problems

In this work, we consider a connected network of finitely many agents working cooperatively to solve a min-max problem with convex-concave structure. We propose a decentralised first-order algorithm which can be viewed as a non-trivial combination of two algorithms: PG-EXTRA for decentralised minimisation problems and the forward reflected backward method for (non-distributed) min-max problems. In each iteration of our algorithm, each agent computes the gradient of the smooth component of its local objective function as well as the proximal operator of its nonsmooth component, followed by a round of communication with its neighbours. Our analysis shows that the sequence generated by the method converges under standard assumptions with non-decaying stepsize.

preprint · 2022 · arXiv

Distributed Forward-Backward Methods for Ring Networks

In this work, we propose and analyse forward-backward-type algorithms for finding a zero of the sum of finitely many monotone operators, which are not based on reduction to a two-operator inclusion in the product space. Each iteration of the studied algorithms requires one resolvent evaluation per set-valued operator, one forward evaluation per cocoercive operator, and two forward evaluations per monotone operator. Unlike existing methods, the structure of the proposed algorithms is suitable for distributed, decentralised implementation in ring networks without needing global summation to enforce consensus between nodes.

preprint · 2022 · arXiv

Resolvent Splitting for Sums of Monotone Operators with Minimal Lifting

In this work, we study fixed point algorithms for finding a zero in the sum of $n\geq 2$ maximally monotone operators by using their resolvents. More precisely, we consider the class of such algorithms where each resolvent is evaluated only once per iteration. For any algorithm from this class, we show that the underlying fixed point operator is necessarily defined on a $d$-fold Cartesian product space with $d\geq n-1$. Further, we show that this bound is unimprovable by providing a family of examples for which $d=n-1$ is attained. This family includes the Douglas-Rachford algorithm as the special case when $n=2$. Applications of the new family of algorithms in distributed decentralised optimisation and multi-block extensions of the alternating direction method of multipliers (ADMM) are discussed.
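
For orientation, the $n=2$ special case mentioned in this abstract can be sketched in a few lines. This is only the classical Douglas-Rachford iteration, not the new family from the preprint. It assumes the standard form $z^{+} = z + J_B(2J_A(z) - z) - J_A(z)$, where each resolvent is evaluated once per iteration and the governing variable $z$ lives in a single copy of the space ($d = n-1 = 1$). The function name `douglas_rachford` and the two-line feasibility problem are illustrative assumptions; for normal cones of convex sets the resolvents reduce to projections.

```python
import numpy as np

def douglas_rachford(res_a, res_b, z0, iters=200):
    """Classical Douglas-Rachford sketch for 0 in A(x) + B(x).

    res_a, res_b: resolvents J_A and J_B, each called once per iteration.
    Returns the "shadow" point J_A(z), which approximates a solution.
    """
    z = np.asarray(z0, dtype=float)
    for _ in range(iters):
        x = res_a(z)                 # single evaluation of J_A
        y = res_b(2.0 * x - z)       # single evaluation of J_B at the reflected point
        z = z + y - x                # update of the governing variable z
    return res_a(z)

# Toy problem (an assumption for illustration): find a point in the
# intersection of two lines in the plane. A and B are the normal cones of
# the lines, so their resolvents are the orthogonal projections.
def project_line_1(z):
    # Line {(t, 0)}: keep the first coordinate, zero the second.
    return np.array([z[0], 0.0])

def project_line_2(z):
    # Line {(t, t - 1)}: passes through (1, 0) with unit direction (1, 1)/sqrt(2).
    p0 = np.array([1.0, 0.0])
    d = np.array([1.0, 1.0]) / np.sqrt(2.0)
    return p0 + d * np.dot(d, z - p0)

x_dr = douglas_rachford(project_line_1, project_line_2, z0=[5.0, 3.0])
print(x_dr)  # close to the intersection point (1, 0)
```

The two lines meet only at $(1, 0)$, so the printed point can be checked by hand.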

preprint · 2022 · arXiv

Stochastic Variance Reduction for Variational Inequality Methods

We propose stochastic variance reduced algorithms for solving convex-concave saddle point problems, monotone variational inequalities, and monotone inclusions. Our framework applies to extragradient, forward-backward-forward, and forward-reflected-backward methods both in Euclidean and Bregman setups. All proposed methods converge in the same setting as their deterministic counterparts and they either match or improve the best-known complexities for solving structured min-max problems. Our results reinforce the correspondence between variance reduction in variational inequalities and minimization. We also illustrate the improvements of our approach with numerical evaluations on matrix games.

preprint · 2020 · arXiv

A Forward-Backward Splitting Method for Monotone Inclusions Without Cocoercivity

In this work, we propose a simple modification of the forward-backward splitting method for finding a zero in the sum of two monotone operators. Our method converges under the same assumptions as Tseng's forward-backward-forward method, namely, it does not require cocoercivity of the single-valued operator. Moreover, each iteration only requires one forward evaluation rather than two as is the case for Tseng's method. Variants of the method incorporating a linesearch, relaxation and inertia, or a structured three operator inclusion are also discussed.
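
A minimal sketch of such a one-forward-evaluation scheme follows. It assumes the update $x_{k+1} = J_{\lambda A}\bigl(x_k - 2\lambda B(x_k) + \lambda B(x_{k-1})\bigr)$ with $\lambda < 1/(2L)$, which is how the forward-reflected-backward iteration is usually stated; the abstract itself does not spell out the formula. The function name `forb`, the stepsize value, and the bilinear test problem are illustrative assumptions, not part of this record.

```python
import numpy as np

def forb(resolvent, B, x0, lam, iters=2000):
    """Forward-reflected-backward sketch for 0 in A(x) + B(x).

    resolvent(v, lam) should return J_{lam A}(v); B is single-valued,
    monotone and L-Lipschitz. Assumed stepsize condition: lam < 1 / (2 L).
    """
    x = np.asarray(x0, dtype=float)
    b_prev = B(x)                    # B at the previous iterate (x_{-1} = x_0)
    for _ in range(iters):
        b = B(x)                     # the only new forward evaluation this iteration
        x = resolvent(x - 2.0 * lam * b + lam * b_prev, lam)
        b_prev = b                   # reuse the stored value at the next step
    return x

# Toy problem (an assumption for illustration): the saddle point of u * v
# over the box [-1, 1]^2. B(u, v) = (v, -u) is monotone and 1-Lipschitz but
# not cocoercive; A is the normal cone of the box, so its resolvent is a clip.
def B(z):
    return np.array([z[1], -z[0]])

def project_box(v, lam):
    # The resolvent of a normal cone is the projection; lam plays no role here.
    return np.clip(v, -1.0, 1.0)

z_forb = forb(project_box, B, x0=[1.0, 1.0], lam=0.4)
print(z_forb)  # approaches the unique saddle point (0, 0)
```

The stored value `b_prev` is what keeps the cost at one forward evaluation per iteration, compared with two for Tseng's method.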

preprint · 2020 · arXiv

A new regret analysis for Adam-type algorithms

In this paper, we focus on a theory-practice gap for Adam and its variants (AMSGrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter $\beta_{1}$ (typically between $0.9$ and $0.99$). In theory, regret guarantees for online convex optimization require a rapidly decaying $\beta_{1}\to 0$ schedule. We show that this is an artifact of the standard analysis and propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant $\beta_{1}$, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of different algorithms and settings.

preprint · 2020 · arXiv

Adaptive Gradient Descent without Descent

We present a strikingly simple proof that two rules are sufficient to automate gradient descent: 1) don't increase the stepsize too fast and 2) don't overstep the local curvature. No need for functional values, no line search, no information about the function except for the gradients. By following these rules, you get a method adaptive to the local geometry, with convergence guarantees depending only on the smoothness in a neighborhood of a solution. Given that the problem is convex, our method converges even if the global smoothness constant is infinity. As an illustration, it can minimize an arbitrary twice continuously differentiable convex function. We examine its performance on a range of convex and nonconvex problems, including logistic regression and matrix factorization.
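
The two rules can be read as two caps on the stepsize, with the smaller one taken at every iteration. The sketch below assumes the rule $\lambda_k = \min\bigl\{\sqrt{1+\theta_{k-1}}\,\lambda_{k-1},\ \|x_k - x_{k-1}\| / (2\|\nabla f(x_k) - \nabla f(x_{k-1})\|)\bigr\}$ with $\theta_k = \lambda_k/\lambda_{k-1}$, which is how the method is commonly stated; the abstract only describes the rules in words. The function name `adaptive_gd`, the initial stepsize `lam0`, the gradient-norm stopping test, and the quadratic test problem are assumptions made for illustration.

```python
import numpy as np

def adaptive_gd(grad, x0, lam0=1e-6, iters=10_000, tol=1e-10):
    """Adaptive gradient descent sketch that uses gradients only."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    lam_prev, theta = lam0, np.inf       # theta: ratio of the last two stepsizes
    x = x_prev - lam_prev * g_prev       # one small plain gradient step to start
    for _ in range(iters):
        g = grad(x)
        if np.linalg.norm(g) <= tol:     # stopping test added for this sketch
            break
        dx = np.linalg.norm(x - x_prev)
        dg = np.linalg.norm(g - g_prev)
        # Rule 1: do not increase the stepsize too fast.
        growth_cap = np.sqrt(1.0 + theta) * lam_prev
        # Rule 2: do not overstep the local curvature estimate dg / dx.
        curvature_cap = dx / (2.0 * dg) if dg > 0 else np.inf
        lam = min(growth_cap, curvature_cap)
        if not np.isfinite(lam):         # no curvature information available yet
            break
        x_prev, g_prev = x, g
        x = x - lam * g
        theta, lam_prev = lam / lam_prev, lam
    return x

# Toy problem (an assumption for illustration): a convex quadratic
# f(x) = 0.5 * sum(q * x**2) - b @ x, whose minimizer is b / q = (1, -2).
q = np.array([1.0, 4.0])
b = np.array([1.0, -8.0])
x_adgd = adaptive_gd(lambda x: q * x - b, x0=[0.0, 0.0])
print(x_adgd)  # close to (1, -2)
```

No function values, line search, or smoothness constant are passed in; the only problem information the routine receives is the gradient callback.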

preprint · 2020 · arXiv

Convergence of adaptive algorithms for weakly convex constrained optimization

We analyze the adaptive first-order algorithm AMSGrad for solving a constrained stochastic optimization problem with a weakly convex objective. We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope, which is the standard stationarity measure for this class of problems. It matches the known rates that adaptive algorithms enjoy for the specific case of unconstrained smooth stochastic optimization. Our analysis works with a mini-batch size of $1$, constant first- and second-order moment parameters, and possibly unbounded optimization domains. Finally, we illustrate the applications and extensions of our results to specific problems and algorithms.

preprint · 2020 · arXiv

Revisiting Stochastic Extragradient

We fix a fundamental issue in the stochastic extragradient method by providing a new sampling strategy that is motivated by approximating implicit updates. Since the existing stochastic extragradient algorithm, called Mirror-Prox, of Juditsky et al. (2011) diverges on a simple bilinear problem when the domain is not bounded, we prove guarantees for solving variational inequalities that go beyond existing settings. Furthermore, we illustrate numerically that the proposed variant converges faster than many other methods on bilinear saddle-point problems. We also discuss how extragradient can be applied to training Generative Adversarial Networks (GANs) and how it compares to other methods. Our experiments on GANs demonstrate that the introduced approach may make the training faster in terms of data passes, while its higher iteration complexity makes the advantage smaller.

preprint · 2019 · arXiv

Shadow Douglas-Rachford Splitting for Monotone Inclusions

In this work, we propose a new algorithm for finding a zero in the sum of two monotone operators where one is assumed to be single-valued and Lipschitz continuous. This algorithm naturally arises from a non-standard discretization of a continuous dynamical system associated with the Douglas-Rachford splitting algorithm. More precisely, it is obtained by performing an explicit, rather than implicit, discretization with respect to one of the operators involved. Each iteration of the proposed algorithm requires the evaluation of one forward and one backward operator.

preprint · 2018 · arXiv

Block-coordinate primal-dual method for the nonsmooth minimization over linear constraints

We consider the problem of minimizing a convex, separable, nonsmooth function subject to linear constraints. The numerical method we propose is a block-coordinate extension of the Chambolle-Pock primal-dual algorithm. We prove convergence of the method without resorting to assumptions like smoothness or strong convexity of the objective, a full-rank condition on the matrix, strong duality, or even consistency of the linear system. Freedom from imposing the latter assumption permits convergence guarantees for misspecified or noisy systems.
