Source author record

Pavel Dvurechensky

Pavel Dvurechensky appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

math.OC; Computational Complexity; Data Structures and Algorithms; Machine Learning; Computation; Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

30 works

6 topics

4 close collaborators

preprint, 2023, arXiv

Accelerated gradient methods with absolute and relative noise in the gradient

In this paper, we investigate accelerated first-order methods for smooth convex optimization problems under inexact information on the gradient of the objective. The noise in the gradient is considered to be additive with two possibilities: absolute noise bounded by a constant, and relative noise proportional to the norm of the gradient. We investigate the accumulation of the errors in the convex and strongly convex settings with the main difference with most of the previous works being that the feasible set can be unbounded. The key to the latter is to prove a bound on the trajectory of the algorithm. We also give a stopping criterion for the algorithm and consider extensions to the cases of stochastic optimization and composite nonsmooth problems.

preprint, 2022, arXiv

An Approach for Non-Convex Uniformly Concave Structured Saddle Point Problem

Recently, saddle point problems have received much attention due to their powerful modeling capability for a lot of problems from diverse domains. Applications of these problems occur in many applied areas, such as robust optimization, distributed optimization, game theory, and many applications in machine learning such as empirical risk minimization and generative adversarial networks training. Therefore, many researchers have actively worked on developing numerical methods for solving saddle point problems in many different settings. This paper is devoted to developing a numerical method for solving saddle point problems in the non-convex uniformly-concave setting. We study a general class of saddle point problems with composite structure and Hölder-continuous higher-order derivatives. To solve the problem under consideration, we propose an approach in which we reduce the problem to a combination of two auxiliary optimization problems separately for each group of variables: an outer minimization problem w.r.t. the primal variables, and an inner maximization problem w.r.t. the dual variables. For solving the outer minimization problem, we use the Adaptive Gradient Method, which is applicable for non-convex problems and also works with an inexact oracle that is generated by approximately solving the inner problem. For solving the inner maximization problem, we use the Restarted Unified Acceleration Framework, which is a framework that unifies the high-order acceleration methods for minimizing a convex function that has Hölder-continuous higher-order derivatives. Separate complexity bounds are provided for the number of calls to the first-order oracles for the outer minimization problem and higher-order oracles for the inner maximization problem. Moreover, the complexity of the whole proposed approach is then estimated.

preprint, 2022, arXiv

Decentralized convex optimization under affine constraints for power systems control

Modern power systems are undergoing a continuous process of massive change. Increased penetration of distributed generation and the use of energy storage and controllable demand require the introduction of a new control paradigm that does not rely on the massive information exchange required by centralized approaches. Distributed algorithms can rely only on limited information from neighbours to obtain an optimal solution for various optimization problems, such as optimal power flow, unit commitment, etc. As a generalization of these problems we consider the problem of decentralized minimization of the smooth and convex partially separable function $f = \sum_{k=1}^l f^k(x^k,\tilde x)$ under the coupled $\sum_{k=1}^l (A^k x^k - b^k) \leq 0$ and the shared $\tilde{A} \tilde{x} - \tilde{b} \leq 0$ affine constraints, where the information about $A^k$ and $b^k$ is only available for the $k$-th node of the computational network. One way to handle the coupled constraints in a distributed manner is to rewrite them in a distributed-friendly form using the Laplacian matrix of the communication graph and auxiliary variables (Khamisov, CDC, 2017). Instead of using this method, we reformulate the constrained optimization problem as a saddle point problem (SPP) and utilize the consensus constraint technique to make it distributed-friendly. Then we provide a complexity analysis for state-of-the-art SPP solving algorithms applied to this SPP.

preprint, 2022, arXiv

Generalized Mirror Prox for Monotone Variational Inequalities: Universality and Inexact Oracle

We introduce an inexact oracle model for variational inequalities (VI) with monotone operator, propose a numerical method which solves such VI's and analyze its convergence rate. As a particular case, we consider VI's with Hölder-continuous operator and show that our algorithm is universal. This means that without knowing the Hölder parameter $\nu$ and Hölder constant $L_\nu$ it has the best possible complexity for this class of VI's, namely our algorithm has complexity $O\left( \inf_{\nu\in[0,1]}\left(\frac{L_\nu}{\varepsilon} \right)^{\frac{2}{1+\nu}}R^2 \right)$, where $R$ is the size of the feasible set and $\varepsilon$ is the desired accuracy of the solution. We also consider the case of VI's with strongly monotone operator and generalize our method for VI's with inexact oracle and our universal method for this class of problems. Finally, we show how our method can be applied to convex-concave saddle point problems with Hölder-continuous partial subgradients.

preprint, 2022, arXiv

Oracle Complexity Separation in Convex Optimization

Many convex optimization problems have a structured objective function written as a sum of functions with different types of oracles (full gradient, coordinate derivative, stochastic gradient) and different evaluation complexity of these oracles. In the strongly convex case these functions also have different condition numbers, which eventually define the iteration complexity of first-order methods and the number of oracle calls required to achieve a given accuracy. Motivated by the desire to call the more expensive oracle fewer times, in this paper we consider minimization of a sum of two functions and propose a generic algorithmic framework to separate oracle complexities for each component in the sum. As a specific example, for the $\mu$-strongly convex problem $\min_{x\in \mathbb{R}^n} h(x) + g(x)$ with $L_h$-smooth function $h$ and $L_g$-smooth function $g$, a special case of our algorithm requires, up to a logarithmic factor, $O(\sqrt{L_h/\mu})$ first-order oracle calls for $h$ and $O(\sqrt{L_g/\mu})$ first-order oracle calls for $g$. Our general framework covers also the setting of strongly convex objectives, the setting when $g$ is given by coordinate derivative oracle, and the setting when $g$ has a finite-sum structure and is available through stochastic gradient oracle. In the latter two cases we obtain respectively accelerated random coordinate descent and accelerated variance reduction methods with oracle complexity separation.

preprint, 2021, arXiv

First-Order Methods for Convex Optimization

First-order methods for solving convex optimization problems have been at the forefront of mathematical optimization in the last 20 years. The rapid development of this important class of algorithms is motivated by the success stories reported in various applications, including most importantly machine learning, signal processing, imaging and control theory. First-order methods have the potential to provide low accuracy solutions at low computational complexity which makes them an attractive set of tools in large-scale optimization problems. In this survey we cover a number of key developments in gradient-based optimization methods. This includes non-Euclidean extensions of the classical proximal gradient method, and its accelerated versions. Additionally we survey recent developments within the class of projection-free methods, and proximal versions of primal-dual schemes. We give complete proofs for various key results, and highlight the unifying aspects of several optimization algorithms.

preprint, 2021, arXiv

Numerical methods for the resource allocation problem in networks

In this paper, we consider the resource allocation problem in a network with a large number of connections which are used by a huge number of users. The resource allocation problem under discussion is a maximization problem with linear inequality constraints. To solve this problem we construct the dual problem and propose to use the following numerical optimization methods for the dual: a fast gradient method, a stochastic projected subgradient method, an ellipsoid method, and a random gradient extrapolation method. A special focus is made on the primal-dual analysis of these methods. For each method we estimate the convergence rate. We also provide some modifications of these methods in the setup of distributed computations, taking into account their application to networks.

preprint, 2021, arXiv

Zeroth-order methods for noisy Hölder-gradient functions

In this paper, we prove new complexity bounds for zeroth-order methods in non-convex optimization with inexact observations of the objective function values. We use the Gaussian smoothing approach of Nesterov and Spokoiny [2015] and extend their results, obtained for optimization methods for smooth zeroth-order non-convex problems, to the setting of minimization of functions with Hölder-continuous gradient with noisy zeroth-order oracle, obtaining upper bounds on the noise as well. We consider a finite-difference gradient approximation based on normally distributed random vectors and prove that the gradient descent scheme based on this approximation converges to a stationary point of the smoothed function. We also consider convergence to a stationary point of the original (not smoothed) function and obtain bounds on the number of steps of the algorithm for making the norm of its gradient small. Additionally, we provide bounds for the level of noise in the zeroth-order oracle for which it is still possible to guarantee that the above bounds hold. We also consider separately the case of $\nu = 1$ and show that in this case the dependence of the obtained bounds on the dimension can be improved.

preprint, 2020, arXiv

Adaptive Gradient Descent for Convex and Non-Convex Stochastic Optimization

In this paper we propose several adaptive gradient methods for stochastic optimization. Unlike AdaGrad-type methods, our algorithms are based on Armijo-type line search and they simultaneously adapt to the unknown Lipschitz constant of the gradient and variance of the stochastic approximation for the gradient. We consider an accelerated and non-accelerated gradient descent for convex problems and gradient descent for non-convex problems. In the experiments we demonstrate the superiority of our methods over existing adaptive methods, e.g. AdaGrad and Adam.
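
As a rough illustration of the Armijo-type idea mentioned in this abstract, the sketch below runs plain gradient descent while adapting an estimate `L` of the unknown Lipschitz constant of the gradient. It is a deterministic simplification written for this record, not the authors' algorithm: it uses exact gradients, so the adaptation to the variance of a stochastic gradient is not covered. The function name `backtracking_gradient_descent` and its parameters are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def backtracking_gradient_descent(
    f: Callable[[Vector], float],
    grad: Callable[[Vector], Vector],
    x0: Vector,
    L0: float = 1.0,
    iters: int = 100,
) -> Tuple[Vector, float]:
    """Gradient descent with an Armijo-type search over the Lipschitz estimate L.

    Assumption: exact (noise-free) gradients; this is an illustrative sketch only.
    """
    x, L = list(x0), L0
    for _ in range(iters):
        g = grad(x)
        g_sq = sum(gi * gi for gi in g)
        if g_sq == 0.0:
            break  # stationary point reached
        L = max(L / 2.0, 1e-12)  # optimistic guess: try a longer step first
        while True:
            x_new = [xi - gi / L for xi, gi in zip(x, g)]
            # Sufficient-decrease test that holds whenever L bounds the true constant.
            if f(x_new) <= f(x) - g_sq / (2.0 * L):
                break
            L *= 2.0  # step was too long: increase the estimate and retry
        x = x_new
    return x, L
```

Halving `L` before each search lets the step size grow again in flat regions, so the caller does not need to supply the true constant.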

preprint, 2020, arXiv

Alternating Minimization Methods for Strongly Convex Optimization

We consider alternating minimization procedures for convex optimization problems with the variable divided into many blocks, each block being amenable to minimization with respect to its variables while the other blocks are held fixed. In the case of two blocks, we prove a linear convergence rate for the alternating minimization procedure under the Polyak-Łojasiewicz condition, which can be seen as a relaxation of the strong convexity assumption. Under the strong convexity assumption in the many-blocks setting we provide an accelerated alternating minimization procedure with a linear rate depending on the square root of the condition number, as opposed to the condition number for the non-accelerated method. We also mention the problem of approximating a non-negative solution to a linear system of equations $Ax=y$ via alternating minimization of the Kullback-Leibler (KL) divergence between $Ax$ and $y$.

preprint, 2020, arXiv

An Accelerated Directional Derivative Method for Smooth Stochastic Convex Optimization

We consider smooth stochastic convex optimization problems in the context of algorithms which are based on directional derivatives of the objective function. This context can be considered as an intermediate one between derivative-free optimization and gradient-based optimization. We assume that at any given point and for any given direction, a stochastic approximation for the directional derivative of the objective function at this point and in this direction is available with some additive noise. The noise is assumed to be of an unknown nature, but bounded in the absolute value. We underline that we consider directional derivatives in any direction, as opposed to coordinate descent methods which use only derivatives in coordinate directions. For this setting, we propose a non-accelerated and an accelerated directional derivative method and provide their complexity bounds. Our non-accelerated algorithm has a complexity bound which is similar to the gradient-based algorithm, that is, without any dimension-dependent factor. Our accelerated algorithm has a complexity bound which coincides with the complexity bound of the accelerated gradient-based algorithm up to a factor of square root of the problem dimension. We extend these results to strongly convex problems.

preprint, 2020, arXiv

An Accelerated Method for Derivative-Free Smooth Stochastic Convex Optimization

We consider an unconstrained problem of minimizing a smooth convex function which is only available through noisy observations of its values, the noise consisting of two parts. Similar to stochastic optimization problems, the first part is of stochastic nature. The second part is additive noise of unknown nature, but bounded in absolute value. In the two-point feedback setting, i.e. when pairs of function values are available, we propose an accelerated derivative-free algorithm together with its complexity analysis. The complexity bound of our derivative-free algorithm is larger only by a factor of $\sqrt{n}$ than the bound for accelerated gradient-based algorithms, where $n$ is the dimension of the decision variable. We also propose a non-accelerated derivative-free algorithm with a complexity bound similar to the stochastic-gradient-based algorithm, that is, our bound does not have any dimension-dependent factor except logarithmic. Notably, if the difference between the starting point and the solution is a sparse vector, for both our algorithms, we obtain a better complexity bound if the algorithm uses a $1$-norm proximal setup, rather than the Euclidean proximal setup, which is a standard choice for unconstrained problems.
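
To make "two-point feedback" concrete, the sketch below builds a gradient estimate from a pair of function values along a random direction on the unit sphere. It is a standard estimator of this kind, given as an assumption-laden illustration rather than the exact construction from the paper. The names `random_unit_vector`, `two_point_gradient_estimate` and the smoothing parameter `tau` are hypothetical.

```python
import math
import random
from typing import Callable, List

Vector = List[float]


def random_unit_vector(n: int, rng: random.Random) -> Vector:
    """Draw a direction uniformly from the unit Euclidean sphere in R^n."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 0.0:
            return [vi / norm for vi in v]


def two_point_gradient_estimate(
    f: Callable[[Vector], float],
    x: Vector,
    tau: float,
    rng: random.Random,
) -> Vector:
    """Estimate grad f(x) from the two values f(x + tau * e) and f(x)."""
    n = len(x)
    e = random_unit_vector(n, rng)
    x_shift = [xi + tau * ei for xi, ei in zip(x, e)]
    slope = (f(x_shift) - f(x)) / tau  # finite-difference directional derivative
    # Scaling by n compensates for E[e e^T] = I / n on the unit sphere.
    return [n * slope * ei for ei in e]
```

For a linear function the estimate is unbiased, so averaging many draws recovers the gradient. For a general smooth function the finite difference adds a bias that shrinks with `tau`.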

preprint, 2020, arXiv

Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters

We study the decentralized distributed computation of discrete approximations for the regularized Wasserstein barycenter of a finite set of continuous probability measures distributedly stored over a network. We assume there is a network of agents/machines/computers, and each agent holds a private continuous probability measure and seeks to compute the barycenter of all the measures in the network by getting samples from its local measure and exchanging information with its neighbors. Motivated by this problem, we develop, and analyze, a novel accelerated primal-dual stochastic gradient method for general stochastic convex optimization problems with linear equality constraints. Then, we apply this method to the decentralized distributed optimization setting to obtain a new algorithm for the distributed semi-discrete regularized Wasserstein barycenter problem. Moreover, we show explicit non-asymptotic complexity for the proposed algorithm.

preprint, 2020, arXiv

Inexact Model: A Framework for Optimization and Variational Inequalities

In this paper we propose a general algorithmic framework for first-order methods in optimization in a broad sense, including minimization problems, saddle-point problems and variational inequalities. This framework makes it possible to obtain many known methods as special cases, including the accelerated gradient method, composite optimization methods, level-set methods, and proximal methods. The idea of the framework is based on constructing an inexact model of the main problem component, i.e. the objective function in optimization or the operator in variational inequalities. Besides reproducing known results, our framework allows us to construct new methods, which we illustrate by constructing a universal method for variational inequalities with composite structure. This method works for smooth and non-smooth problems with optimal complexity without a priori knowledge of the problem smoothness. We also generalize our framework for strongly convex objectives and strongly monotone variational inequalities.

preprint, 2020, arXiv

Multimarginal Optimal Transport by Accelerated Alternating Minimization

We consider the multimarginal optimal transport problem, which includes as a particular case the Wasserstein barycenter problem. In this problem one has to find an optimal coupling between $m$ probability measures, which amounts to finding a tensor of order $m$. We propose an accelerated method based on accelerated alternating minimization and estimate its complexity to find the approximate solution to the problem. We use entropic regularization with a sufficiently small regularization parameter and apply accelerated alternating minimization to the dual problem. A novel primal-dual analysis is used to reconstruct the approximately optimal coupling tensor. Our algorithm exhibits a better computational complexity than the state-of-the-art methods for some regimes of the problem parameters.

preprint, 2020, arXiv

Numerical methods in large-scale optimization: inexact oracle and primal-dual analysis

This is a short summary of already published results on accelerated first- and zeroth-order optimization methods, as well as accelerated methods for problems with linear constraints. This short summary is a requirement for obtaining the degree of Doctor of Sciences in the Russian Federation. The details can be found in the papers listed in the introduction.

preprint, 2020, arXiv

On the Complexity of Approximating Wasserstein Barycenter

We study the complexity of approximating the Wasserstein barycenter of $m$ discrete measures, or histograms of size $n$, by contrasting two alternative approaches, both using entropic regularization. The first approach is based on the Iterative Bregman Projections (IBP) algorithm, for which our novel analysis gives a complexity bound proportional to $\frac{mn^2}{\varepsilon^2}$ to approximate the original non-regularized barycenter. Using an alternative accelerated-gradient-descent-based approach, we obtain a complexity proportional to $\frac{mn^{2.5}}{\varepsilon}$. As a byproduct, we show that the regularization parameter in both approaches has to be proportional to $\varepsilon$, which causes instability of both algorithms when the desired accuracy is high. To overcome this issue, we propose a novel proximal-IBP algorithm, which can be seen as a proximal gradient method that uses IBP on each iteration to make a proximal step. We also consider the question of scalability of these algorithms using approaches from distributed optimization and show that the first algorithm can be implemented in a centralized distributed setting (master/slave), while the second one is amenable to a more general decentralized distributed setting with an arbitrary network topology.

preprint, 2020, arXiv

On the Optimal Combination of Tensor Optimization Methods

We consider the minimization problem of a sum of a number of functions having Lipschitz $p$-th order derivatives with different Lipschitz constants. In this case, to accelerate optimization, we propose a general framework that makes it possible to obtain near-optimal oracle complexity for each function in the sum separately, meaning, in particular, that the oracle for a function with a lower Lipschitz constant is called a smaller number of times. As a building block, we extend the current theory of tensor methods and show how to generalize near-optimal tensor methods to work with an inexact tensor step. Further, we investigate the situation when the functions in the sum have Lipschitz derivatives of a different order. For this situation, we propose a generic way to separate the oracle complexity between the parts of the sum. Our method is not optimal, which leads to an open problem of the optimal combination of oracles of a different order.

preprint, 2020, arXiv

Self-Concordant Analysis of Frank-Wolfe Algorithms

Projection-free optimization via different variants of the Frank-Wolfe (FW) method, a.k.a. the Conditional Gradient method, has become one of the cornerstones in optimization for machine learning, since in many cases the linear minimization oracle is much cheaper to implement than projections and some sparsity needs to be preserved. In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function having unbounded curvature, implying absence of theoretical guarantees for the existing FW methods. We use the theory of SC functions to provide a new adaptive step size for FW methods and prove a global convergence rate of $O(1/k)$ after $k$ iterations. If the problem admits a stronger local linear minimization oracle, we construct a novel FW method with a linear convergence rate for SC functions.
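
For readers unfamiliar with the linear minimization oracle mentioned in this abstract, the sketch below shows the classical Frank-Wolfe iteration over the probability simplex, where the oracle reduces to picking the coordinate with the smallest gradient entry. It uses the textbook open-loop step size $2/(k+2)$ and is not the adaptive step size for self-concordant functions proposed in the paper. The names `simplex_lmo` and `frank_wolfe` are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def simplex_lmo(g: Vector) -> Vector:
    """Linear minimization oracle: argmin of <g, s> over the probability simplex."""
    i = min(range(len(g)), key=lambda j: g[j])
    s = [0.0] * len(g)
    s[i] = 1.0  # the minimizer is a vertex, which keeps iterates sparse
    return s


def frank_wolfe(
    grad: Callable[[Vector], Vector],
    x0: Vector,
    iters: int = 200,
) -> Vector:
    """Classical Frank-Wolfe with the open-loop step size 2 / (k + 2)."""
    x = list(x0)
    for k in range(iters):
        s = simplex_lmo(grad(x))
        gamma = 2.0 / (k + 2.0)
        # Convex combination of feasible points: no projection is needed.
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x
```

Each iterate is a convex combination of simplex vertices, so feasibility is maintained without ever computing a projection.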

preprint, 2019, arXiv

The global rate of convergence for optimal tensor methods in smooth convex optimization

We consider convex optimization problems with the objective function having Lipschitz-continuous $p$-th order derivative, where $p\geq 1$. We propose a new tensor method, which closes the gap between the lower $O\left(\varepsilon^{-\frac{2}{3p+1}} \right)$ and upper $O\left(\varepsilon^{-\frac{1}{p+1}} \right)$ iteration complexity bounds for this class of optimization problems. We also consider uniformly convex functions, and show how the proposed method can be accelerated under this additional assumption. Moreover, we introduce a $p$-th order condition number which naturally arises in the complexity analysis of tensor methods under this assumption. Finally, we make a numerical study of the proposed optimal method and show that in practice it is faster than the best known accelerated tensor method. We also compare the performance of tensor methods for $p=2$ and $p=3$ and show that the 3rd-order method is superior to the 2nd-order method in practice.

preprint, 2016, arXiv

Efficient calculation of stochastic equilibriums in the Beckmann's and stable dynamic models

We propose a composite approach to a special sum-type convex optimization problem with an affine restriction and a special entropy-type regularization. Since the functional has a penalty-type form, we reformulate the initial constrained optimization problem in a special unconstrained form that allows us to put the penalty-type functional into the composite term. We also describe the characteristic-functions-on-graphs technique (Yu. Nesterov, 2007) as applied to the dual problem.

preprint, 2016, arXiv

Efficient numerical algorithms for regularized regression problem with applications to traffic matrix estimations

In this work we collect and compare many different numerical methods for the regularized regression problem and for the problem of projection onto a hyperplane. Such problems arise, for example, as a subproblem of demand matrix estimation in IP networks. In this special case the matrix of affine constraints has a special structure: all elements are 0 or 1, and the matrix is sparse enough. We have to deal with a huge-scale convex optimization problem of a special type. Using the properties of the problem, we try "to look inside the black box" and see how the best modern methods work when applied to this problem.

preprint, 2016, arXiv

Fast Primal-Dual Gradient Method for Strongly Convex Minimization Problems with Linear Constraints

In this paper we consider a class of optimization problems with a strongly convex objective function and the feasible set given by an intersection of a simple convex set with a set given by a number of linear equality and inequality constraints. A number of optimization problems in applications can be stated in this form, examples being entropy-linear programming, ridge regression, the elastic net, regularized optimal transport, etc. We extend the Fast Gradient Method applied to the dual problem in order to make it primal-dual, so that it not only solves the dual problem but also constructs a nearly optimal and nearly feasible solution of the primal problem. We also prove a theorem about the convergence rate for the proposed algorithm in terms of the objective function and the infeasibility of the linear constraints.

preprint, 2016, arXiv

Gradient and gradient-free methods for stochastic convex optimization with inexact oracle

In the paper we generalize the universal gradient method (Yu. Nesterov) to the strongly convex case and to the intermediate gradient method (Devolder-Glineur-Nesterov). We also consider possible generalizations to the stochastic and online contexts. We show how these results can be generalized to gradient-free methods and to the method of random direction search. The main ingredient of this paper is the assumption about the oracle: we consider the oracle to be inexact.

preprint, 2016, arXiv

Learning Supervised PageRank with Gradient-Based and Gradient-Free Optimization Methods

In this paper, we consider a non-convex loss-minimization problem of learning Supervised PageRank models, which can account for some properties not considered by classical approaches such as the classical PageRank model. We propose gradient-based and random gradient-free methods to solve this problem. Our algorithms are based on the concept of an inexact oracle and, unlike the state-of-the-art gradient-based method, we manage to provide theoretical convergence rate guarantees for both of them. In particular, under the assumption of local convexity of the loss function, our random gradient-free algorithm guarantees a decrease of the expected loss function value. At the same time, we theoretically justify that, without a convexity assumption for the loss function, our gradient-based algorithm makes it possible to find a point where the stationary condition is fulfilled with a given accuracy. For both proposed optimization algorithms, we find the settings of hyperparameters which give the lowest complexity (i.e., the number of arithmetic operations needed to achieve the given accuracy of the solution of the loss-minimization problem). The resulting estimates of the complexity are also provided. Finally, we apply the proposed optimization algorithms to the web page ranking problem and compare the proposed and state-of-the-art algorithms in terms of the considered loss function.

preprint, 2016, arXiv

Primal-Dual Method for Searching Equilibrium in Hierarchical Congestion Population Games

In this paper, we consider a large class of hierarchical congestion population games. One can show that the equilibrium in a game of this type can be described as a minimum point in a properly constructed multi-level convex optimization problem. We propose a fast primal-dual composite gradient method and apply it to the problem which is dual to the problem describing the equilibrium in the considered class of games. We prove that this method makes it possible to find an approximate solution of the initial problem without increasing the complexity.

preprint, 2016, arXiv

Searching equillibriums in Beckmann's and Nesterov--de Palma's models

In this paper we propose and develop the classical Frank--Wolfe algorithm for Beckmann-type models. This is not new, but we investigate details that allow us to speed it up. We also consider stable-dynamic-like models. The first model of this type was proposed 15 years ago by Yu. Nesterov and A. de Palma. We propose a randomized dual averaging method with a special (sum-type) randomization. For both problems we obtain rates of convergence. It seems that these estimates cannot be improved without additional assumptions about the problem formulation.

preprint, 2016, arXiv

Universal method with inexact oracle and its applications for searching equillibriums in multistage transport problems

In this paper we propose a new efficient approach for the numerical calculation of equilibria in multistage transport problems. At the very core of our approach lies the proper combination of the Universal Gradient Method proposed by Yu. Nesterov (2013) and the concept of an inexact oracle (Devolder--Glineur--Nesterov, 2011). In particular, our technique allows us to calculate the Wasserstein barycenter quickly (this result generalizes M. Cuturi et al. (2014)).

preprint, 2015, arXiv

Learning Supervised PageRank with Gradient-Free Optimization Methods

In this paper, we consider the problem of learning supervised PageRank models, which can account for some properties not considered by classical approaches such as the classical PageRank algorithm. Due to the huge hidden dimension of the optimization problem, we use random gradient-free methods to solve it. We prove a convergence theorem and estimate the number of arithmetic operations needed to solve it with a given accuracy. We find the best settings of the gradient-free optimization method in terms of the number of arithmetic operations needed to achieve a given accuracy of the objective. In the paper, we apply our algorithm to the web page ranking problem. We consider a parametric graph model of users' behavior and evaluate web pages' relevance to queries by our algorithm. The experiments show that our optimization method outperforms the untuned gradient-free method in ranking quality.

preprint, 2015, arXiv

Stochastic Intermediate Gradient Method for Convex Problems with Inexact Stochastic Oracle

In this paper we introduce new methods for convex optimization problems with an inexact stochastic oracle. The first method is an extension of the intermediate gradient method proposed by Devolder, Glineur and Nesterov for problems with an inexact oracle. Our new method can be applied to problems with composite structure and a stochastic inexact oracle, and allows using a non-Euclidean setup. We prove estimates for the mean rate of convergence and the probabilities of large deviations from this rate. We also introduce two modifications of this method for strongly convex problems. For the first modification we prove mean rate of convergence estimates, and for the second we prove estimates for large deviations from the mean rate of convergence. All the rates give complexity estimates for the proposed methods which, up to a multiplicative constant, coincide with the lower complexity bound for the considered class of convex composite optimization problems with a stochastic inexact oracle.
