Source author record

Darina Dvinskikh

Darina Dvinskikh appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.OC; Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

13 works

3 topics

4 close collaborators


preprint, 2022, arXiv

On the relations of stochastic convex optimization problems with empirical risk minimization problems on $p$-norm balls

In this paper, we consider convex stochastic optimization problems arising in machine learning applications (e.g., risk minimization) and mathematical statistics (e.g., maximum likelihood estimation). There are two main approaches to solving such problems: the Stochastic Approximation approach (the online approach) and the Sample Average Approximation approach, also known as the Monte Carlo approach (the offline approach). In the offline approach, the problem is replaced by its empirical counterpart (the empirical risk minimization problem). The natural question is how to choose the sample size, i.e., how many realizations should be sampled so that a sufficiently accurate solution of the empirical problem is a solution of the original problem with the desired precision. This is one of the main issues in modern machine learning and optimization. In the last decade, many significant advances were made in these areas for solving convex stochastic optimization problems on Euclidean balls (or the whole space). In this work, we build on these advances and study the case of arbitrary balls in the $\ell_p$-norms. We also explore how the parameter $p$ affects the estimates of the required number of terms in the empirical risk.
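The two approaches contrasted in this abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes the one-dimensional objective $E[\frac{1}{2}(x-\xi)^2]$ on the interval $[-1, 1]$ (in one dimension every $\ell_p$-ball is an interval), and the function names `project`, `saa_solution` and `sa_solution` are hypothetical.

```python
import random


def project(x, radius=1.0):
    """Project a scalar onto the interval [-radius, radius]."""
    return max(-radius, min(radius, x))


def saa_solution(samples, radius=1.0):
    """Offline (SAA) approach: minimize the empirical risk
    (1/N) * sum 0.5 * (x - xi)^2 over the interval.
    For this toy objective the minimizer is the projected sample mean."""
    return project(sum(samples) / len(samples), radius)


def sa_solution(samples, radius=1.0):
    """Online (SA) approach: one projected stochastic gradient step
    per sample, with step size 1/k (an assumed, standard choice)."""
    x = 0.0
    for k, xi in enumerate(samples, start=1):
        stochastic_grad = x - xi  # gradient of 0.5 * (x - xi)^2 in x
        x = project(x - stochastic_grad / k, radius)
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.3, 1.0) for _ in range(5000)]  # true minimizer: 0.3
    print("SAA:", saa_solution(data))
    print("SA: ", sa_solution(data))
```

Both estimates approach the true minimizer as the sample size grows; the sample-size question in the abstract asks how large that sample must be for a prescribed accuracy.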

preprint, 2022, arXiv

Oracle Complexity Separation in Convex Optimization

Many convex optimization problems have a structured objective function written as a sum of functions with different types of oracles (full gradient, coordinate derivative, stochastic gradient) and different evaluation complexities of these oracles. In the strongly convex case these functions also have different condition numbers, which eventually define the iteration complexity of first-order methods and the number of oracle calls required to achieve a given accuracy. Motivated by the desire to call the more expensive oracle fewer times, in this paper we consider minimization of a sum of two functions and propose a generic algorithmic framework to separate oracle complexities for each component in the sum. As a specific example, for the $\mu$-strongly convex problem $\min_{x\in \mathbb{R}^n} h(x) + g(x)$ with $L_h$-smooth function $h$ and $L_g$-smooth function $g$, a special case of our algorithm requires, up to a logarithmic factor, $O(\sqrt{L_h/\mu})$ first-order oracle calls for $h$ and $O(\sqrt{L_g/\mu})$ first-order oracle calls for $g$. Our general framework also covers the setting of strongly convex objectives, the setting when $g$ is given by a coordinate derivative oracle, and the setting when $g$ has a finite-sum structure and is available through a stochastic gradient oracle. In the latter two cases we obtain, respectively, accelerated random coordinate descent and accelerated variance reduction methods with oracle complexity separation.
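To see what the separation buys, consider illustrative condition numbers, assumed here and not taken from the paper: $L_h/\mu = 10^2$ and $L_g/\mu = 10^4$. A standard accelerated gradient method applied to $h+g$ as a whole calls both gradients at every iteration, so, ignoring constants and logarithmic factors,

$$N_h^{\text{joint}} \approx N_g^{\text{joint}} \approx \sqrt{(L_h+L_g)/\mu} = \sqrt{10^2 + 10^4} \approx 100.5,$$

whereas the separated bounds quoted in the abstract give

$$N_h \approx \sqrt{L_h/\mu} = 10, \qquad N_g \approx \sqrt{L_g/\mu} = 100.$$

Under these assumed values the oracle for $h$ is called roughly ten times less often, which matters when $h$ is the expensive component.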

preprint, 2021, arXiv

Decentralized and Parallel Primal and Dual Accelerated Methods for Stochastic Convex Programming Problems

We introduce primal and dual stochastic gradient oracle methods for decentralized convex optimization problems. For both primal and dual oracles, the proposed methods are optimal in terms of the number of communication steps. However, for all classes of the objective, the optimality in terms of the number of oracle calls per node holds only up to a logarithmic factor and the notion of smoothness. Using a mini-batching technique, we show that the proposed methods with a stochastic oracle can be additionally parallelized at each node. The considered algorithms can be applied to many data science problems and inverse problems.

preprint, 2021, arXiv

Improved Complexity Bounds in Wasserstein Barycenter Problem

In this paper, we focus on computational aspects of the Wasserstein barycenter problem. We propose two algorithms to compute Wasserstein barycenters of $m$ discrete measures of size $n$ with accuracy $\varepsilon$. The first algorithm, based on mirror prox with a specific norm, meets the complexity of the celebrated accelerated iterative Bregman projections (IBP), namely $\widetilde O(mn^2\sqrt n/\varepsilon)$, but without the limitations of the (accelerated) IBP, which is numerically unstable under a small regularization parameter. The second algorithm, based on area-convexity and dual extrapolation, improves the previously best-known convergence rates for the Wasserstein barycenter problem, enjoying $\widetilde O(mn^2/\varepsilon)$ complexity.

preprint, 2021, arXiv

Recent theoretical advances in decentralized distributed convex optimization

In the last few years, the theory of decentralized distributed convex optimization has made significant progress. Lower bounds on communication rounds and oracle calls have appeared, as well as methods that reach both of these bounds. In this paper, we focus on how these results can be explained based on optimal algorithms for the non-distributed setup. In particular, we provide our recent results that have not been published yet and that can be found in detail only in arXiv preprints.

preprint, 2020, arXiv

Accelerated and nonaccelerated stochastic gradient descent with inexact model

In this paper, we propose a new way to obtain optimal convergence rates for smooth stochastic (strong) convex optimization tasks. Our approach is based on results for optimization tasks where gradients have nonrandom noise. In contrast to previously known results, we extend our idea to the inexact model conception.

preprint, 2020, arXiv

Accelerated and nonaccelerated stochastic gradient descent with model conception

In this paper, we describe a new way to get convergence rates for optimal methods in smooth (strongly) convex optimization tasks. Our approach is based on results for tasks where gradients have nonrandom small noises. Unlike previous results, we obtain convergence rates with model conception.

preprint, 2020, arXiv

Accelerated gradient sliding and variance reduction

We consider a sum-type strongly convex optimization problem (first term) with a smooth convex composite term that is not proximal-friendly (second term). We show that the complexity of this problem can be split into the optimal number of incremental oracle calls for the first (sum-type) term and the optimal number of oracle calls for the second (composite) term. Here, by "optimal number" we mean an estimate that corresponds to the well-known lower bound in the absence of the other term.

preprint, 2020, arXiv

Accelerated methods for composite non-bilinear saddle point problem

Based on G. Lan's accelerated gradient sliding and a general relation between the smoothness and strong convexity parameters of a function under the Legendre transformation, we show that under rather general conditions the best known bounds for the bilinear convex-concave smooth composite saddle point problem remain true for the non-bilinear convex-concave smooth composite saddle point problem. Moreover, we describe situations when the bounds differ and explain the nature of the difference.

preprint, 2020, arXiv

Adaptive Gradient Descent for Convex and Non-Convex Stochastic Optimization

In this paper we propose several adaptive gradient methods for stochastic optimization. Unlike AdaGrad-type methods, our algorithms are based on Armijo-type line search, and they simultaneously adapt to the unknown Lipschitz constant of the gradient and the variance of the stochastic approximation of the gradient. We consider accelerated and non-accelerated gradient descent for convex problems and gradient descent for non-convex problems. In the experiments we demonstrate the superiority of our methods over existing adaptive methods, e.g., AdaGrad and Adam.
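The idea of adapting to an unknown Lipschitz constant through an Armijo-type test can be sketched in a simplified, deterministic form. The code below is not the authors' algorithm: it uses exact gradients, so the variance adaptation described in the abstract is not shown, and the name `adaptive_gradient_descent` and its parameters are hypothetical.

```python
def adaptive_gradient_descent(f, grad, x0, L0=1.0, iters=300):
    """Gradient descent with a backtracking (Armijo-type) estimate L
    of the gradient's Lipschitz constant. Deterministic sketch only."""
    x, L = list(x0), L0
    for _ in range(iters):
        g = grad(x)
        g2 = sum(gi * gi for gi in g)  # squared gradient norm
        if g2 == 0.0:
            break  # stationary point reached
        fx = f(x)
        L = max(L / 2.0, 1e-12)  # optimistic guess: try a larger step first
        while True:
            x_new = [xi - gi / L for xi, gi in zip(x, g)]
            # Sufficient-decrease test; it is guaranteed to hold once L
            # is at least the true Lipschitz constant of the gradient.
            if f(x_new) <= fx - g2 / (2.0 * L) or L > 1e12:
                break
            L *= 2.0  # step was too long: increase the estimate
        x = x_new
    return x, L


if __name__ == "__main__":
    a = [1.0, 10.0]  # quadratic with Lipschitz constant 10, unknown to the method
    f = lambda x: 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))
    grad = lambda x: [ai * xi for ai, xi in zip(a, x)]
    x, L = adaptive_gradient_descent(f, grad, [1.0, 1.0])
    print("f(x) =", f(x), "final L estimate =", L)
```

Halving `L` before each line search lets the step size grow again in flat regions, which is what distinguishes this scheme from a fixed step size of `1 / L`.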

preprint, 2020, arXiv

Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters

We study the decentralized distributed computation of discrete approximations for the regularized Wasserstein barycenter of a finite set of continuous probability measures distributedly stored over a network. We assume there is a network of agents/machines/computers, and each agent holds a private continuous probability measure and seeks to compute the barycenter of all the measures in the network by getting samples from its local measure and exchanging information with its neighbors. Motivated by this problem, we develop, and analyze, a novel accelerated primal-dual stochastic gradient method for general stochastic convex optimization problems with linear equality constraints. Then, we apply this method to the decentralized distributed optimization setting to obtain a new algorithm for the distributed semi-discrete regularized Wasserstein barycenter problem. Moreover, we show explicit non-asymptotic complexity for the proposed algorithm.

preprint, 2020, arXiv

Inexact Model: A Framework for Optimization and Variational Inequalities

In this paper we propose a general algorithmic framework for first-order methods in optimization in a broad sense, including minimization problems, saddle-point problems and variational inequalities. This framework allows one to obtain many known methods as special cases, including the accelerated gradient method, composite optimization methods, level-set methods, and proximal methods. The idea of the framework is based on constructing an inexact model of the main problem component, i.e., the objective function in optimization or the operator in variational inequalities. Besides reproducing known results, our framework allows one to construct new methods, which we illustrate by constructing a universal method for variational inequalities with composite structure. This method works for smooth and non-smooth problems with optimal complexity without a priori knowledge of the problem smoothness. We also generalize our framework to strongly convex objectives and strongly monotone variational inequalities.

preprint, 2020, arXiv

On the Complexity of Approximating Wasserstein Barycenter

We study the complexity of approximating the Wasserstein barycenter of $m$ discrete measures, or histograms of size $n$, by contrasting two alternative approaches, both using entropic regularization. The first approach is based on the Iterative Bregman Projections (IBP) algorithm, for which our novel analysis gives a complexity bound proportional to $\frac{mn^2}{\varepsilon^2}$ to approximate the original non-regularized barycenter. Using an alternative accelerated-gradient-descent-based approach, we obtain a complexity proportional to $\frac{mn^{2.5}}{\varepsilon}$. As a byproduct, we show that the regularization parameter in both approaches has to be proportional to $\varepsilon$, which causes instability of both algorithms when the desired accuracy is high. To overcome this issue, we propose a novel proximal-IBP algorithm, which can be seen as a proximal gradient method that uses IBP on each iteration to make a proximal step. We also consider the question of scalability of these algorithms using approaches from distributed optimization and show that the first algorithm can be implemented in a centralized distributed setting (master/slave), while the second one is amenable to a more general decentralized distributed setting with an arbitrary network topology.
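For orientation, the basic IBP iteration for the entropy-regularized barycenter can be sketched as follows. This is the standard textbook scheme written in plain Python, not the analysis or the proximal-IBP variant from the paper; the function name `ibp_barycenter` and the parameter names (`gamma` for the regularization parameter, `iters`) are assumptions made for this example.

```python
import math


def ibp_barycenter(hists, cost, gamma, weights=None, iters=100):
    """Entropy-regularized Wasserstein barycenter of histograms `hists`
    (each of length n) via Iterative Bregman Projections.
    `cost` is an n x n ground-cost matrix given as nested lists."""
    m, n = len(hists), len(hists[0])
    w = weights or [1.0 / m] * m
    # Gibbs kernel; a very small gamma makes these entries underflow,
    # which is the instability mentioned in the abstract.
    K = [[math.exp(-cost[i][j] / gamma) for j in range(n)] for i in range(n)]
    u = [[1.0] * n for _ in range(m)]
    q = [1.0 / n] * n
    for _ in range(iters):
        # Projection 1: make the column marginal of each plan equal p_l.
        v = []
        for l in range(m):
            KTu = [sum(K[i][j] * u[l][i] for i in range(n)) for j in range(n)]
            v.append([hists[l][j] / KTu[j] for j in range(n)])
        Kv = [[sum(K[i][j] * v[l][j] for j in range(n)) for i in range(n)]
              for l in range(m)]
        # Projection 2: the barycenter is the weighted geometric mean of
        # the current row marginals; then all row marginals are set to it.
        q = [math.exp(sum(w[l] * math.log(u[l][i] * Kv[l][i]) for l in range(m)))
             for i in range(n)]
        u = [[q[i] / Kv[l][i] for i in range(n)] for l in range(m)]
    return q


if __name__ == "__main__":
    n = 5
    cost = [[((i - j) / (n - 1)) ** 2 for j in range(n)] for i in range(n)]
    left = [1.0, 0.0, 0.0, 0.0, 0.0]
    right = [0.0, 0.0, 0.0, 0.0, 1.0]
    print(ibp_barycenter([left, right], cost, gamma=0.05))
```

Each iteration costs on the order of $mn^2$ operations because of the matrix-vector products with the kernel, which is where the $mn^2$ factor in the complexity bounds comes from.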
