Source author record

Yuyuan Ouyang

Yuyuan Ouyang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.OC math.ST Statistics Theory

Catalog footprint

What is connected

8 works

3 topics

4 close collaborators

preprint, 2021, arXiv

Graph topology invariant gradient and sampling complexity for decentralized and stochastic optimization

One fundamental problem in decentralized multi-agent optimization is the trade-off between gradient/sampling complexity and communication complexity. We propose new algorithms whose gradient and sampling complexities are graph topology invariant while their communication complexities remain optimal. For convex smooth deterministic problems, we propose a primal dual sliding (PDS) algorithm that computes an $ε$-solution with $O((\tilde{L}/ε)^{1/2})$ gradient and $O((\tilde{L}/ε)^{1/2}+\|\mathcal{A}\|/ε)$ communication complexities, where $\tilde{L}$ is the smoothness parameter of the objective and $\mathcal{A}$ is related to either the graph Laplacian or the transpose of the oriented incidence matrix of the communication network. The results can be improved to $O((\tilde{L}/μ)^{1/2}\log(1/ε))$ and $O((\tilde{L}/μ)^{1/2}\log(1/ε) + \|\mathcal{A}\|/ε^{1/2})$ respectively with $μ$-strong convexity. We also propose a stochastic variant, the stochastic primal dual sliding (SPDS) algorithm, for problems with stochastic gradients. The SPDS algorithm utilizes the mini-batch technique and enables the agents to perform sampling and communication simultaneously. It computes a stochastic $ε$-solution with $O((\tilde{L}/ε)^{1/2} + (σ/ε)^2)$ sampling complexity, which can be improved to $O((\tilde{L}/μ)^{1/2}\log(1/ε) + σ^2/ε)$ with strong convexity. Here $σ^2$ is the variance. The communication complexities of SPDS remain the same as those of the deterministic case. All the aforementioned gradient and sampling complexities match the lower complexity bounds for centralized convex smooth optimization and are independent of the network structure. To the best of our knowledge, these gradient and sampling complexities have not been obtained before for decentralized optimization over a constrained feasible set.

preprint, 2020, arXiv

An Asymptotic Result of Conditional Logistic Regression Estimator

In cluster-specific studies, ordinary logistic regression and conditional logistic regression for binary outcomes provide the maximum likelihood estimator (MLE) and the conditional maximum likelihood estimator (CMLE), respectively. In this paper, we show that the CMLE approaches the MLE asymptotically when each individual data point is replicated infinitely many times. Our theoretical derivation is based on the observation that a term appearing in the conditional average log-likelihood function is the coefficient of a polynomial, and hence can be transformed to a complex integral by Cauchy's differentiation formula. The asymptotic analysis of the complex integral can then be performed using the classical method of steepest descent. Our result implies that the CMLE can be biased if individual weights are multiplied by a constant, and that we should be cautious when assigning weights to cluster-specific studies.
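The polynomial-coefficient observation in this abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes the term in question is the coefficient of $z^k$ in $\prod_i (1 + w_i z)$ (an elementary symmetric polynomial of hypothetical weights $w_i$), and it evaluates Cauchy's formula by sampling the unit circle at equally spaced points. The function names are illustrative only.

```python
import cmath
from itertools import combinations
from math import prod


def coefficient_by_contour(weights, k):
    """Coefficient of z**k in prod(1 + w*z), via a discretized Cauchy integral."""
    n = len(weights)
    m = n + 1  # using more sample points than the degree makes the rule exact
    total = 0j
    for j in range(m):
        z = cmath.exp(2j * cmath.pi * j / m)  # sample point on the unit circle
        p = prod((1 + w * z for w in weights), start=1 + 0j)
        total += p / z**k
    return (total / m).real


def coefficient_by_enumeration(weights, k):
    """Same coefficient, summed directly over all size-k subsets (for checking)."""
    return sum(prod(subset) for subset in combinations(weights, k))


weights = [0.5, 1.5, 2.0]  # hypothetical weights, chosen only for the demo
for k in range(len(weights) + 1):
    print(k, coefficient_by_contour(weights, k), coefficient_by_enumeration(weights, k))
```

The contour form replaces a sum over exponentially many subsets with an integral, which is what makes an asymptotic treatment such as steepest descent applicable.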

preprint, 2020, arXiv

Backtracking linesearch for conditional gradient sliding

We present a modification of the conditional gradient sliding (CGS) method that was originally developed in \cite{lan2016conditional}. While the CGS method is a theoretical breakthrough in the theory of projection-free first-order methods, since it is the first to reach the theoretical performance limit, in implementation it requires knowledge of the Lipschitz constant $L$ of the gradient of the objective function and of the total number of gradient evaluations $N$. Such requirements impose difficulties in the actual implementation, not only because it can be difficult to choose proper values of $L$ and $N$ that satisfy the conditions for convergence, but also because conservative choices of $L$ and $N$ can deteriorate the practical numerical performance of the CGS method. Our proposed method, called the conditional gradient sliding method with linesearch (CGS-ls), does not require knowledge of either $L$ or $N$, and is able to terminate early, before the theoretically required number of iterations. While more practical in numerical implementation, the theoretical performance of our proposed CGS-ls method is still as good as that of the CGS method. We present numerical experiments to show the efficiency of our proposed method in practice.
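To make the role of the unknown Lipschitz constant $L$ concrete, here is a minimal sketch of a generic backtracking linesearch on $L$ for a plain gradient step. It is not the CGS-ls method from the abstract, which is projection-free and more involved; the test function, starting guess, and function names are assumptions made for illustration.

```python
def backtracking_gradient_step(f, grad, x, L):
    """Take one gradient step, doubling the estimate L until the standard
    quadratic upper bound f(y) <= f(x) + <g, y - x> + (L/2)||y - x||^2 holds."""
    g = grad(x)
    fx = f(x)
    while True:
        y = [xi - gi / L for xi, gi in zip(x, g)]
        d = [yi - xi for yi, xi in zip(y, x)]
        upper = fx + sum(gi * di for gi, di in zip(g, d)) \
            + 0.5 * L * sum(di * di for di in d)
        if f(y) <= upper + 1e-12:  # small tolerance for rounding error
            return y, L
        L *= 2.0  # estimate was too optimistic; be more conservative


def quadratic(x):
    """Illustrative objective 0.5*(x0^2 + 10*x1^2); its true L is 10."""
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def quadratic_grad(x):
    return [x[0], 10.0 * x[1]]


def minimize_with_backtracking(f, grad, x0, L0=0.1, iters=200):
    """Run repeated backtracking steps, carrying the estimate of L forward."""
    x, L = list(x0), L0
    for _ in range(iters):
        x, L = backtracking_gradient_step(f, grad, x, L)
    return x, L


x_final, L_final = minimize_with_backtracking(quadratic, quadratic_grad, [1.0, 1.0])
print(x_final, L_final)
```

The caller only supplies a rough initial guess `L0`; the loop discovers a workable value on its own, which is the practical benefit the abstract attributes to linesearch.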

preprint, 2016, arXiv

Accelerated gradient sliding for structured convex optimization

Our main goal in this paper is to show that one can skip gradient computations for gradient descent type methods applied to certain structured convex programming (CP) problems. To this end, we first present an accelerated gradient sliding (AGS) method for minimizing the summation of two smooth convex functions with different Lipschitz constants. We show that the AGS method can skip the gradient computation for one of these smooth components without slowing down the overall optimal rate of convergence. This result is much sharper than the classic black-box CP complexity results, especially when the difference between the two Lipschitz constants associated with these components is large. We then consider an important class of bilinear saddle point problems whose objective function is given by the summation of a smooth component and a nonsmooth one with a bilinear saddle point structure. Using the aforementioned AGS method for smooth composite optimization and Nesterov's smoothing technique, we show that one only needs ${\cal O}(1/\sqrt{ε})$ gradient computations for the smooth component while still preserving the optimal ${\cal O}(1/ε)$ overall iteration complexity for solving these saddle point problems. We demonstrate that even more significant savings on gradient computations can be obtained for strongly convex smooth and bilinear saddle point problems.

preprint, 2014, arXiv

Accelerated Schemes For A Class of Variational Inequalities

We propose a novel method, namely the accelerated mirror-prox (AMP) method, for computing the weak solutions of a class of deterministic and stochastic monotone variational inequalities (VI). The main idea of this algorithm is to incorporate a multi-step acceleration scheme into the mirror-prox method. For both deterministic and stochastic VIs, the developed AMP method computes the weak solutions with optimal rate of convergence. In particular, if the monotone operator in VI consists of the gradient of a smooth function, the rate of convergence of the AMP method can be accelerated in terms of its dependence on the Lipschitz constant of the smooth function. For VIs with bounded feasible sets, the estimate of the rate of convergence of the AMP method depends on the diameter of the feasible set. For unbounded VIs, we adopt the modified gap function introduced by Monteiro and Svaiter for solving monotone inclusion, and demonstrate that the rate of convergence of the AMP method depends on the distance from the initial point to the set of strong solutions.
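For readers unfamiliar with mirror-prox, the sketch below shows its simplest Euclidean, unconstrained form (the extragradient method) on a toy monotone operator. It does not include the multi-step acceleration that defines AMP, and the operator, step size, and function names are assumptions chosen for illustration.

```python
import math


def extragradient(F, z0, step, iters):
    """Euclidean mirror-prox (extragradient) for the VI with operator F,
    with no constraints: an extrapolation step followed by a correction step."""
    z = list(z0)
    for _ in range(iters):
        Fz = F(z)
        w = [zi - step * fi for zi, fi in zip(z, Fz)]  # look-ahead point
        Fw = F(w)
        z = [zi - step * fi for zi, fi in zip(z, Fw)]  # update using F at w
    return z


def bilinear_operator(z):
    """Operator of the toy saddle point problem min_x max_y x*y: F(x, y) = (y, -x).
    It is monotone, and its unique solution is (0, 0)."""
    x, y = z
    return [y, -x]


def simple_forward_steps(F, z0, step, iters):
    """Plain forward (gradient descent-ascent) steps, shown for contrast."""
    z = list(z0)
    for _ in range(iters):
        z = [zi - step * fi for zi, fi in zip(z, F(z))]
    return z


z_eg = extragradient(bilinear_operator, [1.0, 1.0], step=0.5, iters=200)
z_fw = simple_forward_steps(bilinear_operator, [1.0, 1.0], step=0.5, iters=200)
print(math.hypot(*z_eg), math.hypot(*z_fw))
```

On this example the plain forward iteration spirals away from the solution while the extragradient iterates contract toward it, which is why mirror-prox type schemes are a natural starting point for monotone VIs.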

preprint, 2014, arXiv

An Accelerated Linearized Alternating Direction Method of Multipliers

We present a novel framework, namely AADMM, for acceleration of the linearized alternating direction method of multipliers (ADMM). The basic idea of AADMM is to incorporate a multi-step acceleration scheme into linearized ADMM. We demonstrate that for solving a class of convex composite optimization problems with linear constraints, the rate of convergence of AADMM is better than that of linearized ADMM, in terms of their dependence on the Lipschitz constant of the smooth component. Moreover, AADMM is capable of dealing with the situation when the feasible region is unbounded, as long as the corresponding saddle point problem has a solution. A backtracking algorithm is also proposed for practical performance.

preprint, 2014, arXiv

Fast Bundle-Level Type Methods for unconstrained and ball-constrained convex optimization

It has been shown in \cite{Lan13-1} that the accelerated prox-level (APL) method and its variant, the uniform smoothing level (USL) method, have optimal iteration complexity for solving black-box and structured convex programming problems without requiring the input of any smoothness information. However, these algorithms require the feasible set to be bounded, and their efficiency relies on the solutions of two involved subproblems. These requirements hinder the applicability of these algorithms in solving large-scale and unconstrained optimization problems. In this paper, we first present a generic algorithmic framework to extend these uniformly optimal level methods for solving unconstrained problems. Moreover, we introduce two new variants of level methods, i.e., the fast APL (FAPL) method and the fast USL (FUSL) method, for solving large-scale black-box and structured convex programming problems respectively. Both FAPL and FUSL enjoy the same optimal iteration complexity as APL and USL, while the number of subproblems in each iteration is reduced from two to one. Moreover, we present an exact method to solve the only subproblem for these algorithms. As a result, the proposed FAPL and FUSL methods significantly improve the practical performance of APL and USL in terms of both computational time and solution quality. Our numerical results on solving some large-scale least squares problems and total variation based image reconstruction show great advantages of these new bundle-level type methods over APL, USL, and some other state-of-the-art first-order methods.

preprint, 2013, arXiv

Optimal Primal-Dual Methods for a Class of Saddle Point Problems

We present a novel accelerated primal-dual (APD) method for solving a class of deterministic and stochastic saddle point problems (SPP). The basic idea of this algorithm is to incorporate a multi-step acceleration scheme into the primal-dual method without smoothing the objective function. For deterministic SPP, the APD method achieves the same optimal rate of convergence as Nesterov's smoothing technique. Our stochastic APD method exhibits an optimal rate of convergence for stochastic SPP not only in terms of its dependence on the number of iterations, but also on a variety of problem parameters. To the best of our knowledge, this is the first time that such an optimal algorithm has been developed for stochastic SPP in the literature. Furthermore, for both deterministic and stochastic SPP, the developed APD algorithms can deal with the situation when the feasible region is unbounded, as long as a saddle point exists. In the unbounded case, we incorporate the modified termination criterion introduced by Monteiro and Svaiter for solving SPP posed as monotone inclusion, and demonstrate that the rate of convergence of the APD method depends on the distance from the initial point to the set of optimal solutions.
