Source author record

Pablo Parrilo

Pablo Parrilo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

math.OC, math.AC, math.AG, Symbolic Computation, Systems and Control

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators


preprint, 2022, arXiv

Convergence Rate of Incremental Gradient and Newton Methods

The incremental gradient method is a prominent algorithm for minimizing a finite sum of smooth convex functions, used in many contexts including large-scale data processing applications and distributed optimization over networks. It is a first-order method that processes the functions one at a time based on their gradient information. The incremental Newton method, on the other hand, is a second-order variant that additionally exploits the curvature information of the underlying functions and can therefore be faster. In this paper, we focus on the case when the objective function is strongly convex and present fast convergence results for the incremental gradient (IG) and incremental Newton methods under constant and diminishing stepsizes. For a decaying stepsize rule $\alpha_k = \Theta(1/k^s)$ with $s \in (0,1]$, we show that the distance of the IG iterates to the optimal solution converges at rate ${\cal O}(1/k^{s})$ (which translates into an ${\cal O}(1/k^{2s})$ rate in the suboptimality of the objective value). For $s>1/2$, this improves the previous ${\cal O}(1/\sqrt{k})$ results in distances obtained for the case when functions are non-smooth. We show that to achieve the fastest ${\cal O}(1/k)$ rate, incremental gradient needs a stepsize that requires tuning to the strong convexity parameter, whereas the incremental Newton method does not. The results are based on viewing the incremental gradient method as a gradient descent method with gradient errors, devising efficient upper bounds for the gradient error to derive inequalities that relate distances of the consecutive iterates to the optimal solution, and finally applying Chung's lemmas from the stochastic approximation literature to these inequalities to determine their asymptotic behavior. In addition, we construct examples to show tightness of our rate results.
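As a rough illustration of the method this abstract describes, the sketch below runs an incremental gradient pass over scalar quadratics $f_i(x) = \frac{1}{2}(x - a_i)^2$, whose sum is minimized at the mean of the $a_i$. The function name `incremental_gradient`, the toy objective, and the parameters `R` and `s` (stepsize $\alpha_k = R/k^s$, held fixed within cycle $k$) are assumptions made for this example and are not taken from the paper.

```python
def incremental_gradient(targets, x0=0.0, cycles=2000, R=1.0, s=1.0):
    """Hypothetical toy IG solver for f(x) = sum_i 0.5 * (x - a_i)**2.

    The components are visited one at a time in a fixed cyclic order,
    using the diminishing stepsize alpha_k = R / k**s in cycle k.
    """
    x = x0
    for k in range(1, cycles + 1):
        alpha = R / k ** s
        for a in targets:
            # The gradient of the single component 0.5 * (x - a)**2 is (x - a).
            x -= alpha * (x - a)
    return x


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 6.0]
    x_final = incremental_gradient(data)
    print("IG iterate:", x_final, "minimizer:", sum(data) / len(data))
```

Running it with different values of `s` gives a feel for how the stepsize decay affects the distance to the minimizer, though a toy run of this kind does not establish any rate.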

preprint, 2022, arXiv

Why Random Reshuffling Beats Stochastic Gradient Descent

We analyze the convergence rate of the random reshuffling (RR) method, which is a randomized first-order incremental algorithm for minimizing a finite sum of convex component functions. RR proceeds in cycles, picking a uniformly random order (permutation) and processing the component functions one at a time according to this order, i.e., at each cycle, each component function is sampled without replacement from the collection. Though RR has been numerically observed to outperform its with-replacement counterpart stochastic gradient descent (SGD), characterization of its convergence rate has been a long-standing open question. In this paper, we answer this question by showing that when the component functions are quadratics or smooth and the sum function is strongly convex, RR with iterate averaging and a diminishing stepsize $\alpha_k = \Theta(1/k^s)$ for $s \in (1/2,1)$ converges at rate $\Theta(1/k^{2s})$ with probability one in the suboptimality of the objective value, thus improving upon the $\Omega(1/k)$ rate of SGD. Our analysis draws on the theory of Polyak-Ruppert averaging and relies on decoupling the dependent cycle gradient error into an independent term over cycles and another term dominated by $\alpha_k^2$. This allows us to apply the law of large numbers to an appropriately weighted version of the cycle gradient errors, where the weights depend on the stepsize. We also provide high-probability convergence rate estimates that show the decay rate of the different terms and allow us to propose a modification of RR with convergence rate ${\cal O}(1/k^2)$.
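The cycle structure described in this abstract can be sketched as follows: each cycle draws a fresh permutation, processes every component once, and the end-of-cycle iterates are averaged. The toy objective $\sum_i \frac{1}{2}(x - a_i)^2$, the function name `random_reshuffling`, the fixed seed, and the plain uniform average over cycles are all assumptions of this sketch, not details taken from the paper.

```python
import random


def random_reshuffling(targets, cycles=2000, s=0.75, seed=0):
    """Hypothetical toy RR solver for f(x) = sum_i 0.5 * (x - a_i)**2.

    Each cycle samples the components without replacement (a fresh shuffle),
    uses the stepsize alpha_k = 1 / k**s, and the function returns the
    average of the iterates recorded at the end of every cycle.
    """
    rng = random.Random(seed)   # seeded so that runs are reproducible
    order = list(targets)
    x = 0.0
    running_sum = 0.0
    for k in range(1, cycles + 1):
        alpha = 1.0 / k ** s
        rng.shuffle(order)      # new uniformly random permutation per cycle
        for a in order:
            x -= alpha * (x - a)
        running_sum += x
    return running_sum / cycles


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 6.0]
    print("averaged RR iterate:", random_reshuffling(data))
    print("minimizer:", sum(data) / len(data))
```

Replacing `rng.shuffle(order)` with independent draws such as `rng.choice(targets)` would give a with-replacement (SGD-style) variant for a side-by-side comparison.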

preprint, 2016, arXiv

Exploiting chordal structure in polynomial ideals: a Gröbner bases approach

Chordal structure and bounded treewidth allow for efficient computation in numerical linear algebra, graphical models, constraint satisfaction and many other areas. In this paper, we begin the study of how to exploit chordal structure in computational algebraic geometry, and in particular, for solving polynomial systems. The structure of a system of polynomial equations can be described in terms of a graph. By carefully exploiting the properties of this graph (in particular, its chordal completions), more efficient algorithms can be developed. To this end, we develop a new technique, which we refer to as chordal elimination, that relies on elimination theory and Gröbner bases. By maintaining graph structure throughout the process, chordal elimination can outperform standard Gröbner basis algorithms in many cases. The reason is that all computations are done on "smaller" rings, of size equal to the treewidth of the graph. In particular, for a restricted class of ideals, the computational complexity is linear in the number of variables. Chordal structure arises in many relevant applications. We demonstrate the suitability of our methods in examples from graph colorings, cryptography, sensor localization and differential equations.
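To make the notion of a chordal completion concrete, the sketch below plays the standard graph "elimination game": vertices are eliminated in a given order, and the later neighbours of each eliminated vertex are joined into a clique. The added (fill) edges turn the graph into a chordal one, and the largest number of later neighbours gives the width of that ordering, an upper bound on the treewidth. This covers only the graph-theoretic side; it is not the chordal elimination algorithm on polynomial ideals, and the function name `chordal_completion` is hypothetical.

```python
def chordal_completion(n, edges, order):
    """Return (fill_edges, width) for eliminating vertices 0..n-1 in `order`.

    fill_edges: set of edges (a, b) with a < b added during elimination.
    width: maximum number of later neighbours of any vertex, which bounds
           the treewidth of the graph from above.
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    position = {v: i for i, v in enumerate(order)}
    fill = set()
    width = 0
    for v in order:
        # Neighbours of v that are eliminated after v.
        later = sorted(u for u in adj[v] if position[u] > position[v])
        width = max(width, len(later))
        # Make the later neighbours pairwise adjacent.
        for i, a in enumerate(later):
            for b in later[i + 1:]:
                if b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill.add((a, b))
    return fill, width


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(chordal_completion(4, cycle, [0, 1, 2, 3]))
```

A 4-cycle needs one chord to become chordal, while a path needs none, which the two graphs in the usage example and the test below exercise.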

preprint, 2014, arXiv

A globally convergent incremental Newton method

Motivated by machine learning problems over large data sets and distributed optimization over networks, we develop and analyze a new method, called the incremental Newton method, for minimizing the sum of a large number of strongly convex functions. We show that our method is globally convergent for a variable stepsize rule. We further show that under a gradient growth condition, the convergence rate is linear for both variable and constant stepsize rules. By means of an example, we show that without the gradient growth condition, the incremental Newton method cannot achieve linear convergence. Our analysis can be extended to study other incremental methods: in particular, we obtain a linear convergence rate result for the incremental Gauss-Newton algorithm under a variable stepsize rule.
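The following is a minimal scalar sketch of an incremental Newton-type update, assuming the curvature estimate is the running sum of the component Hessians seen so far. The abstract does not spell out the update rule, so this accumulation scheme, the name `incremental_newton`, and the toy components $f_i(x) = \frac{1}{2} w_i (x - a_i)^2$ are assumptions of the example.

```python
def incremental_newton(weights, targets, x0=0.0, cycles=5, alpha=1.0):
    """Hypothetical scalar sketch for f(x) = sum_i 0.5 * w_i * (x - a_i)**2.

    H accumulates the component Hessians (here simply w_i > 0) across all
    cycles, and each step scales the component gradient by alpha / H.
    """
    x = x0
    H = 0.0
    for _ in range(cycles):
        for w, a in zip(weights, targets):
            H += w                        # add the Hessian of component i
            grad = w * (x - a)            # gradient of component i at x
            x -= alpha * grad / H
    return x


if __name__ == "__main__":
    w = [1.0, 2.0, 3.0]
    a = [1.0, 2.0, 4.0]
    x_star = sum(wi * ai for wi, ai in zip(w, a)) / sum(w)
    print("incremental Newton iterate:", incremental_newton(w, a))
    print("minimizer:", x_star)
```

For these quadratic components with `alpha=1.0`, the recursion satisfies $H_n x_n = H_{n-1} x_{n-1} + w_n a_n$, which is the recursive least-squares update, so the iterate at the end of every full cycle equals the weighted mean of the $a_i$ up to rounding. Non-quadratic components would need their own gradient and Hessian evaluations.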

preprint, 2011, arXiv

An Optimal Controller Architecture for Poset-Causal Systems

We propose a novel and natural architecture for decentralized control that is applicable whenever the underlying system has the structure of a partially ordered set (poset). This controller architecture is based on the concept of Möbius inversion for posets, and enjoys simple and appealing separation properties, since the closed-loop dynamics can be analyzed in terms of decoupled subsystems. The controller structure provides rich and interesting connections between concepts from order theory such as Möbius inversion and control-theoretic concepts such as state prediction, correction, and separability. In addition, using our earlier results on $H_2$-optimal decentralized control for arbitrary posets, we prove that the $H_2$-optimal controller in fact possesses the proposed structure, thereby establishing the optimality of the new controller architecture.
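For readers unfamiliar with Möbius inversion on posets, the sketch below computes the Möbius function of a finite poset from its order relation and checks that it undoes accumulation over lower sets: if $g(y) = \sum_{x \le y} f(x)$, then $f(y) = \sum_{x \le y} \mu(x, y)\, g(x)$. It illustrates only the order-theoretic tool, not the controller architecture. The helper names `make_mobius`, `zeta_transform`, and `mobius_invert` are hypothetical, and the divisibility poset is chosen purely as an example.

```python
from functools import lru_cache


def make_mobius(elements, leq):
    """Return mu(x, y) for the finite poset (elements, leq)."""
    elements = list(elements)

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        # mu(x, y) = -sum of mu(x, z) over all z with x <= z < y.
        return -sum(mu(x, z) for z in elements
                    if z != y and leq(x, z) and leq(z, y))

    return mu


def zeta_transform(f, elements, leq):
    """g(y) = sum of f(x) over all x <= y."""
    return {y: sum(f[x] for x in elements if leq(x, y)) for y in elements}


def mobius_invert(g, elements, leq):
    """Recover f from g via f(y) = sum over x <= y of mu(x, y) * g(x)."""
    mu = make_mobius(elements, leq)
    return {y: sum(mu(x, y) * g[x] for x in elements if leq(x, y))
            for y in elements}


if __name__ == "__main__":
    divisors = [1, 2, 3, 4, 6, 12]

    def divides(a, b):
        return b % a == 0

    f = {d: d * d for d in divisors}
    g = zeta_transform(f, divisors, divides)
    print(mobius_invert(g, divisors, divides) == f)
```

On the divisibility order, $\mu(1, n)$ coincides with the classical number-theoretic Möbius function, which makes this poset a convenient sanity check.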