Source author record

Andreas Themelis

Andreas Themelis appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.OC

Catalog footprint

What is connected

4works

1topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2021arXiv

Neural Network Training as an Optimal Control Problem: An Augmented Lagrangian Approach

Training of neural networks amounts to nonconvex optimization problems that are typically solved by using backpropagation and (variants of) stochastic gradient descent. In this work we propose an alternative approach by viewing the training task as a nonlinear optimal control problem. Under this lens, backpropagation amounts to the sequential approach (single shooting) to optimal control, where the states variables have been eliminated. It is well known that single shooting may lead to ill conditioning, and for this reason the simultaneous approach (multiple shooting) is typically preferred. Motivated by this hypothesis, an augmented Lagrangian algorithm is developed that only requires an approximate solution to the Lagrangian subproblems up to a user-defined accuracy. By applying this framework to the training of neural networks, it is shown that the inner Lagrangian subproblems are amenable to be solved using Gauss-Newton iterations. To fully exploit the structure of neural networks, the resulting linear least squares problems are addressed by employing an approach based on forward dynamic programming. Finally, the effectiveness of our method is showcased on regression datasets.

preprint2019arXiv

QPALM: A Newton-type Proximal Augmented Lagrangian Method for Quadratic Programs

We present a proximal augmented Lagrangian based solver for general convex quadratic programs (QPs), relying on semismooth Newton iterations with exact line search to solve the inner subproblems. The exact line search reduces in this case to finding the zero of a one-dimensional monotone, piecewise affine function and can be carried out very efficiently. Our algorithm requires the solution of a linear system at every iteration, but as the matrix to be factorized depends on the active constraints, efficient sparse factorization updates can be employed like in active-set methods. Both primal and dual residuals can be enforced down to strict tolerances and otherwise infeasibility can be detected from intermediate iterates. A C implementation of the proposed algorithm is tested and benchmarked against other state-of-the-art QP solvers for a large variety of problem data and shown to compare favorably against these solvers.

preprint2018arXiv

Douglas-Rachford splitting and ADMM for nonconvex optimization: tight convergence results

Although originally designed and analyzed for convex problems, the alternating direction method of multipliers (ADMM) and its close relatives, Douglas-Rachford splitting (DRS) and Peaceman-Rachford splitting (PRS), have been observed to perform remarkably well when applied to certain classes of structured nonconvex optimization problems. However, partial global convergence results in the nonconvex setting have only recently emerged. In this paper we show how the Douglas-Rachford envelope (DRE), introduced in 2014, can be employed to unify and considerably simplify the theory for devising global convergence guarantees for ADMM, DRS and PRS applied to nonconvex problems under less restrictive conditions, larger prox-stepsizes and over-relaxation parameters than previously known. In fact, our bounds are tight whenever the over-relaxation parameter ranges in $(0,2]$. The analysis of ADMM uses a universal primal equivalence with DRS that generalizes the known duality of the algorithms.

preprint2018arXiv

SuperMann: a superlinearly convergent algorithm for finding fixed points of nonexpansive operators

Operator splitting techniques have recently gained popularity in convex optimization problems arising in various control fields. Being fixed-point iterations of nonexpansive operators, such methods suffer many well known downsides, which include high sensitivity to ill conditioning and parameter selection, and consequent low accuracy and robustness. As universal solution we propose SuperMann, a Newton-type algorithm for finding fixed points of nonexpansive operators. It generalizes the classical Krasnosel'skii-Mann scheme, enjoys its favorable global convergence properties and requires exactly the same oracle. It is based on a novel separating hyperplane projection tailored for nonexpansive mappings which makes it possible to include steps along any direction. In particular, when the directions satisfy a Dennis-Moré condition we show that SuperMann converges superlinearly under mild assumptions, which, surprisingly, do not entail nonsingularity of the Jacobian at the solution but merely metric subregularity. As a result, SuperMann enhances and robustifies all operator splitting schemes for structured convex optimization, overcoming their well known sensitivity to ill conditioning.

Andreas Themelis

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

Neural Network Training as an Optimal Control Problem: An Augmented Lagrangian Approach

QPALM: A Newton-type Proximal Augmented Lagrangian Method for Quadratic Programs

Douglas-Rachford splitting and ADMM for nonconvex optimization: tight convergence results

SuperMann: a superlinearly convergent algorithm for finding fixed points of nonexpansive operators