Researcher profile

Andreas Themelis

Andreas Themelis contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified | Verification L1 | Unclaimed author
4 works
0 followers
1 topic
4 close collaborators

Published work

4 published items

preprint | 2021 | arXiv

Neural Network Training as an Optimal Control Problem: An Augmented Lagrangian Approach

Training of neural networks amounts to nonconvex optimization problems that are typically solved by using backpropagation and (variants of) stochastic gradient descent. In this work we propose an alternative approach by viewing the training task as a nonlinear optimal control problem. Under this lens, backpropagation amounts to the sequential approach (single shooting) to optimal control, where the state variables have been eliminated. It is well known that single shooting may lead to ill conditioning, and for this reason the simultaneous approach (multiple shooting) is typically preferred. Motivated by this hypothesis, an augmented Lagrangian algorithm is developed that only requires an approximate solution to the Lagrangian subproblems up to a user-defined accuracy. By applying this framework to the training of neural networks, it is shown that the inner Lagrangian subproblems can be solved using Gauss-Newton iterations. To fully exploit the structure of neural networks, the resulting linear least squares problems are addressed by employing an approach based on forward dynamic programming. Finally, the effectiveness of our method is showcased on regression datasets.
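To make the augmented Lagrangian idea concrete, here is a minimal sketch on a toy equality-constrained problem, not the paper's training algorithm. The function name, the penalty value `rho`, and the closed-form inner solve are assumptions chosen for illustration; the paper instead solves the inner subproblems approximately with Gauss-Newton iterations.

```python
def augmented_lagrangian(rho=10.0, tol=1e-10, max_outer=100):
    """Toy problem (hypothetical): minimize 0.5*(x1^2 + x2^2) s.t. x1 + x2 = 1."""
    y = 0.0            # Lagrange multiplier estimate
    x1 = x2 = 0.0
    for _ in range(max_outer):
        # Inner subproblem: minimize f(x) + y*h(x) + (rho/2)*h(x)^2 over x.
        # For this quadratic toy problem the minimizer has a closed form.
        x1 = x2 = (rho - y) / (1.0 + 2.0 * rho)
        h = x1 + x2 - 1.0          # constraint violation
        if abs(h) <= tol:
            break
        y += rho * h               # first-order multiplier update
    return (x1, x2), y


(x1, x2), y = augmented_lagrangian()
print(x1, x2, y)
```

The iterates approach x1 = x2 = 0.5 with multiplier -0.5, the solution of the toy problem.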

preprint | 2019 | arXiv

QPALM: A Newton-type Proximal Augmented Lagrangian Method for Quadratic Programs

We present a proximal augmented Lagrangian-based solver for general convex quadratic programs (QPs), relying on semismooth Newton iterations with exact line search to solve the inner subproblems. The exact line search reduces in this case to finding the zero of a one-dimensional monotone, piecewise affine function and can be carried out very efficiently. Our algorithm requires the solution of a linear system at every iteration, but as the matrix to be factorized depends on the active constraints, efficient sparse factorization updates can be employed as in active-set methods. Both primal and dual residuals can be enforced down to strict tolerances, and otherwise infeasibility can be detected from intermediate iterates. A C implementation of the proposed algorithm is tested and benchmarked against other state-of-the-art QP solvers for a large variety of problem data and shown to compare favorably against these solvers.
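The line-search step mentioned in the abstract, finding the zero of a monotone piecewise affine function of one variable, can be sketched as follows. This is an illustrative routine under stated assumptions, not QPALM's actual implementation: the function name and the representation phi(t) = slope0*t + intercept0 + sum_i w_i*max(0, t - b_i), with slope0 > 0 and w_i >= 0, are hypothetical.

```python
def piecewise_affine_zero(slope0, intercept0, kinks):
    """Zero of phi(t) = slope0*t + intercept0 + sum(w * max(0, t - b)).

    kinks is a list of (b, w) pairs with w >= 0; slope0 must be > 0,
    so phi is strictly increasing and has exactly one zero.
    """
    slope, intercept = slope0, intercept0
    for b, w in sorted(kinks):
        t = -intercept / slope     # zero of the current affine piece
        if t <= b:                 # zero lies before the next breakpoint
            return t
        slope += w                 # beyond b the term w*(t - b) is active
        intercept -= w * b
    return -intercept / slope      # zero lies on the last piece


# phi(t) = t - 3 + max(0, t - 1) + 2*max(0, t - 2)
print(piecewise_affine_zero(1.0, -3.0, [(1.0, 1.0), (2.0, 2.0)]))
```

Sorting the breakpoints dominates the cost, so the search takes O(n log n) time for n breakpoints.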

preprint | 2018 | arXiv

Douglas-Rachford splitting and ADMM for nonconvex optimization: tight convergence results

Although originally designed and analyzed for convex problems, the alternating direction method of multipliers (ADMM) and its close relatives, Douglas-Rachford splitting (DRS) and Peaceman-Rachford splitting (PRS), have been observed to perform remarkably well when applied to certain classes of structured nonconvex optimization problems. However, partial global convergence results in the nonconvex setting have only recently emerged. In this paper we show how the Douglas-Rachford envelope (DRE), introduced in 2014, can be employed to unify and considerably simplify the theory for devising global convergence guarantees for ADMM, DRS and PRS applied to nonconvex problems under less restrictive conditions, larger prox-stepsizes and over-relaxation parameters than previously known. In fact, our bounds are tight whenever the over-relaxation parameter ranges in $(0,2]$. The analysis of ADMM uses a universal primal equivalence with DRS that generalizes the known duality of the algorithms.
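As a rough illustration of the relaxed Douglas-Rachford iteration, here is a minimal sketch on a convex one-dimensional toy problem: minimize 0.5*(x - a)^2 over the interval [lo, hi]. The paper's subject is the nonconvex setting, which this example does not cover. The function name, the problem, and the parameter names `gamma` (prox-stepsize) and `lam` (relaxation) are assumptions for illustration.

```python
def douglas_rachford(a, lo, hi, gamma=1.0, lam=1.0, iters=200):
    """Relaxed DRS for f(x) = 0.5*(x - a)^2 and g = indicator of [lo, hi]."""
    s = 0.0
    u = v = 0.0
    for _ in range(iters):
        u = (s + gamma * a) / (1.0 + gamma)   # prox of gamma*f at s
        v = min(max(2.0 * u - s, lo), hi)     # prox of gamma*g: projection onto [lo, hi]
        s += lam * (v - u)                    # relaxed update; lam = 1 is plain DRS
    return u, v


print(douglas_rachford(a=3.0, lo=0.0, hi=1.0))
```

For a = 3 and the interval [0, 1], both u and v approach the constrained minimizer x = 1.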

preprint | 2018 | arXiv

SuperMann: a superlinearly convergent algorithm for finding fixed points of nonexpansive operators

Operator splitting techniques have recently gained popularity in convex optimization problems arising in various control fields. Being fixed-point iterations of nonexpansive operators, such methods suffer from many well-known downsides, including high sensitivity to ill conditioning and parameter selection, and consequently low accuracy and robustness. As a universal solution we propose SuperMann, a Newton-type algorithm for finding fixed points of nonexpansive operators. It generalizes the classical Krasnosel'skii-Mann scheme, enjoys its favorable global convergence properties and requires exactly the same oracle. It is based on a novel separating hyperplane projection tailored for nonexpansive mappings which makes it possible to include steps along any direction. In particular, when the directions satisfy a Dennis-Moré condition we show that SuperMann converges superlinearly under mild assumptions, which, surprisingly, do not entail nonsingularity of the Jacobian at the solution but merely metric subregularity. As a result, SuperMann enhances and robustifies all operator splitting schemes for structured convex optimization, overcoming their well-known sensitivity to ill conditioning.
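For context, the classical Krasnosel'skii-Mann scheme that SuperMann generalizes can be sketched in a few lines. This is the baseline averaged iteration only, not SuperMann itself; the function names, the averaging parameter `lam`, and the 90-degree rotation used as a nonexpansive test operator are assumptions for illustration.

```python
def km_iterate(T, x0, lam=0.5, tol=1e-10, max_iter=10_000):
    """Krasnosel'skii-Mann iteration: x <- (1 - lam)*x + lam*T(x)."""
    x = list(x0)
    for k in range(max_iter):
        Tx = T(x)
        # Fixed-point residual in the infinity norm
        residual = max(abs(a - b) for a, b in zip(Tx, x))
        if residual <= tol:
            return x, k
        x = [(1 - lam) * a + lam * b for a, b in zip(x, Tx)]
    return x, max_iter


def rotate90(v):
    """Rotation by 90 degrees: nonexpansive, with the origin as its only fixed point."""
    return [-v[1], v[0]]


x, iterations = km_iterate(rotate90, [1.0, 0.0])
print(x, iterations)
```

Iterating the rotation alone would cycle forever, while the averaged iteration contracts toward the fixed point at the origin.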