Researcher profile

Brian Swenson

Brian Swenson contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 15 (Unverified), Verification L1, Unclaimed author
3 works
0 followers
2 topics
4 close collaborators

Published work

3 published items

Preprint, 2022, arXiv

Distributed Stochastic Gradient Descent: Nonconvexity, Nonsmoothness, and Convergence to Local Minima

In centralized settings, it is well known that stochastic gradient descent (SGD) avoids saddle points and converges to local minima in nonconvex problems. However, similar guarantees are lacking for distributed first-order algorithms. The paper studies distributed stochastic gradient descent (D-SGD)--a simple network-based implementation of SGD. Conditions under which D-SGD avoids saddle points and converges to local minima are studied. First, we consider the problem of computing critical points. Assuming loss functions are nonconvex and possibly nonsmooth, it is shown that, for each fixed initialization, D-SGD converges to critical points of the loss with probability one. Next, we consider the problem of avoiding saddle points. In this case, we again assume that loss functions may be nonconvex and nonsmooth, but are smooth in a neighborhood of a saddle point. It is shown that, for any fixed initialization, D-SGD avoids such saddle points with probability one. Results are proved by studying the underlying (distributed) gradient flow, using the ordinary differential equation (ODE) method of stochastic approximation, and extending classical techniques from dynamical systems theory such as stable manifolds. Results are proved in the general context of subspace-constrained optimization, of which D-SGD is a special case.
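As a rough illustration of the kind of update the abstract describes, the following minimal sketch runs a consensus-plus-noisy-gradient iteration on a toy problem. It is not the paper's algorithm or experiments: the 4-agent ring, the scalar double-well local losses, the constant consensus weight `beta`, the step-size schedule, and the names `local_grad` and `dsgd` are all assumptions chosen for illustration.

```python
import random


def local_grad(x, offset):
    """Gradient of the hypothetical local loss f_i(x) = (x**2 - 1)**2 / 4 + offset * x."""
    return x * (x * x - 1.0) + offset


def dsgd(num_steps=20000, beta=0.25, noise_std=0.1, seed=0):
    """Run a toy D-SGD iteration on a 4-agent ring and return the final agent states."""
    rng = random.Random(seed)
    # Offsets sum to zero, so the summed loss is a double well with minima at x = +1 and x = -1
    # and an unstable critical point at x = 0 (all of this is an illustrative assumption).
    offsets = [0.3, -0.1, -0.2, 0.0]
    n = len(offsets)
    x = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    for k in range(1, num_steps + 1):
        alpha = (k + 10) ** -0.75  # decaying gradient step size (assumed schedule)
        new_x = []
        for i in range(n):
            left, right = x[(i - 1) % n], x[(i + 1) % n]
            # Consensus term: move toward the two ring neighbors.
            consensus = beta * ((left - x[i]) + (right - x[i]))
            # Stochastic gradient: true local gradient plus zero-mean noise.
            noisy_grad = local_grad(x[i], offsets[i]) + rng.gauss(0.0, noise_std)
            new_x.append(x[i] + consensus - alpha * noisy_grad)
        x = new_x  # synchronous update of all agents
    return x
```

Calling `dsgd()` returns one state per agent. In this toy setup the agents should end up in near agreement close to one of the two wells rather than at the unstable critical point at zero, which is the qualitative behavior the abstract's results concern.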

Preprint, 2020, arXiv

Distributed Gradient Flow: Nonsmoothness, Nonconvexity, and Saddle Point Evasion

The paper considers distributed gradient flow (DGF) for multi-agent nonconvex optimization. DGF is a continuous-time approximation of distributed gradient descent that is often easier to study than its discrete-time counterpart. The paper has two main contributions. First, the paper considers optimization of nonsmooth, nonconvex objective functions. It is shown that DGF converges to critical points in this setting. The paper then considers the problem of avoiding saddle points. It is shown that if agents' objective functions are assumed to be smooth and nonconvex, then DGF can only converge to a saddle point from a zero-measure set of initial conditions. To establish this result, the paper proves a stable manifold theorem for DGF, which is a fundamental contribution of independent interest. In a companion paper, analogous results are derived for discrete-time algorithms.
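For orientation, a continuous-time distributed gradient flow of this kind is often written in the following form. The notation is an assumption made here for illustration and the paper's exact formulation may differ: $x_n(t)$ is the state of agent $n$, $\Omega_n$ is its set of neighbors, $f_n$ is its local objective, and $\alpha_t$, $\beta_t$ are time-varying gradient and consensus weights.

```latex
\dot{x}_n(t) = \beta_t \sum_{\ell \in \Omega_n} \bigl( x_\ell(t) - x_n(t) \bigr)
             - \alpha_t \nabla f_n\bigl( x_n(t) \bigr),
\qquad n = 1, \dots, N .
```

The first term pulls each agent toward its neighbors and the second follows the local gradient. When $f_n$ is nonsmooth, the gradient would typically be replaced by an element of a generalized gradient, which turns the equation into a differential inclusion.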

Preprint, 2020, arXiv

Distributed Gradient Methods for Nonconvex Optimization: Local and Global Convergence Guarantees

The article discusses distributed gradient-descent algorithms for computing local and global minima in nonconvex optimization. For local optimization, we focus on distributed stochastic gradient descent (D-SGD)--a simple network-based variant of classical SGD. We discuss local minima convergence guarantees and explore the simple but critical role of the stable-manifold theorem in analyzing saddle-point avoidance. For global optimization, we discuss annealing-based methods in which slowly decaying noise is added to D-SGD. Conditions are discussed under which convergence to global minima is guaranteed. Numerical examples illustrate the key concepts in the paper.
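The annealing idea mentioned above, adding slowly decaying noise to a distributed gradient iteration, can be sketched as follows. This is a toy under stated assumptions, not the article's method or its numerical examples: the ring network, the tilted double-well losses, the schedules, and the names `annealing_noise_scale` and `annealed_dsgd` are hypothetical. A finite run like this one does not by itself guarantee that the global minimum is reached.

```python
import math
import random


def annealing_noise_scale(k, c=0.5, shift=20):
    """Slowly decaying amplitude of the injected noise (illustrative schedule, an assumption)."""
    t = k + shift
    return c / math.sqrt(t * math.log(math.log(t)))


def annealed_dsgd(num_steps=20000, beta=0.25, seed=1):
    """Toy annealed distributed gradient iteration on a 4-agent ring."""
    rng = random.Random(seed)
    # Hypothetical local losses f_i(x) = (x**2 - 1)**2 / 4 + tilt_i * x.
    # The tilts sum to a positive number, so the well near x = -1 is the lower one.
    tilts = [0.25, 0.05, 0.0, 0.1]
    n = len(tilts)
    x = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    for k in range(1, num_steps + 1):
        alpha = 1.0 / (k + 20)  # gradient step size (assumed schedule)
        gamma = annealing_noise_scale(k)  # decays more slowly than alpha
        new_x = []
        for i in range(n):
            left, right = x[(i - 1) % n], x[(i + 1) % n]
            consensus = beta * ((left - x[i]) + (right - x[i]))
            grad = x[i] * (x[i] * x[i] - 1.0) + tilts[i]
            # Annealing: inject zero-mean Gaussian noise with decaying amplitude gamma.
            new_x.append(x[i] + consensus - alpha * grad + gamma * rng.gauss(0.0, 1.0))
        x = new_x  # synchronous update of all agents
    return x
```

The design point to notice is the separation of time scales: the injected noise decays more slowly than the gradient step size, so early iterations can still cross between wells while later iterations settle.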