Source author record

Lindon Roberts

Lindon Roberts appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.OC Machine Learning

Catalog footprint

What is connected

5works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2023arXiv

Direct search based on probabilistic descent in reduced spaces

In this paper, we study a generic direct-search algorithm in which the polling directions are defined using random subspaces. Complexity guarantees for such an approach are derived thanks to probabilistic properties related to both the subspaces and the directions used within these subspaces. Our analysis crucially extends previous deterministic and probabilistic arguments by relaxing the need for directions to be deterministically bounded in norm. As a result, our approach encompasses a wide range of new optimal polling strategies that can be characterized using our subspace and direction properties. By leveraging results on random subspace embeddings and sketching matrices, we show that better complexity bounds are obtained for randomized instances of our framework. A numerical investigation confirms the benefit of randomization, particularly when done in subspaces, when solving problems of moderately large dimension.

preprint2022arXiv

A simplified convergence theory for Byzantine resilient stochastic gradient descent

In distributed learning, a central server trains a model according to updates provided by nodes holding local data samples. In the presence of one or more malicious servers sending incorrect information (a Byzantine adversary), standard algorithms for model training such as stochastic gradient descent (SGD) fail to converge. In this paper, we present a simplified convergence theory for the generic Byzantine Resilient SGD method originally proposed by Blanchard et al. [NeurIPS 2017]. Compared to the existing analysis, we shown convergence to a stationary point in expectation under standard assumptions on the (possibly nonconvex) objective function and flexible assumptions on the stochastic gradients.

preprint2022arXiv

Model-Based Derivative-Free Methods for Convex-Constrained Optimization

We present a model-based derivative-free method for optimization subject to general convex constraints, which we assume are unrelaxable and accessed only through a projection operator that is cheap to evaluate. We prove global convergence and a worst-case complexity of $O(ε^{-2})$ iterations and objective evaluations for nonconvex functions, matching results for the unconstrained case. We introduce new, weaker requirements on model accuracy compared to existing methods. As a result, sufficiently accurate interpolation models can be constructed only using feasible points. We develop a comprehensive theory of interpolation set management in this regime for linear and composite linear models. We implement our approach for nonlinear least-squares problems and demonstrate strong practical performance compared to general-purpose solvers.

preprint2021arXiv

Scalable Subspace Methods for Derivative-Free Nonlinear Least-Squares Optimization

We introduce a general framework for large-scale model-based derivative-free optimization based on iterative minimization within random subspaces. We present a probabilistic worst-case complexity analysis for our method, where in particular we prove high-probability bounds on the number of iterations before a given optimality is achieved. This framework is specialized to nonlinear least-squares problems, with a model-based framework based on the Gauss-Newton method. This method achieves scalability by constructing local linear interpolation models to approximate the Jacobian, and computes new steps at each iteration in a subspace with user-determined dimension. We then describe a practical implementation of this framework, which we call DFBGN. We outline efficient techniques for selecting the interpolation points and search subspace, yielding an implementation that has a low per-iteration linear algebra cost (linear in the problem dimension) while also achieving fast objective decrease as measured by evaluations. Extensive numerical results demonstrate that DFBGN has improved scalability, yielding strong performance on large-scale nonlinear least-squares problems.

preprint2020arXiv

Scalable Derivative-Free Optimization for Nonlinear Least-Squares Problems

Derivative-free - or zeroth-order - optimization (DFO) has gained recent attention for its ability to solve problems in a variety of application areas, including machine learning, particularly involving objectives which are stochastic and/or expensive to compute. In this work, we develop a novel model-based DFO method for solving nonlinear least-squares problems. We improve on state-of-the-art DFO by performing dimensionality reduction in the observational space using sketching methods, avoiding the construction of a full local model. Our approach has a per-iteration computational cost which is linear in problem dimension in a big data regime, and numerical evidence demonstrates that, compared to existing software, it has dramatically improved runtime performance on overdetermined least-squares problems.