Source author record

Benjamin Grimmer

Benjamin Grimmer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.OC Machine Learning

Catalog footprint

What is connected

4works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

A Universally Optimal Primal-Dual Method for Minimizing Heterogeneous Compositions

This paper proposes a universal algorithm for convex minimization problems of the composite form $g_0(x)+h(g_1(x),\dots, g_m(x)) + u(x)$. We allow each $g_j$ to independently range from being nonsmooth Lipschitz to smooth, from convex to strongly convex, described by notions of Hölder continuous gradients and uniform convexity. Note that, although the objective is built from a heterogeneous combination of such structured components, it does not necessarily possess smoothness, Lipschitzness, or any favorable structure overall other than convexity. Regardless, we provide a universal optimal method in terms of oracle access to (sub)gradients of each $g_j$. The key insight enabling our optimal universal analysis and a core technical contribution is the construction of two new constants, the Approximate Dualized Aggregate smoothness and strong convexity, which combine the benefits of each heterogeneous structure into single quantities amenable to analysis. As a key application, fixing $h$ as the nonpositive indicator function, this model readily captures functionally constrained minimization $g_0(x)+u(x)$ subject to $g_j(x)\leq 0$. In particular, our algorithm and analysis are directly inspired by the smooth constrained minimization method of Zhang and Lan and consequently recover and generalize their accelerated guarantees.

preprint2023arXiv

General Holder Smooth Convergence Rates Follow From Specialized Rates Assuming Growth Bounds

Often in the analysis of first-order methods for both smooth and nonsmooth optimization, assuming the existence of a growth/error bound or KL condition facilitates much stronger convergence analysis. Hence separate analysis is typically needed for the general case and for the growth bounded cases. We give meta-theorems for deriving general convergence rates from those assuming a growth lower bound. Applying this simple but conceptually powerful tool to the proximal point, subgradient, bundle, dual averaging, gradient descent, Frank-Wolfe, and universal accelerated methods immediately recovers their known convergence rates for general convex optimization problems from their specialized rates. New convergence results follow for bundle methods, dual averaging, and Frank-Wolfe. Our results can lift any rate based on Hölder continuous gradients and Hölder growth bounds. Moreover, our theory provides simple proofs of optimal convergence lower bounds under Hölder growth from textbook examples without growth bounds.

preprint2021arXiv

Bundle Method Sketching for Low Rank Semidefinite Programming

In this paper, we show that the bundle method can be applied to solve semidefinite programming problems with a low rank solution without ever constructing a full matrix. To accomplish this, we use recent results from randomly sketching matrix optimization problems and from the analysis of bundle methods. Under strong duality and strict complementarity of SDP, our algorithm produces primal and the dual sequences converging in feasibility at a rate of $\tilde{O}(1/ε)$ and in optimality at a rate of $\tilde{O}(1/ε^2)$. Moreover, our algorithm outputs a low rank representation of its approximate solution with distance to the optimal solution at most $O(\sqrtε)$ within $\tilde{O}(1/ε^2)$ iterations.

preprint2021arXiv

Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems

Unlike nonconvex optimization, where gradient descent is guaranteed to converge to a local optimizer, algorithms for nonconvex-nonconcave minimax optimization can have topologically different solution paths: sometimes converging to a solution, sometimes never converging and instead following a limit cycle, and sometimes diverging. In this paper, we study the limiting behaviors of three classic minimax algorithms: gradient descent ascent (GDA), alternating gradient descent ascent (AGDA), and the extragradient method (EGM). Numerically, we observe that all of these limiting behaviors can arise in Generative Adversarial Networks (GAN) training and are easily demonstrated for a range of GAN problems. To explain these different behaviors, we study the high-order resolution continuous-time dynamics that correspond to each algorithm, which results in the sufficient (and almost necessary) conditions for the local convergence by each method. Moreover, this ODE perspective allows us to characterize the phase transition between these different limiting behaviors caused by introducing regularization as Hopf Bifurcations.

Benjamin Grimmer

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

A Universally Optimal Primal-Dual Method for Minimizing Heterogeneous Compositions

General Holder Smooth Convergence Rates Follow From Specialized Rates Assuming Growth Bounds

Bundle Method Sketching for Low Rank Semidefinite Programming

Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems