Researcher profile

Aaron Sidford

Aaron Sidford contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
19works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

19 published item(s)

preprint2026arXiv

Convex optimization with $p$-norm oracles

In recent years, there have been significant advances in efficiently solving $\ell_s$-regression using linear system solvers and $\ell_2$-regression [Adil-Kyng-Peng-Sachdeva, J. ACM&#39;24]. Would efficient smoothed $\ell_p$-norm solvers lead to even faster rates for solving $\ell_s$-regression when $2 \leq p < s$? In this paper, we give an affirmative answer to this question and show how to solve $\ell_s$-regression using $\tilde{O}(n^{\fracν{1+ν}})$ iterations of solving smoothed $\ell_p$ regression problems, where $ν:= \frac{1}{p} - \frac{1}{s}$. To obtain this result, we provide improved accelerated rates for convex optimization problems when given access to an $\ell_p^s(λ)$-proximal oracle, which, for a point $c$, returns the solution of the regularized problem $\min_{x} f(x) + λ||x-c||_p^s$. Additionally, we show that these rates for the $\ell_p^s(λ)$-proximal oracle are optimal for algorithms that query in the span of the outputs of the oracle, and we further apply our techniques to settings of high-order and quasi-self-concordant optimization.

preprint2026arXiv

Solving Matrix Games with Near-Optimal Matvec Complexity

We study the problem of computing an $ε$-approximate Nash equilibrium of a two-player, bilinear game with a bounded payoff matrix $A \in \mathbb{R}^{m \times n}$, when the players&#39; strategies are constrained to lie in simple sets. We provide algorithms which solve this problem in $\tilde{O}(ε^{-2/3})$ matrix-vector multiplies (matvecs) in two well-studied cases: $\ell_1$-$\ell_1$ (or zero-sum) games, where the players&#39; strategies are both in the probability simplex, and $\ell_2$-$\ell_1$ games (encompassing hard-margin SVMs), where the players&#39; strategies are in the unit Euclidean ball and probability simplex respectively. These results improve upon the previous state-of-the-art complexities of $\tilde{O}(ε^{-8/9})$ for $\ell_1$-$\ell_1$ and $\tilde{O}(ε^{-7/9})$ for $\ell_2$-$\ell_1$ due to [KOS &#39;25]. In both settings our results are nearly-optimal as they match lower bounds of [KS &#39;25] up to polylogarithmic factors.

preprint2023arXiv

Quantum Speedups for Zero-Sum Games via Improved Dynamic Gibbs Sampling

We give a quantum algorithm for computing an $ε$-approximate Nash equilibrium of a zero-sum game in a $m \times n$ payoff matrix with bounded entries. Given a standard quantum oracle for accessing the payoff matrix our algorithm runs in time $\widetilde{O}(\sqrt{m + n}\cdot ε^{-2.5} + ε^{-3})$ and outputs a classical representation of the $ε$-approximate Nash equilibrium. This improves upon the best prior quantum runtime of $\widetilde{O}(\sqrt{m + n} \cdot ε^{-3})$ obtained by [vAG19] and the classic $\widetilde{O}((m + n) \cdot ε^{-2})$ runtime due to [GK95] whenever $ε= Ω((m +n)^{-1})$. We obtain this result by designing new quantum data structures for efficiently sampling from a slowly-changing Gibbs distribution.

preprint2022arXiv

High-precision Estimation of Random Walks in Small Space

We provide a deterministic $\tilde{O}(\log N)$-space algorithm for estimating random walk probabilities on undirected graphs, and more generally Eulerian directed graphs, to within inverse polynomial additive error ($ε=1/\mathrm{poly}(N)$) where $N$ is the length of the input. Previously, this problem was known to be solvable by a randomized algorithm using space $O(\log N)$ (following Aleliunas et al., FOCS 79) and by a deterministic algorithm using space $O(\log^{3/2} N)$ (Saks and Zhou, FOCS 95 and JCSS 99), both of which held for arbitrary directed graphs but had not been improved even for undirected graphs. We also give improvements on the space complexity of both of these previous algorithms for non-Eulerian directed graphs when the error is negligible ($ε=1/N^{ω(1)}$), generalizing what Hoza and Zuckerman (FOCS 18) recently showed for the special case of distinguishing whether a random walk probability is $0$ or greater than $ε$. We achieve these results by giving new reductions between powering Eulerian random-walk matrices and inverting Eulerian Laplacian matrices, providing a new notion of spectral approximation for Eulerian graphs that is preserved under powering, and giving the first deterministic $\tilde{O}(\log N)$-space algorithm for inverting Eulerian Laplacian matrices. The latter algorithm builds on the work of Murtagh et al. (FOCS 17) that gave a deterministic $\tilde{O}(\log N)$-space algorithm for inverting undirected Laplacian matrices, and the work of Cohen et al. (FOCS 19) that gave a randomized $\tilde{O}(N)$-time algorithm for inverting Eulerian Laplacian matrices. A running theme throughout these contributions is an analysis of &#34;cycle-lifted graphs&#34;, where we take a graph and &#34;lift&#34; it to a new graph whose adjacency matrix is the tensor product of the original adjacency matrix and a directed cycle (or variants of one).

preprint2022arXiv

Improved Lower Bounds for Submodular Function Minimization

We provide a generic technique for constructing families of submodular functions to obtain lower bounds for submodular function minimization (SFM). Applying this technique, we prove that any deterministic SFM algorithm on a ground set of $n$ elements requires at least $Ω(n \log n)$ queries to an evaluation oracle. This is the first super-linear query complexity lower bound for SFM and improves upon the previous best lower bound of $2n$ given by [Graur et al., ITCS 2020]. Using our construction, we also prove that any (possibly randomized) parallel SFM algorithm, which can make up to $\mathsf{poly}(n)$ queries per round, requires at least $Ω(n / \log n)$ rounds to minimize a submodular function. This improves upon the previous best lower bound of $\tildeΩ(n^{1/3})$ rounds due to [Chakrabarty et al., FOCS 2021], and settles the parallel complexity of query-efficient SFM up to logarithmic factors due to a recent advance in [Jiang, SODA 2021].

preprint2022arXiv

RECAPP: Crafting a More Efficient Catalyst for Convex Optimization

The accelerated proximal point algorithm (APPA), also known as &#34;Catalyst&#34;, is a well-established reduction from convex optimization to approximate proximal point computation (i.e., regularized minimization). This reduction is conceptually elegant and yields strong convergence rate guarantees. However, these rates feature an extraneous logarithmic term arising from the need to compute each proximal point to high accuracy. In this work, we propose a novel Relaxed Error Criterion for Accelerated Proximal Point (RECAPP) that eliminates the need for high accuracy subproblem solutions. We apply RECAPP to two canonical problems: finite-sum and max-structured minimization. For finite-sum problems, we match the best known complexity, previously obtained by carefully-designed problem-specific algorithms. For minimizing $\max_y f(x,y)$ where $f$ is convex in $x$ and strongly-concave in $y$, we improve on the best known (Catalyst-based) bound by a logarithmic factor.

preprint2022arXiv

Semi-Random Sparse Recovery in Nearly-Linear Time

Sparse recovery is one of the most fundamental and well-studied inverse problems. Standard statistical formulations of the problem are provably solved by general convex programming techniques and more practical, fast (nearly-linear time) iterative methods. However, these latter &#34;fast algorithms&#34; have previously been observed to be brittle in various real-world settings. We investigate the brittleness of fast sparse recovery algorithms to generative model changes through the lens of studying their robustness to a &#34;helpful&#34; semi-random adversary, a framework which tests whether an algorithm overfits to input assumptions. We consider the following basic model: let $\mathbf{A} \in \mathbb{R}^{n \times d}$ be a measurement matrix which contains an unknown subset of rows $\mathbf{G} \in \mathbb{R}^{m \times d}$ which are bounded and satisfy the restricted isometry property (RIP), but is otherwise arbitrary. Letting $x^\star \in \mathbb{R}^d$ be $s$-sparse, and given either exact measurements $b = \mathbf{A} x^\star$ or noisy measurements $b = \mathbf{A} x^\star + ξ$, we design algorithms recovering $x^\star$ information-theoretically optimally in nearly-linear time. We extend our algorithm to hold for weaker generative models relaxing our planted RIP assumption to a natural weighted variant, and show that our method&#39;s guarantees naturally interpolate the quality of the measurement matrix to, in some parameter regimes, run in sublinear time. Our approach differs from prior fast iterative methods with provable guarantees under semi-random generative models: natural conditions on a submatrix which make sparse recovery tractable are NP-hard to verify. We design a new iterative method tailored to the geometry of sparse recovery which is provably robust to our semi-random model. We hope our approach opens the door to new robust, efficient algorithms for natural statistical inverse problems.

preprint2022arXiv

Sharper Rates for Separable Minimax and Finite Sum Optimization via Primal-Dual Extragradient Methods

We design accelerated algorithms with improved rates for several fundamental classes of optimization problems. Our algorithms all build upon techniques related to the analysis of primal-dual extragradient methods via relative Lipschitzness proposed recently by [CST21]. (1) Separable minimax optimization. We study separable minimax optimization problems $\min_x \max_y f(x) - g(y) + h(x, y)$, where $f$ and $g$ have smoothness and strong convexity parameters $(L^x, μ^x)$, $(L^y, μ^y)$, and $h$ is convex-concave with a $(Λ^{xx}, Λ^{xy}, Λ^{yy})$-blockwise operator norm bounded Hessian. We provide an algorithm with gradient query complexity $\tilde{O}\left(\sqrt{\frac{L^{x}}{μ^{x}}} + \sqrt{\frac{L^{y}}{μ^{y}}} + \frac{Λ^{xx}}{μ^{x}} + \frac{Λ^{xy}}{\sqrt{μ^{x}μ^{y}}} + \frac{Λ^{yy}}{μ^{y}}\right)$. Notably, for convex-concave minimax problems with bilinear coupling (e.g.\ quadratics), where $Λ^{xx} = Λ^{yy} = 0$, our rate matches a lower bound of [ZHZ19]. (2) Finite sum optimization. We study finite sum optimization problems $\min_x \frac{1}{n}\sum_{i\in[n]} f_i(x)$, where each $f_i$ is $L_i$-smooth and the overall problem is $μ$-strongly convex. We provide an algorithm with gradient query complexity $\tilde{O}\left(n + \sum_{i\in[n]} \sqrt{\frac{L_i}{nμ}} \right)$. Notably, when the smoothness bounds $\{L_i\}_{i\in[n]}$ are non-uniform, our rate improves upon accelerated SVRG [LMH15, FGKS15] and Katyusha [All17] by up to a $\sqrt{n}$ factor. (3) Minimax finite sums. We generalize our algorithms for minimax and finite sum optimization to solve a natural family of minimax finite sum optimization problems at an accelerated rate, encapsulating both above results up to a logarithmic factor.

preprint2021arXiv

Complexity of Highly Parallel Non-Smooth Convex Optimization

A landmark result of non-smooth convex optimization is that gradient descent is an optimal algorithm whenever the number of computed gradients is smaller than the dimension $d$. In this paper we study the extension of this result to the parallel optimization setting. Namely we consider optimization algorithms interacting with a highly parallel gradient oracle, that is one that can answer $\mathrm{poly}(d)$ gradient queries in parallel. We show that in this case gradient descent is optimal only up to $\tilde{O}(\sqrt{d})$ rounds of interactions with the oracle. The lower bound improves upon a decades old construction by Nemirovski which proves optimality only up to $d^{1/3}$ rounds (as recently observed by Balkanski and Singer), and the suboptimality of gradient descent after $\sqrt{d}$ rounds was already observed by Duchi, Bartlett and Wainwright. In the latter regime we propose a new method with improved complexity, which we conjecture to be optimal. The analysis of this new method is based upon a generalized version of the recent results on optimal acceleration for highly smooth convex optimization.

preprint2020arXiv

A General Framework for Symmetric Property Estimation

In this paper we provide a general framework for estimating symmetric properties of distributions from i.i.d. samples. For a broad class of symmetric properties we identify the easy region where empirical estimation works and the difficult region where more complex estimators are required. We show that by approximately computing the profile maximum likelihood (PML) distribution \cite{ADOS16} in this difficult region we obtain a symmetric property estimation framework that is sample complexity optimal for many properties in a broader parameter regime than previous universal estimation approaches based on PML. The resulting algorithms based on these pseudo PML distributions are also more practical.

preprint2020arXiv

Acceleration with a Ball Optimization Oracle

Consider an oracle which takes a point $x$ and returns the minimizer of a convex function $f$ in an $\ell_2$ ball of radius $r$ around $x$. It is straightforward to show that roughly $r^{-1}\log\frac{1}ε$ calls to the oracle suffice to find an $ε$-approximate minimizer of $f$ in an $\ell_2$ unit ball. Perhaps surprisingly, this is not optimal: we design an accelerated algorithm which attains an $ε$-approximate minimizer with roughly $r^{-2/3} \log \frac{1}ε$ oracle queries, and give a matching lower bound. Further, we implement ball optimization oracles for functions with locally stable Hessians using a variant of Newton&#39;s method. The resulting algorithm applies to a number of problems of practical and theoretical import, improving upon previous results for logistic and $\ell_\infty$ regression and achieving guarantees comparable to the state-of-the-art for $\ell_p$ regression.

preprint2020arXiv

Constant Girth Approximation for Directed Graphs in Subquadratic Time

In this paper we provide a $\tilde{O}(m\sqrt{n})$ time algorithm that computes a $3$-multiplicative approximation of the girth of a $n$-node $m$-edge directed graph with non-negative edge lengths. This is the first algorithm which approximates the girth of a directed graph up to a constant multiplicative factor faster than All-Pairs Shortest Paths (APSP) time, i.e. $O(mn)$. Additionally, for any integer $k \ge 1$, we provide a deterministic algorithm for a $O(k\log\log n)$-multiplicative approximation to the girth in directed graphs in $\tilde{O}(m^{1+1/k})$ time. Combining the techniques from these two results gives us an algorithm for a $O(k\log k)$-multiplicative approximation to the girth in directed graphs in $\tilde{O}(m^{1+1/k})$ time. Our results naturally also provide algorithms for improved constructions of roundtrip spanners, the analog of spanners in directed graphs. The previous fastest algorithms for these problems either ran in All-Pairs Shortest Paths (APSP) time, i.e. $O(mn)$, or were due Pachocki et al. (PRSTV18) which provided a randomized algorithm that for any integer $k \ge 1$ in time $\tilde{O}(m^{1+1/k})$ computed with high probability a $O(k\log n)$ multiplicative approximation of the girth. Our first algorithm constitutes the first sub-APSP-time algorithm for approximating the girth to constant accuracy, our second removes the need for randomness and improves the approximation factor in Pachocki et al. (PRSTV18), and our third is the first time versus quality trade-off for obtaining constant approximations.

preprint2020arXiv

Coordinate Methods for Accelerating $\ell_\infty$ Regression and Faster Approximate Maximum Flow

We provide faster algorithms for approximately solving $\ell_{\infty}$ regression, a fundamental problem prevalent in both combinatorial and continuous optimization. In particular, we provide accelerated coordinate descent methods capable of provably exploiting dynamic measures of coordinate smoothness, and apply them to $\ell_\infty$ regression over a box to give algorithms which converge in $k$ iterations at a $O(1/k)$ rate. Our algorithms can be viewed as an alternative approach to the recent breakthrough result of Sherman [She17] which achieves a similar runtime improvement over classic algorithmic approaches, i.e. smoothing and gradient descent, which either converge at a $O(1/\sqrt{k})$ rate or have running times with a worse dependence on problem parameters. Our runtimes match those of [She17] across a broad range of parameters and achieve improvement in certain structured cases. We demonstrate the efficacy of our result by providing faster algorithms for the well-studied maximum flow problem. Directly leveraging our accelerated $\ell_\infty$ regression algorithms imply a $\tilde{O}\left(m + \sqrt{mn}/ε\right)$ runtime to compute an $ε$-approximate maximum flow for an undirected graph with $m$ edges and $n$ vertices, generically improving upon the previous best known runtime of $\tilde{O}\left(m/ε\right)$ in [She17] whenever the graph is slightly dense. We further design an algorithm adapted to the structure of the regression problem induced by maximum flow obtaining a runtime of $\tilde{O}\left(m + \max(n, \sqrt{ns})/ε\right)$, where $s$ is the squared $\ell_2$ norm of the congestion of any optimal flow. Moreover, we show how to leverage this result to achieve improved exact algorithms for maximum flow on a variety of unit capacity graphs. We hope that our work serves as an important step towards achieving even faster maximum flow algorithms.

preprint2020arXiv

Coordinate Methods for Matrix Games

We develop primal-dual coordinate methods for solving bilinear saddle-point problems of the form $\min_{x \in \mathcal{X}} \max_{y\in\mathcal{Y}} y^\top A x$ which contain linear programming, classification, and regression as special cases. Our methods push existing fully stochastic sublinear methods and variance-reduced methods towards their limits in terms of per-iteration complexity and sample complexity. We obtain nearly-constant per-iteration complexity by designing efficient data structures leveraging Taylor approximations to the exponential and a binomial heap. We improve sample complexity via low-variance gradient estimators using dynamic sampling distributions that depend on both the iterates and the magnitude of the matrix entries. Our runtime bounds improve upon those of existing primal-dual methods by a factor depending on sparsity measures of the $m$ by $n$ matrix $A$. For example, when rows and columns have constant $\ell_1/\ell_2$ norm ratios, we offer improvements by a factor of $m+n$ in the fully stochastic setting and $\sqrt{m+n}$ in the variance-reduced setting. We apply our methods to computational geometry problems, i.e. minimum enclosing ball, maximum inscribed ball, and linear regression, and obtain improved complexity bounds. For linear regression with an elementwise nonnegative matrix, our guarantees improve on exact gradient methods by a factor of $\sqrt{\mathrm{nnz}(A)/(m+n)}$.

preprint2020arXiv

Efficiently Solving MDPs with Stochastic Mirror Descent

We present a unified framework based on primal-dual stochastic mirror descent for approximately solving infinite-horizon Markov decision processes (MDPs) given a generative model. When applied to an average-reward MDP with $A_{tot}$ total state-action pairs and mixing time bound $t_{mix}$ our method computes an $ε$-optimal policy with an expected $\widetilde{O}(t_{mix}^2 A_{tot} ε^{-2})$ samples from the state-transition matrix, removing the ergodicity dependence of prior art. When applied to a $γ$-discounted MDP with $A_{tot}$ total state-action pairs our method computes an $ε$-optimal policy with an expected $\widetilde{O}((1-γ)^{-4} A_{tot} ε^{-2})$ samples, matching the previous state-of-the-art up to a $(1-γ)^{-1}$ factor. Both methods are model-free, update state values and policies simultaneously, and run in time linear in the number of samples taken. We achieve these results through a more general stochastic mirror descent framework for solving bilinear saddle-point problems with simplex and box domains and we demonstrate the flexibility of this framework by providing further applications to constrained MDPs.

preprint2020arXiv

Faster Divergence Maximization for Faster Maximum Flow

In this paper we provide an algorithm which given any $m$-edge $n$-vertex directed graph with integer capacities at most $U$ computes a maximum $s$-$t$ flow for any vertices $s$ and $t$ in $m^{4/3+o(1)}U^{1/3}$ time. This improves upon the previous best running times of $m^{11/8+o(1)}U^{1/4}$ (Liu Sidford 2019), $\tilde{O}(m \sqrt{n} \log U)$ (Lee Sidford 2014), and $O(mn)$ (Orlin 2013) when the graph is not too dense or has large capacities. To achieve the results this paper we build upon previous algorithmic approaches to maximum flow based on interior point methods (IPMs). In particular, we overcome a key bottleneck of previous advances in IPMs for maxflow (Mądry 2013, Mądry 2016, Liu Sidford 2019), which make progress by maximizing the energy of local $\ell_2$ norm minimizing electric flows. We generalize this approach and instead maximize the divergence of flows which minimize the Bregman divergence distance with respect to the weighted logarithmic barrier. This allows our algorithm to avoid dependencies on the $\ell_4$ norm that appear in other IPM frameworks (e.g. Cohen Mądry Sankowski Vladu 2017, Axiotis Mądry Vladu 2020). Further, we show that smoothed $\ell_2$-$\ell_p$ flows (Kyng, Peng, Sachdeva, Wang 2019), which we previously used to efficiently maximize energy (Liu Sidford 2019), can also be used to efficiently maximize divergence, thereby yielding our desired runtimes. We believe both this generalized view of energy maximization and generalized flow solvers we develop may be of further interest.

preprint2020arXiv

Solving Linear Programs with Sqrt(rank) Linear System Solves

We present an algorithm that given a linear program with $n$ variables, $m$ constraints, and constraint matrix $A$, computes an $ε$-approximate solution in $\tilde{O}(\sqrt{rank(A)}\log(1/ε))$ iterations with high probability. Each iteration of our method consists of solving $\tilde{O}(1)$ linear systems and additional nearly linear time computation, improving by a factor of $\tildeΩ((m/rank(A))^{1/2})$ over the previous fastest method with this iteration cost due to Renegar (1988). Further, we provide a deterministic polynomial time computable $\tilde{O}(rank(A))$-self-concordant barrier function for the polytope, resolving an open question of Nesterov and Nemirovski (1994) on the theory of &#34;universal barriers&#34; for interior point methods. Applying our techniques to the linear program formulation of maximum flow yields an $\tilde{O}(|E|\sqrt{|V|}\log(U))$ time algorithm for solving the maximum flow problem on directed graphs with $|E|$ edges, $|V|$ vertices, and integer capacities of size at most $U$. This improves upon the previous fastest polynomial running time of $O(|E|\min\{|E|^{1/2},|V|^{2/3}\}\log(|V|^{2}/|E|)\log(U))$ achieved by Goldberg and Rao (1998). In the special case of solving dense directed unit capacity graphs our algorithm improves upon the previous fastest running times of $O(|E|\min\{|E|^{1/2},|V|^{2/3}\})$ achieved by Even and Tarjan (1975) and Karzanov (1973) and of $\tilde{O}(|E|^{10/7})$ achieved more recently by Mądry (2013).

preprint2020arXiv

The Bethe and Sinkhorn Permanents of Low Rank Matrices and Implications for Profile Maximum Likelihood

In this paper we consider the problem of computing the likelihood of the profile of a discrete distribution, i.e., the probability of observing the multiset of element frequencies, and computing a profile maximum likelihood (PML) distribution, i.e., a distribution with the maximum profile likelihood. For each problem we provide polynomial time algorithms that given $n$ i.i.d.\ samples from a discrete distribution, achieve an approximation factor of $\exp\left(-O(\sqrt{n} \log n) \right)$, improving upon the previous best-known bound achievable in polynomial time of $\exp(-O(n^{2/3} \log n))$ (Charikar, Shiragur and Sidford, 2019). Through the work of Acharya, Das, Orlitsky and Suresh (2016), this implies a polynomial time universal estimator for symmetric properties of discrete distributions in a broader range of error parameter. We achieve these results by providing new bounds on the quality of approximation of the Bethe and Sinkhorn permanents (Vontobel, 2012 and 2014). We show that each of these are $\exp(O(k \log(N/k)))$ approximations to the permanent of $N \times N$ matrices with non-negative rank at most $k$, improving upon the previous known bounds of $\exp(O(N))$. To obtain our results on PML, we exploit the fact that the PML objective is proportional to the permanent of a certain Vandermonde matrix with $\sqrt{n}$ distinct columns, i.e. with non-negative rank at most $\sqrt{n}$. As a by-product of our work we establish a surprising connection between the convex relaxation in prior work (CSS19) and the well-studied Bethe and Sinkhorn approximations.

preprint2020arXiv

Towards Optimal Running Times for Optimal Transport

In this work, we provide faster algorithms for approximating the optimal transport distance, e.g. earth mover&#39;s distance, between two discrete probability distributions $μ, ν\in Δ^n$. Given a cost function $C : [n] \times [n] \to \mathbb{R}_{\geq 0}$ where $C(i,j) \leq 1$ quantifies the penalty of transporting a unit of mass from $i$ to $j$, we show how to compute a coupling $X$ between $r$ and $c$ in time $\widetilde{O}\left(n^2 /ε\right)$ whose expected transportation cost is within an additive $ε$ of optimal. This improves upon the previously best known running time for this problem of $\widetilde{O}\left(\text{min}\left\{ n^{9/4}/ε, n^2/ε^2 \right\}\right)$. We achieve our results by providing reductions from optimal transport to canonical optimization problems for which recent algorithmic efforts have provided nearly-linear time algorithms. Leveraging nearly linear time algorithms for solving packing linear programs and for solving the matrix balancing problem, we obtain two separate proofs of our stated running time. Further, one of our algorithms is easily parallelized and can be implemented with depth $\widetilde{O}(1/ε)$. Moreover, we show that further algorithmic improvements to our result would be surprising in the sense that any improvement would yield an $o(n^{2.5})$ algorithm for \textit{maximum cardinality bipartite matching}, for which currently the only known algorithms for achieving such a result are based on fast-matrix multiplication.