Source author record

Alan Edelman

Alan Edelman appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

25 works

29 topics

4 close collaborators


preprint · 2026 · arXiv

Accelerating Bidiagonalization of Banded Matrices through Memory-Aware Bulge-Chasing on GPUs

The reduction of a banded matrix to bidiagonal form is a critical step in the calculation of singular values, a cornerstone of scientific computing and AI. Although inherently parallel, this step has traditionally been considered unsuitable for GPUs due to its memory-bound nature. However, recent advances in GPU architectures, such as increased L1 memory per Streaming Multiprocessor or Compute Unit and larger L2 caches, have shifted this paradigm. In this work, we present the first GPU-accelerated algorithm for reducing a banded matrix to bidiagonal form, integrated into the open-source software package NextLA.jl. Our algorithm builds on prior multicore CPU cache-efficient bulge-chasing methods, adapted to modern GPU architecture to optimize throughput. Leveraging Julia's high-level array abstractions and KernelAbstractions, we implement a single function that is both hardware-agnostic and data-precision-aware, running efficiently across NVIDIA, AMD, Intel, and Apple Metal GPUs. We develop a hardware-aware performance model to guide tuning and identify key hyperparameters that govern optimal GPU performance for memory-bound workloads. We show that such workloads, when carefully optimized, can achieve substantial speed-ups on modern GPUs: our implementation outperforms multithreaded CPU libraries PLASMA and SLATE starting from matrix sizes as small as 1024 x 1024, and achieves over 100x speed-up on 32k x 32k matrices. Moreover, the algorithm's performance scales linearly with the matrix bandwidth, enabling efficient reduction of matrices with larger bandwidths, previously considered impractical.

preprint · 2023 · arXiv

Signal Enhancement for Magnetic Navigation Challenge Problem

Harnessing the magnetic field of the Earth for navigation has shown promise as a viable alternative to other navigation systems. A magnetic navigation system collects its own magnetic field data using a magnetometer and uses magnetic anomaly maps to determine the current location. The greatest challenge with magnetic navigation arises when the magnetic field measurements from the magnetometer encompass the magnetic field from not just the Earth, but also from the vehicle on which it is mounted. It is difficult to separate the Earth magnetic anomaly field, which is crucial for navigation, from the total magnetic field reading from the sensor. The purpose of this challenge problem is to decouple the Earth and aircraft magnetic signals in order to derive a clean signal from which to perform magnetic navigation. Baseline testing on the dataset has shown that the Earth magnetic field can be extracted from the total magnetic field using machine learning (ML). The challenge is to remove the aircraft magnetic field from the total magnetic field using a trained model. This challenge offers an opportunity to construct an effective model for removing the aircraft magnetic field from the dataset by using a scientific machine learning (SciML) approach comprised of an ML algorithm integrated with the physics of magnetic navigation.

preprint · 2022 · arXiv

AutoMat: Accelerated Computational Electrochemical systems Discovery

Large-scale electrification is vital to addressing the climate crisis, but several scientific and technological challenges remain to fully electrify both the chemical industry and transportation. In both of these areas, new electrochemical materials will be critical, but their development currently relies heavily on human-time-intensive experimental trial and error and computationally expensive first-principles, meso-scale and continuum simulations. We present an automated workflow, AutoMat, that accelerates these computational steps by introducing both automated input generation and management of simulations across scales from first principles to continuum device modeling. Furthermore, we show how to seamlessly integrate multi-fidelity predictions such as machine learning surrogates or automated robotic experiments "in-the-loop". The automated framework is implemented with design space search techniques to dramatically accelerate the overall materials discovery pipeline by implicitly learning design features that optimize device performance across several metrics. We discuss the benefits of AutoMat using examples in electrocatalysis and energy storage and highlight lessons learned.

preprint · 2022 · arXiv

Fifty Three Matrix Factorizations: A systematic approach

The success of matrix factorizations such as the singular value decomposition (SVD) has motivated the search for even more factorizations. We catalog 53 matrix factorizations, most of which we believe to be new. Our systematic approach, inspired by the generalized Cartan decomposition of Lie theory, also encompasses known factorizations such as the SVD, the symmetric eigendecomposition, the CS decomposition, the hyperbolic SVD, structured SVDs, the Takagi factorization, and others, thereby covering familiar matrix factorizations as well as ones that were waiting to be discovered. We suggest that Lie theory has one way or another been lurking hidden in the foundations of the very successful field of matrix computations, with applications routinely used in so many areas of computation. In this paper, we investigate consequences of the Cartan decomposition and the little-known generalized Cartan decomposition for matrix factorizations. We believe that these factorizations, once properly identified, can lead to further work on algorithmic computations and applications.

preprint · 2022 · arXiv

High-performance symbolic-numerics via multiple dispatch

As mathematical computing becomes more democratized in high-level languages, high-performance symbolic-numeric systems are necessary for domain scientists and engineers to get the best performance out of their machine without deep knowledge of code optimization. Naturally, users need different term types either to have different algebraic properties for them, or to use efficient data structures. To this end, we developed Symbolics.jl, an extendable symbolic system which uses dynamic multiple dispatch to change behavior depending on the domain needs. In this work we detail an underlying abstract term interface which allows for speed without sacrificing generality. We show that by formalizing a generic API on actions independent of implementation, we can retroactively add optimized data structures to our system without changing the pre-existing term rewriters. We showcase how this can be used to optimize term construction and give a 113x acceleration on general symbolic transformations. Further, we show that such a generic API allows for complementary term-rewriting implementations. We demonstrate the ability to swap between classical term-rewriting simplifiers and e-graph-based term-rewriting simplifiers. We showcase an e-graph ruleset which minimizes the number of CPU cycles during expression evaluation, and demonstrate how it simplifies a real-world reaction-network simulation to halve the runtime. Additionally, we show a reaction-diffusion partial differential equation solver which is able to be automatically converted into symbolic expressions via multiple dispatch tracing, which is subsequently accelerated and parallelized to give a 157x simulation speedup. Together, this presents Symbolics.jl as a next-generation symbolic-numeric computing environment geared towards modeling and simulation.

preprint · 2022 · arXiv

On the Cartan Decomposition for Classical Random Matrix Ensembles

We complete Dyson's dream by cementing the links between symmetric spaces and classical random matrix ensembles. Previous work has focused on a one-to-one correspondence between symmetric spaces and many but not all of the classical random matrix ensembles. This work shows that we can completely capture all of the classical random matrix ensembles from Cartan's symmetric spaces through the use of alternative coordinate systems. In the end, we have to let go of the notion of a one-to-one correspondence. We emphasize that the KAK decomposition traditionally favored by mathematicians is merely one coordinate system on the symmetric space, albeit a beautiful one. However, other matrix factorizations, especially the generalized singular value decomposition from numerical linear algebra, reveal themselves to be perfectly valid coordinate systems, revealing that one symmetric space can lead to many classical random matrix theories. We establish the connection between this numerical linear algebra viewpoint and the theory of generalized Cartan decomposition. This in turn allows us to produce yet more random matrix theories from a single symmetric space. Yet again these random matrix theories arise from matrix factorizations, though ones that we are not aware have appeared in the literature.

preprint · 2022 · arXiv

On the structure of the solutions to the matrix equation $G^*JG=J$

We study the mathematical structure of the solution set (and its tangent space) to the matrix equation $G^*JG=J$ for a given square matrix $J$. In the language of pure mathematics, this is a Lie group which is the isometry group for a bilinear (or a sesquilinear) form. Generally these groups are described as intersections of a few special groups. The tangent space to $\{G: G^*JG=J \}$ consists of solutions to the linear matrix equation $X^*J+JX=0$. For the complex case, the solution set of this linear equation was computed by De Terán and Dopico. We found that on its own, the equation $X^*J+JX=0$ is hard to solve. By throwing into the mix the complementary linear equation $X^*J-JX=0$, we find that rather than increasing the complexity, we reduce the complexity. Not only is it possible to now solve the original problem, but we can approach the broader algebraic and geometric structure. One implication is that the two equations form an $\mathfrak{h}$ and $\mathfrak{m}$ pair familiar in the study of pseudo-Riemannian symmetric spaces. We explicitly demonstrate the computation of the solutions to the equation $X^*J\pm JX=0$ for real and complex matrices. However, any real, complex or quaternionic case with an arbitrary involution (e.g., transpose, conjugate transpose, and the various quaternion transposes) can be effectively solved with the same strategy. We provide numerical examples and visualizations.

preprint · 2016 · arXiv

Accelerated Convolutions for Efficient Multi-Scale Time to Contact Computation in Julia

Convolutions have long been regarded as fundamental to applied mathematics, physics and engineering. Their mathematical elegance allows for common tasks such as numerical differentiation to be computed efficiently on large data sets. Efficient computation of convolutions is critical to artificial intelligence in real-time applications, like machine vision, where convolutions must be continuously and efficiently computed on tens to hundreds of kilobytes per second. In this paper, we explore how convolutions are used in fundamental machine vision applications. We present an accelerated n-dimensional convolution package in the high performance computing language, Julia, and demonstrate its efficacy in solving the time to contact problem for machine vision. Results are measured against synthetically generated videos and quantitatively assessed according to their mean squared error from the ground truth. We achieve over an order of magnitude decrease in compute time and allocated memory for comparable machine vision applications. All code is packaged and integrated into the official Julia Package Manager to be used in various other scenarios.
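
As a small illustration of the abstract's remark that numerical differentiation can be phrased as a convolution, here is a minimal pure-Python sketch. It is not the Julia package described above; the function name `convolve_valid`, the kernel constant, and the unit sample spacing are assumptions made for this example.

```python
def convolve_valid(signal, kernel):
    """Direct 1-D convolution, keeping only outputs where the kernel fully overlaps."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + k - 1 - j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# Central-difference kernel for unit sample spacing (an assumption of this sketch).
# Convolving with it yields (x[i + 2] - x[i]) / 2, the slope estimate at sample i + 1.
CENTRAL_DIFFERENCE = [0.5, 0.0, -0.5]

samples = [t * t for t in range(5)]          # f(t) = t^2 sampled at t = 0..4
slopes = convolve_valid(samples, CENTRAL_DIFFERENCE)
print(slopes)                                # slope estimates at t = 1, 2, 3
```

For the quadratic sampled here, the central difference is exact, so the printed slopes equal 2t at the interior points.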

preprint · 2016 · arXiv

Beyond universality in random matrix theory

In order to have a better understanding of finite random matrices with non-Gaussian entries, we study the $1/N$ expansion of local eigenvalue statistics in both the bulk and at the hard edge of the spectrum of random matrices. This gives valuable information about the smallest singular value not seen in universality laws. In particular, we show the dependence on the fourth moment (or the kurtosis) of the entries. This work makes use of the so-called complex Gaussian divisible ensembles for both Wigner and sample covariance matrices.

preprint · 2016 · arXiv

Julia Implementation of the Dynamic Distributed Dimensional Data Model

Julia is a new language for writing data analysis programs that are easy to implement and run at high performance. Similarly, the Dynamic Distributed Dimensional Data Model (D4M) aims to clarify data analysis operations while retaining strong performance. D4M accomplishes these goals through a composable, unified data model on associative arrays. In this work, we present an implementation of D4M in Julia and describe how it enables and facilitates data analysis. Several experiments showcase scalable performance in our new Julia version as compared to the original Matlab implementation.

preprint · 2015 · arXiv

Integral geometry for Markov chain Monte Carlo: overcoming the curse of search-subspace dimensionality

We introduce a method that uses the Cauchy-Crofton formula and a new curvature formula from integral geometry to reweight the sampling probabilities of Metropolis-within-Gibbs algorithms in order to increase their convergence speed. We consider algorithms that sample from a probability density conditioned on a manifold $\mathcal{M}$. Our method exploits the symmetries of the algorithms' isotropic random search-direction subspaces to analytically average out the variance in the intersection volume caused by the orientation of the search-subspace with respect to the manifold $\mathcal{M}$ it intersects. This variance can grow exponentially with the dimension of the search-subspace, greatly slowing down the algorithm. Eliminating this variance allows us to use search-subspaces of dimensions many times greater than would otherwise be possible, allowing us to sample very rare events that a lower-dimensional search-subspace would be unlikely to intersect. To extend this method to events that are rare for reasons other than their support $\mathcal{M}$ having a lower dimension, we formulate and prove a new theorem in integral geometry that makes use of the curvature form of the Chern-Gauss-Bonnet theorem to reweight sampling probabilities. On the side, we also apply our theorem to obtain new theoretical bounds for the volumes of real algebraic manifolds. Finally, we demonstrate the computational effectiveness and speedup of our method by numerically applying it to the conditional stochastic Airy operator sampling problem in random matrix theory.

preprint · 2015 · arXiv

Julia: A Fresh Approach to Numerical Computing

Bridging cultures that have often been distant, Julia combines expertise from the diverse fields of computer science and computational science to create a new approach to numerical computing. Julia is designed to be easy and fast. Julia questions notions generally held as "laws of nature" by practitioners of numerical computing: 1. High-level dynamic programs have to be slow. 2. One must prototype in one language and then rewrite in another language for speed or deployment, and 3. There are parts of a system for the programmer, and other parts best left untouched as they are built by the experts. We introduce the Julia programming language and its design --- a dance between specialization and abstraction. Specialization allows for custom treatment. Multiple dispatch, a technique from computer science, picks the right algorithm for the right circumstance. Abstraction, what good computation is really about, recognizes what remains the same after differences are stripped away. Abstractions in mathematics are captured as code through another technique from computer science, generic programming. Julia shows that one can have machine performance without sacrificing human convenience.

preprint · 2015 · arXiv

Random Triangle Theory with Geometry and Applications

What is the probability that a random triangle is acute? We explore this old question from a modern viewpoint, taking into account linear algebra, shape theory, numerical analysis, random matrix theory, the Hopf fibration, and much much more. One of the best distributions of random triangles takes all six vertex coordinates as independent standard Gaussians. Six can be reduced to four by translation of the center to $(0,0)$ or reformulation as a 2x2 matrix problem. In this note, we develop shape theory in its historical context for a wide audience. We hope to encourage others to look again (and differently) at triangles. We provide a new constructive proof, using the geometry of parallelians, of a central result of shape theory: Triangle shapes naturally fall on a hemisphere. We give several proofs of the key random result: that triangles are uniformly distributed when the normal distribution is transferred to the hemisphere. A new proof connects to the distribution of random condition numbers. Generalizing to higher dimensions, we obtain the "square root ellipticity statistic" of random matrix theory. Another proof connects the Hopf map to the SVD of 2 by 2 matrices. A new theorem describes three similar triangles hidden in the hemisphere. Many triangle properties are reformulated as matrix theorems, providing insight to both. This paper argues for a shift of viewpoint to the modern approaches of random matrix theory. As one example, we propose that the smallest singular value is an effective test for uniformity. New software is developed and applications are proposed.
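
The opening question can be explored numerically. The sketch below is a plain Monte Carlo estimate under the Gaussian model named in the abstract (six independent standard normal vertex coordinates); it does not use the paper's shape-theory machinery, and the function names, trial count, and seed are assumptions made for this example. The estimate should land near 1/4, the value usually quoted for this model.

```python
import random

def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def is_acute(p, q, r):
    """A triangle is acute when its longest squared side is less than the sum of the other two."""
    sides = sorted([squared_distance(p, q), squared_distance(q, r), squared_distance(r, p)])
    return sides[2] < sides[0] + sides[1]

def estimate_acute_probability(trials, seed=0):
    """Fraction of acute triangles among `trials` triangles with Gaussian vertices."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        vertices = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(3)]
        hits += is_acute(*vertices)
    return hits / trials

print(estimate_acute_probability(40_000))
```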

preprint · 2015 · arXiv

The Singular Values of the GUE (Less is More)

Some properties that nominally involve the eigenvalues of the Gaussian Unitary Ensemble (GUE) can instead be phrased in terms of singular values. By discarding the signs of the eigenvalues, we gain access to a surprising decomposition: the singular values of the GUE are distributed as the union of the singular values of two independent ensembles of Laguerre type. This independence is remarkable given the well known phenomenon of eigenvalue repulsion. The structure of this decomposition reveals that several existing observations about large $n$ limits of the GUE are in fact manifestations of phenomena that are already present for finite random matrices. We relate the semicircle law to the quarter-circle law by connecting Hermite polynomials to generalized Laguerre polynomials with parameter $\pm 1/2$. Similarly, we write the absolute value of the determinant of the $n\times n$ GUE as a product of $n$ independent random variables to gain new insight into its asymptotic log-normality. The decomposition also provides a description of the distribution of the smallest singular value of the GUE, which in turn permits the study of the leading order behavior of the condition number of GUE matrices. The study is motivated by questions involving the enumeration of orientable maps, and is related to questions involving powers of complex Ginibre matrices. The inescapable conclusion of this work is that the singular values of the GUE play an unpredictably important role that had gone unnoticed for decades even though, in hindsight, so many clues had been around.

preprint · 2014 · arXiv

Array operators using multiple dispatch: a design methodology for array implementations in dynamic languages

Arrays are such a rich and fundamental data type that they tend to be built into a language, either in the compiler or in a large low-level library. Defining this functionality at the user level instead provides greater flexibility for application domains not envisioned by the language designer. Only a few languages, such as C++ and Haskell, provide the necessary power to define $n$-dimensional arrays, but these systems rely on compile-time abstraction, sacrificing some flexibility. In contrast, dynamic languages make it straightforward for the user to define any behavior they might want, but at the possible expense of performance. As part of the Julia language project, we have developed an approach that yields a novel trade-off between flexibility and compile-time analysis. The core abstraction we use is multiple dispatch. We have come to believe that while multiple dispatch has not been especially popular in most kinds of programming, technical computing is its killer application. By expressing key functions such as array indexing using multi-method signatures, a surprising range of behaviors can be obtained, in a way that is both relatively easy to write and amenable to compiler analysis. The compact factoring of concerns provided by these methods makes it easier for user-defined types to behave consistently with types in the standard library.
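
To make the idea of expressing array indexing through multi-method signatures concrete, here is a toy sketch in Python. Python has no built-in multiple dispatch, so the registry below is a stand-in: the names `method` and `getindex` and the first-match lookup rule are assumptions of this example, and Julia itself selects the most specific applicable method rather than the first registered one.

```python
_methods = []  # (type signature, implementation) pairs, searched in registration order

def method(*types):
    """Register an implementation for one combination of argument types."""
    def register(fn):
        _methods.append((types, fn))
        return fn
    return register

def getindex(*args):
    """Pick an implementation by inspecting the types of all arguments."""
    for types, fn in _methods:
        if len(types) == len(args) and all(isinstance(a, t) for a, t in zip(args, types)):
            return fn(*args)
    names = ", ".join(type(a).__name__ for a in args)
    raise TypeError(f"no method matching getindex({names})")

@method(list, int)
def _scalar_index(array, i):
    return array[i]                    # one position gives one element

@method(list, slice)
def _range_index(array, window):
    return array[window]               # a range gives a contiguous sub-array

@method(list, list)
def _gather_index(array, positions):
    return [array[i] for i in positions]   # a list of positions gives a gathered array

data = [10, 20, 30]
print(getindex(data, 1), getindex(data, slice(0, 2)), getindex(data, [2, 0]))
```

Each indexing behavior lives in its own small method, so supporting a new index type means registering one more signature rather than editing a central branch.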

preprint · 2014 · arXiv

Parallel Prefix Polymorphism Permits Parallelization, Presentation & Proof

Polymorphism in programming languages enables code reuse. Here, we show that polymorphism has broad applicability far beyond computations for technical computing: parallelism in distributed computing, presentation of visualizations of runtime data flow, and proofs for formal verification of correctness. The ability to reuse a single codebase for all these purposes provides new ways to understand and verify parallel programs.
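
A minimal sketch of the reuse described here: one prefix-scan routine, written only in terms of an associative operator, applied unchanged to numbers, strings, and a symbolic type that records the data flow. The recursive-doubling scheme, the `prefix_scan` name, and the `Sym` class are assumptions of this example, not the paper's code.

```python
from operator import add

def prefix_scan(values, op=add):
    """Inclusive scan by recursive doubling; each pass's combinations are independent."""
    out = list(values)
    step = 1
    while step < len(out):
        # Combine element i with the element `step` places to its left (left operand first,
        # so non-commutative operators such as string concatenation keep their order).
        out = [out[i] if i < step else op(out[i - step], out[i]) for i in range(len(out))]
        step *= 2
    return out

class Sym:
    """Symbolic value whose `+` records the expression instead of computing a number."""
    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        return Sym(f"({self.text}+{other.text})")

print(prefix_scan([1, 2, 3, 4]))
print(prefix_scan(["a", "b", "c"]))
print([s.text for s in prefix_scan([Sym("x1"), Sym("x2"), Sym("x3"), Sym("x4")])])
```

The last call runs the same scan on symbolic inputs, so the nesting of parentheses in its output shows which partial results were combined in each pass.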

preprint · 2013 · arXiv

Partial freeness of random matrices

We investigate the implications of free probability for random matrices. From rules for calculating all possible joint moments of two free random matrices, we develop a notion of partial freeness which is quantified by the breakdown of these rules. We provide a combinatorial interpretation for partial freeness as the presence of closed paths in Hilbert space defined by particular joint moments. We also discuss how asymptotic moment expansions provide an error term on the density of states. We present MATLAB code for the calculation of moments and free cumulants of arbitrary random matrices.

preprint · 2013 · arXiv

The Beta-MANOVA Ensemble with General Covariance

We find the joint generalized singular value distribution and largest generalized singular value distributions of the $β$-MANOVA ensemble with positive diagonal covariance, which is general. This has been done for the continuous $β > 0$ case for identity covariance (in eigenvalue form), and by setting the covariance to $I$ in our model we get another version. For the diagonal covariance case, it has only been done for the $β = 1, 2, 4$ cases (real, complex, and quaternion matrix entries). This is in a way the first second-order $β$-ensemble, since the sampler for the generalized singular values of the $β$-MANOVA with diagonal covariance calls the sampler for the eigenvalues of the $β$-Wishart with diagonal covariance of Forrester and Dubbs-Edelman-Koev-Venkataramana. We use a conjecture of MacDonald proven by Baker and Forrester concerning an integral of a hypergeometric function and a theorem of Kaneko concerning an integral of Jack polynomials to derive our generalized singular value distributions. In addition we use many identities from Forrester's Log-Gases and Random Matrices. We supply numerical evidence that our theorems are correct.

preprint · 2013 · arXiv

The Beta-Wishart Ensemble

This paper proves a matrix model for the Wishart Ensemble with general covariance and general dimension parameter beta. In so doing, we introduce a new and elegant definition of Jack polynomials.

preprint · 2012 · arXiv

Condition Numbers of Indefinite Rank 2 Ghost Wishart Matrices

We define an indefinite Wishart matrix as a matrix of the form $A=W^{T}WΣ$, where $Σ$ is an indefinite diagonal matrix and $W$ is a matrix of independent standard normals. We focus on the case where $W$ is $L$ by 2, which has engineering applications. We obtain the distribution of the ratio of the eigenvalues of $A$. This distribution can be "folded" to give the distribution of the condition number. We calculate formulas for $W$ real ($β=1$), complex ($β=2$), quaternionic ($β=4$) or any ghost $0<β<\infty$. We then corroborate our work by comparing them against numerical experiments.

preprint · 2012 · arXiv

Error analysis of free probability approximations to the density of states of disordered systems

Theoretical studies of localization, anomalous diffusion and ergodicity breaking require solving the electronic structure of disordered systems. We use free probability to approximate the ensemble-averaged density of states without exact diagonalization. We present an error analysis that quantifies the accuracy using a generalized moment expansion, allowing us to distinguish between different approximations. We identify an approximation that is accurate to the eighth moment across all noise strengths, and contrast this with the perturbation theory and isotropic entanglement theory.

preprint · 2012 · arXiv

Julia: A Fast Dynamic Language for Technical Computing

Dynamic languages have become popular for scientific computing. They are generally considered highly productive, but lacking in performance. This paper presents Julia, a new dynamic language for technical computing, designed for performance from the beginning by adapting and extending modern programming language techniques. A design based on generic functions and a rich type system simultaneously enables an expressive programming model and successful type inference, leading to good performance for a wide range of programs. This makes it possible for much of the Julia library to be written in Julia itself, while also incorporating best-of-breed C and Fortran libraries.

preprint · 2011 · arXiv

An Efficient Partitioning Oracle for Bounded-Treewidth Graphs

Partitioning oracles were introduced by Hassidim et al. (FOCS 2009) as a generic tool for constant-time algorithms. For any epsilon > 0, a partitioning oracle provides query access to a fixed partition of the input bounded-degree minor-free graph, in which every component has size poly(1/epsilon), and the number of edges removed is at most epsilon*n, where n is the number of vertices in the graph. However, the oracle of Hassidim et al. makes an exponential number of queries to the input graph to answer every query about the partition. In this paper, we construct an efficient partitioning oracle for graphs with constant treewidth. The oracle makes only O(poly(1/epsilon)) queries to the input graph to answer each query about the partition. Examples of bounded-treewidth graph classes include k-outerplanar graphs for fixed k, series-parallel graphs, cactus graphs, and pseudoforests. Our oracle yields poly(1/epsilon)-time property testing algorithms for membership in these classes of graphs. Another application of the oracle is a poly(1/epsilon)-time algorithm that approximates the maximum matching size, the minimum vertex cover size, and the minimum dominating set size up to an additive epsilon*n in graphs with bounded treewidth. Finally, the oracle can be used to test in poly(1/epsilon) time whether the input bounded-treewidth graph is k-colorable or perfect.

preprint · 2004 · arXiv

Eigenvalues of Hermite and Laguerre ensembles: Large Beta Asymptotics

In this paper we examine the zero and first order eigenvalue fluctuations for the $β$-Hermite and $β$-Laguerre ensembles, using the matrix models we described in [dumitriu02], in the limit as $β\to \infty$. We find that the fluctuations are described by Gaussians of variance $O(1/β)$, centered at the roots of a corresponding Hermite (Laguerre) polynomial. We also show that the approximation is very good, even for small values of $β$, by plotting exact level densities versus sum of Gaussians approximations.

preprint · 1995 · arXiv

How many zeros of a random polynomial are real?

We provide an elementary geometric derivation of the Kac integral formula for the expected number of real zeros of a random polynomial with independent standard normally distributed coefficients. We show that the expected number of real zeros is simply the length of the moment curve $(1,t,\ldots,t^n)$ projected onto the surface of the unit sphere, divided by $π$. The probability density of the real zeros is proportional to how fast this curve is traced out. We then relax Kac's assumptions by considering a variety of random sums, series, and distributions, and we also illustrate such ideas as integral geometry and the Fubini-Study metric.
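
The geometric statement above lends itself to a direct numerical check. The sketch below computes the speed of the moment curve after projection onto the unit sphere, divides by $π$, and integrates. The function names, the use of Simpson's rule, and the reduction to the interval $[0, 1]$ (using the symmetries $t \to -t$ and $t \to 1/t$, which preserve the projected curve's length) are choices made for this example rather than steps taken from the paper.

```python
import math

def zero_density(t, n):
    """Speed of the unit-normalized moment curve (1, t, ..., t^n) at t, divided by pi."""
    a = sum(t ** (2 * k) for k in range(n + 1))                  # v . v
    b = sum(k * t ** (2 * k - 1) for k in range(1, n + 1))       # v . v'
    c = sum(k * k * t ** (2 * k - 2) for k in range(1, n + 1))   # v' . v'
    # Speed of v / |v| is sqrt((v.v)(v'.v') - (v.v')^2) / (v.v); clamp rounding noise at 0.
    return math.sqrt(max(a * c - b * b, 0.0)) / (a * math.pi)

def expected_real_zeros(n, panels=2000):
    """Composite Simpson's rule on [0, 1] (panels must be even), times 4 by symmetry."""
    h = 1.0 / panels
    total = zero_density(0.0, n) + zero_density(1.0, n)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * zero_density(i * h, n)
    return 4 * total * h / 3

for degree in (1, 2, 10):
    print(degree, expected_real_zeros(degree))
```

For degree 1 the density reduces to $1/(π(1+t^2))$, whose integral over the real line is 1, matching the fact that a random linear polynomial has exactly one real zero.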
