Source author record

Alexander Cloninger

Alexander Cloninger appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record.

Topics: Machine Learning, math.CA, math.ST, Statistics Theory, Information Theory, math.CO, math.IT, math.NA, math.NT, math.OC, math.PR, Numerical Analysis, physics.soc-ph, Social and Information Networks

Catalog footprint

12 works, 14 topics, 4 close collaborators


Preprint, 2026, arXiv

Optimal Transport, Timesteppers, Newton-Krylov Methods and Steady States of Collective Particle Dynamics

Timesteppers constitute a powerful tool in modern computational science and engineering. Although they are typically used to advance the system forward in time, they can also be viewed as nonlinear mappings that implicitly encode steady states and stability information. In this work, we present an extension of the matrix-free framework for calculating, via timesteppers, steady states of deterministic systems to stochastic particle simulations, where intrinsic randomness prevents direct steady state extraction. By formulating stochastic timesteppers in the language of optimal transport, we reinterpret them as operators acting on probability measures rather than on individual particle trajectories. This perspective enables the construction of smooth cumulative- and inverse-cumulative-distribution-function ((I)CDF) timesteppers that evolve distributions rather than particles. Combined with matrix-free Newton-Krylov solvers, these smooth timesteppers allow efficient computation of steady-state distributions even under high stochastic noise. We perform an error analysis quantifying how noise affects finite-difference Jacobian action approximations, and demonstrate that convergence can be obtained even in high noise regimes. Finally, we introduce higher-dimensional generalizations based on smooth CDF-related representations of particles and validate their performance on a non-trivial two-dimensional distribution. Together, these developments establish a unified variational framework for computing meaningful steady states of both deterministic and stochastic timesteppers.
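The idea of treating a timestepper as a nonlinear map whose fixed points are steady states can be illustrated on a deterministic toy problem. The sketch below is not the authors' code: the logistic ODE, the explicit Euler timestepper, and all function names are assumptions chosen for illustration. It applies Newton's method to the residual of the timestepper and approximates the Jacobian action by a finite difference, so only timestepper calls are needed. In one dimension the Krylov linear solve reduces to a single division.

```python
def timestepper(u, dt=0.1, steps=10):
    """Advance du/dt = u * (1 - u) by `steps` explicit Euler steps (toy model)."""
    for _ in range(steps):
        u = u + dt * u * (1.0 - u)
    return u


def residual(u):
    """Fixed-point residual: zero exactly when u is a steady state."""
    return timestepper(u) - u


def newton_steady_state(u0, eps=1e-7, tol=1e-12, max_iter=50):
    """Newton iteration that uses only timestepper calls (no analytic Jacobian)."""
    u = u0
    for _ in range(max_iter):
        r = residual(u)
        if abs(r) < tol:
            break
        # Finite-difference approximation of the Jacobian acting on v = 1.
        jac_action = (residual(u + eps) - r) / eps
        u -= r / jac_action  # in 1-D the Krylov solve is a single division
    return u


steady = newton_steady_state(0.6)
print(steady)  # close to the stable steady state u = 1
```

For a stochastic particle timestepper the residual would be noisy, which is the situation the abstract addresses by evolving smooth (I)CDF representations instead of individual particles.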

Preprint, 2024, arXiv

Point Cloud Classification via Deep Set Linearized Optimal Transport

We introduce Deep Set Linearized Optimal Transport, an algorithm designed for the efficient simultaneous embedding of point clouds into an $L^2$-space. This embedding preserves specific low-dimensional structures within the Wasserstein space while constructing a classifier to distinguish between various classes of point clouds. Our approach is motivated by the observation that $L^2$-distances between optimal transport maps for distinct point clouds, originating from a shared fixed reference distribution, provide an approximation of the Wasserstein-2 distance between these point clouds, under certain assumptions. To learn approximations of these transport maps, we employ input convex neural networks (ICNNs) and establish that, under specific conditions, Euclidean distances between samples from these ICNNs closely mirror Wasserstein-2 distances between the true distributions. Additionally, we train a discriminator network that attaches weights to these samples and creates a permutation invariant classifier to differentiate between different classes of point clouds. We showcase the advantages of our algorithm over the standard deep set approach through experiments on a flow cytometry dataset with a limited number of labeled point clouds.
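The motivating observation, that $L^2$-distances between transport maps from a shared reference approximate Wasserstein-2 distances, can be checked by hand in one dimension, where the optimal map is the monotone rearrangement. The sketch below assumes equally weighted point clouds of equal size and replaces the ICNN-learned maps of the paper with plain sorting; the function names are hypothetical.

```python
import math
import random


def lot_embedding_1d(sample):
    """In 1-D the optimal map from a reference with the same number of equally
    weighted points is the monotone rearrangement, so evaluating it on the
    sorted reference points returns the sorted target sample."""
    return sorted(sample)


def l2_distance(emb_a, emb_b):
    """L2 distance between two transport maps under the uniform reference."""
    n = len(emb_a)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)) / n)


rng = random.Random(0)
cloud_a = [rng.gauss(0.0, 1.0) for _ in range(500)]
cloud_b = [x + 3.0 for x in cloud_a]  # the same cloud translated by 3
rng.shuffle(cloud_b)                  # point order carries no information

dist = l2_distance(lot_embedding_1d(cloud_a), lot_embedding_1d(cloud_b))
print(dist)  # the Wasserstein-2 distance between a measure and its translate
```

Because the embedding sorts its input, it is invariant to the ordering of the points, which is the same permutation invariance the classifier in the abstract is built to respect.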

Preprint, 2022, arXiv

Sigma-Delta and Distributed Noise-Shaping Quantization Methods for Random Fourier Features

We propose the use of low bit-depth Sigma-Delta and distributed noise-shaping methods for quantizing the Random Fourier features (RFFs) associated with shift-invariant kernels. We prove that our quantized RFFs -- even in the case of $1$-bit quantization -- allow a high accuracy approximation of the underlying kernels, and the approximation error decays at least polynomially fast as the dimension of the RFFs increases. We also show that the quantized RFFs can be further compressed, yielding an excellent trade-off between memory use and accuracy. Namely, the approximation error now decays exponentially as a function of the bits used. Moreover, we empirically show by testing the performance of our methods on several machine learning tasks that our method compares favorably to other state of the art quantization methods in this context.
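As a rough illustration of one-bit noise shaping, the sketch below applies the textbook first-order Sigma-Delta recursion to random Fourier features of a scalar input. It is an assumption-laden toy: the Gaussian frequencies, the feature count, and the function names are chosen here, and the paper's actual quantizers and reconstruction step may differ.

```python
import math
import random


def random_fourier_features(x, freqs, phases):
    """Unquantized RFFs cos(w * x + b) for a scalar input; values lie in [-1, 1]."""
    return [math.cos(w * x + b) for w, b in zip(freqs, phases)]


def sigma_delta_1bit(y):
    """First-order Sigma-Delta: q_i = sign(u_{i-1} + y_i), u_i = u_{i-1} + y_i - q_i."""
    q, states, u = [], [], 0.0
    for y_i in y:
        q_i = 1.0 if u + y_i >= 0.0 else -1.0
        u = u + y_i - q_i  # running quantization error (the "state")
        q.append(q_i)
        states.append(u)
    return q, states


rng = random.Random(0)
m = 256  # number of features (illustrative choice)
freqs = [rng.gauss(0.0, 1.0) for _ in range(m)]  # Gaussian kernel, unit bandwidth
phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(m)]

features = random_fourier_features(0.7, freqs, phases)
bits, states = sigma_delta_1bit(features)
print(max(abs(u) for u in states))  # the state never leaves [-1, 1]
```

The bounded state is what makes noise shaping work: the accumulated error between the features and their one-bit codes stays of order one no matter how many features are quantized.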

Preprint, 2022, arXiv

Supervised learning of sheared distributions using linearized optimal transport

In this paper we study supervised learning tasks on the space of probability measures. We approach this problem by embedding the space of probability measures into $L^2$ spaces using the optimal transport framework. In the embedding spaces, regular machine learning techniques are used to achieve linear separability. This idea has proved successful in applications and when the classes to be separated are generated by shifts and scalings of a fixed measure. This paper extends the class of elementary transformations suitable for the framework to families of shearings, describing conditions under which two classes of sheared distributions can be linearly separated. We furthermore give necessary bounds on the transformations to achieve a pre-specified separation level, and show how multiple embeddings can be used to allow for larger families of transformations. We demonstrate our results on image classification tasks.
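The baseline case mentioned in the abstract, classes generated by shifts and scalings of a fixed measure, can be sketched in one dimension. The example below assumes an empirical template, a sorted-sample embedding, and two made-up classes (left shifts versus right shifts); it does not cover the shearing results of the paper.

```python
import random


def embed(sample):
    """1-D linearized-OT embedding against a uniform reference: the sorted
    sample, i.e. the empirical quantile function."""
    return sorted(sample)


rng = random.Random(1)
template = [rng.gauss(0.0, 1.0) for _ in range(200)]


def transformed(scale, shift):
    """Push the template forward through x -> scale * x + shift (scale > 0)."""
    return [scale * x + shift for x in template]


# Two hypothetical classes: the template shifted left versus shifted right,
# each with a random positive rescaling.
class_left = [embed(transformed(rng.uniform(0.5, 1.5), rng.uniform(-3.0, -2.0)))
              for _ in range(20)]
class_right = [embed(transformed(rng.uniform(0.5, 1.5), rng.uniform(2.0, 3.0)))
               for _ in range(20)]


def linear_score(embedding):
    """A linear functional of the embedding: its average coordinate."""
    return sum(embedding) / len(embedding)


# Shift and scaling act affinely on the embedding of the template.
affine_image = [2.0 * q + 1.0 for q in embed(template)]
print(embed(transformed(2.0, 1.0)) == affine_image)
```

Since the transformation acts affinely in the embedding space, a single linear functional with threshold zero separates the two classes in this toy setup.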

Preprint, 2020, arXiv

Classification Logit Two-sample Testing by Neural Networks

The recent success of generative adversarial networks and variational learning suggests training a classifier network may work well in addressing the classical two-sample problem. Network-based tests have the computational advantage that the algorithm scales to large samples. This paper proposes a two-sample statistic which is the difference of the logit function, provided by a trained classification neural network, evaluated on the testing set split of the two datasets. Theoretically, we prove the testing power to differentiate two sub-exponential densities given that the network is sufficiently parametrized. When the two densities lie on or near to low-dimensional manifolds embedded in possibly high-dimensional space, the needed network complexity is reduced to only scale with the intrinsic dimensionality. Both the approximation and estimation error analysis are based on a new result of near-manifold integral approximation. In experiments, the proposed method demonstrates better performance than previous network-based tests using classification accuracy as the two-sample statistic, and compares favorably to certain kernel maximum mean discrepancy tests on synthetic datasets and hand-written digit datasets.
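The statistic described above, the difference of a classifier's mean logit on the held-out parts of the two samples, can be sketched as follows. Two things here are assumptions and not taken from the abstract: the logit is a hand-written stand-in for a trained network, and the permutation calibration is one common way to obtain a p-value.

```python
import random


def logit(x):
    """Stand-in for a trained classifier's logit. For N(1, 1) versus N(0, 1)
    the Bayes-optimal log-odds is x - 0.5; a real test would learn the logit
    on a separate training split."""
    return x - 0.5


def logit_statistic(test_x, test_y):
    """Difference of mean logits on the held-out halves of the two samples."""
    mean_x = sum(logit(x) for x in test_x) / len(test_x)
    mean_y = sum(logit(y) for y in test_y) / len(test_y)
    return mean_x - mean_y


def permutation_p_value(test_x, test_y, n_perm=500, seed=0):
    """Calibrate the statistic by reshuffling the pooled test points."""
    rng = random.Random(seed)
    observed = logit_statistic(test_x, test_y)
    pooled = list(test_x) + list(test_y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = logit_statistic(pooled[:len(test_x)], pooled[len(test_x):])
        exceed += stat >= observed
    return observed, (1 + exceed) / (1 + n_perm)


rng = random.Random(42)
sample_x = [rng.gauss(1.0, 1.0) for _ in range(200)]
sample_y = [rng.gauss(0.0, 1.0) for _ in range(200)]
statistic, p_value = permutation_p_value(sample_x, sample_y)
print(statistic, p_value)
```

Using the logit itself, instead of thresholded classification accuracy, keeps the magnitude of the classifier's output in the statistic, which is the contrast the abstract draws with earlier network-based tests.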

Preprint, 2020, arXiv

Coresets for Estimating Means and Mean Square Error with Limited Greedy Samples

In a number of situations, collecting a function value for every data point may be prohibitively expensive, and random sampling ignores any structure in the underlying data. We introduce a scalable optimization algorithm with no correction steps (in contrast to Frank-Wolfe and its variants), a variant of gradient ascent for coreset selection in graphs, that greedily selects a weighted subset of vertices that are deemed most important to sample. Our algorithm estimates the mean of the function by taking a weighted sum only at these vertices, and we provably bound the estimation error in terms of the location and weights of the selected vertices in the graph. In addition, we consider the case where nodes have different selection costs and provide bounds on the quality of the low-cost selected coresets. We demonstrate the benefits of our algorithm on the semi-supervised node classification of graph convolutional neural network, point clouds and structured graphs, as well as sensor placement where the cost of placing sensors depends on the location of the placement. We also elucidate that the empirical convergence of our proposed method is faster than random selection and various clustering methods while still respecting sensor placement cost. The paper concludes with validation of the developed algorithm on both synthetic and real datasets, demonstrating that it outperforms the current state of the art.

Preprint, 2020, arXiv

Variational Diffusion Autoencoders with Random Walk Sampling

Variational autoencoders (VAEs) and generative adversarial networks (GANs) enjoy an intuitive connection to manifold learning: in training the decoder/generator is optimized to approximate a homeomorphism between the data distribution and the sampling space. This is a construction that strives to define the data manifold. A major obstacle to VAEs and GANs, however, is choosing a suitable prior that matches the data topology. Well-known consequences of poorly picked priors are posterior and mode collapse. To our knowledge, no existing method sidesteps this user choice. Conversely, $\textit{diffusion maps}$ automatically infer the data topology and enjoy a rigorous connection to manifold learning, but do not scale easily or provide the inverse homeomorphism (i.e. decoder/generator). We propose a method that combines these approaches into a generative model that inherits the asymptotic guarantees of $\textit{diffusion maps}$ while preserving the scalability of deep models. We prove approximation theoretic results for the dimension dependence of our proposed method. Finally, we demonstrate the effectiveness of our method with various real and synthetic datasets.

Preprint, 2016, arXiv

A Note on Markov Normalized Magnetic Eigenmaps

We note that building a magnetic Laplacian from the Markov transition matrix, rather than the graph adjacency matrix, yields several benefits for the magnetic eigenmaps algorithm. The two largest benefits are that the embedding becomes more stable as a function of the rotation parameter g, and the principal eigenvector of the magnetic Laplacian now converges to the PageRank of the network as a function of diffusion time. We show empirically that this normalization improves the phase and real/imaginary embeddings of the low-frequency eigenvectors of the magnetic Laplacian.
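One plausible reading of the construction is to apply the usual magnetic Laplacian recipe (symmetrized weights multiplied by a direction-dependent complex phase) to the row-normalized transition matrix in place of the adjacency matrix. The sketch below follows that reading; the exact normalization and the sign convention of the phase are assumptions, and the note itself may define them differently.

```python
import cmath
import math


def markov_magnetic_laplacian(adjacency, g):
    """Magnetic Laplacian built from P = D^{-1} A instead of A (assumed form).
    Every row of `adjacency` must have a nonzero sum."""
    n = len(adjacency)
    # Row-normalize the adjacency matrix into a Markov transition matrix P.
    P = [[a / sum(row) for a in row] for row in adjacency]
    L = [[0j] * n for _ in range(n)]
    for i in range(n):
        degree = 0.0
        for j in range(n):
            sym = 0.5 * (P[i][j] + P[j][i])                  # symmetrized weight
            theta = 2.0 * math.pi * g * (P[i][j] - P[j][i])  # directional phase
            L[i][j] = -sym * cmath.exp(1j * theta)
            degree += sym
        L[i][i] += degree  # add the symmetrized degree on the diagonal
    return L


# Small directed graph: a 3-cycle plus one extra edge from node 2 to node 1.
A = [[0, 1, 0],
     [0, 0, 1],
     [1, 1, 0]]
L = markov_magnetic_laplacian(A, g=0.25)
print(L[0][1], L[1][0])  # complex conjugates of each other
```

The matrix is Hermitian by construction, so its eigenvalues are real and its eigenvectors carry the phase information used by the embedding.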

Preprint, 2016, arXiv

On Suprema of Autoconvolutions with an Application to Sidon sets

Let $f$ be a nonnegative function supported on $(-1/4, 1/4)$. We show $$ \sup_{x \in \mathbb{R}}{\int_{\mathbb{R}}{f(t)f(x-t)dt}} \geq 1.28\left(\int_{-1/4}^{1/4}{f(x)dx} \right)^2,$$ where 1.28 improves on a series of earlier results. The inequality arises naturally in additive combinatorics in the study of Sidon sets. We derive a relaxation of the problem that reduces to a finite number of cases and yields slightly stronger results. Our approach should be able to prove lower bounds that are arbitrarily close to the sharp result. Currently, the bottleneck in our approach is runtime: new ideas might be able to significantly speed up the computation.
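The inequality can be sanity-checked numerically for a particular $f$. The sketch below is only a Riemann-sum check, not the relaxation used in the paper; the grid size and function name are illustrative. For the indicator function of $(-1/4, 1/4)$ the autoconvolution is $\max(0, 1/2 - |x|)$, so the ratio of the supremum to the squared integral is $(1/2)/(1/4) = 2$, comfortably above 1.28.

```python
def autoconvolution_ratio(values, width=0.5):
    """`values` are samples of f at the midpoints of n equal cells covering
    (-1/4, 1/4). Returns sup_x (f*f)(x) / (integral of f)^2 via Riemann sums."""
    n = len(values)
    h = width / n
    integral = h * sum(values)
    best = 0.0
    for m in range(2 * n - 1):  # each m corresponds to one shift x
        lo, hi = max(0, m - n + 1), min(n - 1, m)
        conv = h * sum(values[k] * values[m - k] for k in range(lo, hi + 1))
        best = max(best, conv)
    return best / integral ** 2


ratio_indicator = autoconvolution_ratio([1.0] * 200)
print(ratio_indicator)  # approximately 2 for the indicator function
```

Other nonnegative profiles can be passed as `values` to see how close a given $f$ comes to the constant in the bound.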

Preprint, 2016, arXiv

Spectral Echolocation via the Wave Embedding

Spectral embedding uses eigenfunctions of the discrete Laplacian on a weighted graph to obtain coordinates for an embedding of an abstract data set into Euclidean space. We propose a new pre-processing step of first using the eigenfunctions to simulate a low-frequency wave moving over the data and using both position as well as change in time of the wave to obtain a refined metric to which classical methods of dimensionality reduction can then be applied. This is motivated by the behavior of waves, symmetries of the wave equation and the hunting technique of bats. It is shown to be effective in practice and also works for other partial differential equations -- the method yields improved results even for the classical heat equation.

Preprint, 2015, arXiv

Bigeometric Organization of Deep Nets

In this paper, we build an organization of high-dimensional datasets that cannot be cleanly embedded into a low-dimensional representation due to missing entries and a subset of the features being irrelevant to modeling functions of interest. Our algorithm begins by defining coarse neighborhoods of the points and defining an expected empirical function value on these neighborhoods. We then generate new non-linear features with deep net representations tuned to model the approximate function, and re-organize the geometry of the points with respect to the new representation. Finally, the points are locally z-scored to create an intrinsic geometric organization which is independent of the parameters of the deep net, a geometry designed to assure smoothness with respect to the empirical function. We examine this approach on data from the Center for Medicare and Medicaid Services Hospital Quality Initiative, and generate an intrinsic low-dimensional organization of the hospitals that is smooth with respect to an expert driven function of quality.

Preprint, 2015, arXiv

Diffusion Nets

Non-linear manifold learning enables high-dimensional data analysis, but requires out-of-sample-extension methods to process new data points. In this paper, we propose a manifold learning algorithm based on deep learning to create an encoder, which maps a high-dimensional dataset to its low-dimensional embedding, and a decoder, which takes the embedded data back to the high-dimensional space. Stacking the encoder and decoder together constructs an autoencoder, which we term a diffusion net, that performs out-of-sample-extension as well as outlier detection. We introduce new neural net constraints for the encoder, which preserve the local geometry of the points, and we prove rates of convergence for the encoder. Also, our approach is efficient in both computational complexity and memory requirements, as opposed to previous methods that require storage of all training points in both the high-dimensional and the low-dimensional spaces to calculate the out-of-sample-extension and the pre-image.
