Source author record

Ryan Murray

Ryan Murray appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.AP, Machine Learning, math.ST, Statistics Theory, math.OC, Computational Geometry, cond-mat.mtrl-sci, Cryptography and Security, math.SP, Multiagent Systems

Catalog footprint

What is connected

11 works

10 topics

4 close collaborators

preprint, 2022, arXiv

Adversarial Classification: Necessary conditions and geometric flows

We study a version of adversarial classification where an adversary is empowered to corrupt data inputs up to some distance $\varepsilon$, using tools from variational analysis. In particular, we describe necessary conditions associated with the optimal classifier subject to such an adversary. Using the necessary conditions, we derive a geometric evolution equation which can be used to track the change in classification boundaries as $\varepsilon$ varies. This evolution equation may be described as an uncoupled system of differential equations in one dimension, or as a mean curvature type equation in higher dimension. In one dimension, and under mild assumptions on the data distribution, we rigorously prove that one can use the initial value problem starting from $\varepsilon=0$, which is simply the Bayes classifier, in order to solve for the global minimizer of the adversarial problem for small values of $\varepsilon$. In higher dimensions we provide a similar result, albeit conditional on the existence of regular solutions of the initial value problem. In the process of proving our main results we obtain a result of independent interest connecting the original adversarial problem with an optimal transport problem under no assumptions on whether classes are balanced or not. Numerical examples illustrating these ideas are also presented.

preprint, 2022, arXiv

Distributed Stochastic Gradient Descent: Nonconvexity, Nonsmoothness, and Convergence to Local Minima

In centralized settings, it is well known that stochastic gradient descent (SGD) avoids saddle points and converges to local minima in nonconvex problems. However, similar guarantees are lacking for distributed first-order algorithms. The paper studies distributed stochastic gradient descent (D-SGD), a simple network-based implementation of SGD. Conditions under which D-SGD avoids saddle points and converges to local minima are studied. First, we consider the problem of computing critical points. Assuming loss functions are nonconvex and possibly nonsmooth, it is shown that, for each fixed initialization, D-SGD converges to critical points of the loss with probability one. Next, we consider the problem of avoiding saddle points. In this case, we again assume that loss functions may be nonconvex and nonsmooth, but are smooth in a neighborhood of a saddle point. It is shown that, for any fixed initialization, D-SGD avoids such saddle points with probability one. Results are proved by studying the underlying (distributed) gradient flow, using the ordinary differential equation (ODE) method of stochastic approximation, and extending classical techniques from dynamical systems theory such as stable manifolds. Results are proved in the general context of subspace-constrained optimization, of which D-SGD is a special case.
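As a rough illustration of the kind of iteration this abstract describes, the sketch below runs a generic consensus-plus-gradient form of D-SGD on a toy problem. The function name `d_sgd`, the quadratic local losses, the Gaussian gradient noise, and the 1/(k+1) step size are all assumptions made for this example and are not taken from the paper.

```python
import random


def d_sgd(targets, weights, steps=5000, noise_std=0.5, seed=0):
    """Toy D-SGD: agent i holds the scalar loss f_i(x) = 0.5 * (x - targets[i])**2.

    weights is a doubly stochastic mixing matrix describing the network.
    All modelling choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = len(targets)
    x = [0.0] * n  # one scalar iterate per agent
    for k in range(steps):
        alpha = 1.0 / (k + 1)  # diminishing step size (assumed schedule)
        # Consensus step: each agent averages with its neighbours.
        mixed = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Local stochastic gradient step, evaluated at the agent's own iterate.
        x = [
            mixed[i] - alpha * ((x[i] - targets[i]) + rng.gauss(0.0, noise_std))
            for i in range(n)
        ]
    return x
```

With a four-agent ring (self-weight 0.5, neighbour weight 0.25) and targets 1, 2, 3, 6, every agent should end up close to 3, the minimizer of the summed loss. The example is convex, so it only shows the consensus mechanics and says nothing about saddle points.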

preprint, 2022, arXiv

Eikonal depth: an optimal control approach to statistical depths

Statistical depths provide a fundamental generalization of quantiles and medians to data in higher dimensions. This paper proposes a new type of globally defined statistical depth, based upon control theory and eikonal equations, which measures the smallest amount of probability density that has to be passed through in a path to points outside the support of the distribution: for example, spatial infinity. This depth is easy to interpret and compute, expressively captures multi-modal behavior, and extends naturally to data that is non-Euclidean. We prove various properties of this depth, and provide discussion of computational considerations. In particular, we demonstrate that this notion of depth is robust under an approximate isometrically constrained adversarial model, a property which is not enjoyed by the Tukey depth. Finally we give some illustrative examples in the context of two-dimensional mixture models and MNIST.
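The verbal definition above (the least density accumulated along a path that leaves the support) can be approximated on a grid with a shortest-path search. The sketch below is a Dijkstra-based approximation under stated assumptions: the function name `eikonal_depth_grid`, the four-neighbour grid, and the trapezoidal edge cost are choices made for this example, and the paper's own solver and exact cost function may differ. The grid must contain at least one zero-density cell, which plays the role of "outside the support".

```python
import heapq


def eikonal_depth_grid(density, h=1.0):
    """Approximate depth on a 2D grid of density values with spacing h.

    Cells with zero density are treated as lying outside the support and
    get depth 0. Every other cell gets the cheapest accumulated density
    along a grid path to such a cell (an illustrative discretization).
    """
    rows, cols = len(density), len(density[0])
    depth = [[float("inf")] * cols for _ in range(rows)]
    heap = []
    for r in range(rows):
        for c in range(cols):
            if density[r][c] == 0.0:
                depth[r][c] = 0.0
                heap.append((0.0, r, c))
    heapq.heapify(heap)
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > depth[r][c]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Trapezoidal estimate of the density integrated along the edge.
                cost = 0.5 * (density[r][c] + density[nr][nc]) * h
                if d + cost < depth[nr][nc]:
                    depth[nr][nc] = d + cost
                    heapq.heappush(heap, (d + cost, nr, nc))
    return depth
```

On a 5-by-5 grid with a zero border, interior density 1, and a central peak of 2, the centre cell is the deepest point and the border cells have depth 0.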

preprint, 2022, arXiv

From graph cuts to isoperimetric inequalities: Convergence rates of Cheeger cuts on data clouds

In this work we study statistical properties of graph-based clustering algorithms that rely on the optimization of balanced graph cuts, the main example being the optimization of Cheeger cuts. We consider proximity graphs built from data sampled from an underlying distribution supported on a generic smooth compact manifold $M$. In this setting, we obtain high probability convergence rates for both the Cheeger constant and the associated Cheeger cuts towards their continuum counterparts. The key technical tools are careful estimates of interpolation operators which lift empirical Cheeger cuts to the continuum, as well as continuum stability estimates for isoperimetric problems. To our knowledge the quantitative estimates obtained here are the first of their kind.
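For readers unfamiliar with balanced graph cuts, the sketch below computes a Cheeger cut by exhaustive search on a tiny weighted graph. It uses one common normalization, cut weight divided by the smaller of the two volumes; the paper may use a different balance term. The function name `cheeger_cut` is hypothetical, and the search is exponential in the number of vertices, so it is only meant for toy examples.

```python
from itertools import combinations


def cheeger_cut(weights):
    """Brute-force Cheeger cut of a small undirected weighted graph.

    weights is a symmetric matrix with zero diagonal. Returns the pair
    (cut(S, S^c) / min(vol(S), vol(S^c)), S) for the best subset S found.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    total = sum(degree)
    best_value, best_set = float("inf"), None
    # Subsets of size up to n // 2 cover every partition of the vertices.
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            inside = set(subset)
            cut = sum(
                weights[i][j] for i in inside for j in range(n) if j not in inside
            )
            vol = sum(degree[i] for i in inside)
            denom = min(vol, total - vol)
            if denom == 0:
                continue  # skip sides made only of isolated vertices
            value = cut / denom
            if value < best_value:
                best_value, best_set = value, frozenset(inside)
    return best_value, best_set
```

On two unit-weight triangles joined by a single edge, the best cut severs the bridge: one edge is cut and each side has volume 7, giving the value 1/7.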

preprint, 2020, arXiv

A maximum principle argument for the uniform convergence of graph Laplacian regressors

This paper investigates the use of methods from partial differential equations and the calculus of variations to study learning problems that are regularized using graph Laplacians. Graph Laplacians are a powerful, flexible method for capturing local and global geometry in many classes of learning problems, and the techniques developed in this paper help to broaden the methodology of studying such problems. In particular, we develop the use of maximum principle arguments to establish asymptotic consistency guarantees within the context of noise corrupted, non-parametric regression with samples living on an unknown manifold embedded in $\mathbb{R}^d$. The maximum principle arguments provide a new technical tool which informs parameter selection by giving concrete error estimates in terms of various regularization parameters. A review of learning algorithms which utilize graph Laplacians, as well as previous developments in the use of differential equation and variational techniques to study those algorithms, is given. In addition, new connections are drawn between Laplacian methods and other machine learning techniques, such as kernel regression and k-nearest neighbor methods.
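A minimal sketch of graph Laplacian regularized regression may help make the setting concrete. It assumes the generic objective sum_i (u_i - y_i)^2 + lam * sum_{i<j} w_ij (u_i - u_j)^2, whose minimizer solves (I + lam * L) u = y, and solves that system by Gauss-Seidel sweeps. The function name `laplacian_regression`, the objective, and the solver are assumptions for illustration; the paper's precise functional and scaling are not reproduced here.

```python
def laplacian_regression(weights, y, lam=1.0, sweeps=500):
    """Solve (I + lam * L) u = y by Gauss-Seidel, with L the graph Laplacian.

    weights is a symmetric, nonnegative matrix with zero diagonal.
    Each update writes u[i] as a weighted average of y[i] and the current
    neighbouring values, so u never leaves the interval [min(y), max(y)]:
    a discrete maximum principle.
    """
    n = len(y)
    u = list(y)  # start from the noisy labels
    for _ in range(sweeps):
        for i in range(n):
            deg = sum(weights[i][j] for j in range(n) if j != i)
            nb = sum(weights[i][j] * u[j] for j in range(n) if j != i)
            u[i] = (y[i] + lam * nb) / (1.0 + lam * deg)
    return u
```

On a five-node path graph with labels 0, 1, 0.2, 0.9, 0.1, the solution is a smoothed version of the labels that stays strictly inside their range.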

preprint, 2020, arXiv

Distributed Gradient Flow: Nonsmoothness, Nonconvexity, and Saddle Point Evasion

The paper considers distributed gradient flow (DGF) for multi-agent nonconvex optimization. DGF is a continuous-time approximation of distributed gradient descent that is often easier to study than its discrete-time counterpart. The paper has two main contributions. First, the paper considers optimization of nonsmooth, nonconvex objective functions. It is shown that DGF converges to critical points in this setting. The paper then considers the problem of avoiding saddle points. It is shown that if agents' objective functions are assumed to be smooth and nonconvex, then DGF can only converge to a saddle point from a zero-measure set of initial conditions. To establish this result, the paper proves a stable manifold theorem for DGF, which is a fundamental contribution of independent interest. In a companion paper, analogous results are derived for discrete-time algorithms.
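The zero-measure statement can be seen in a much simpler setting than the paper's. The sketch below integrates plain gradient flow, which can be viewed as the single-agent case, with forward Euler on the toy objective f(x, y) = (x^2 - 1)^2 / 4 + y^2 / 2. This objective has a saddle at the origin and minima at (1, 0) and (-1, 0). The objective, the function name `gradient_flow`, and the Euler discretization are assumptions for illustration and do not come from the paper.

```python
def gradient_flow(x0, y0, dt=0.01, steps=5000):
    """Forward-Euler integration of the flow z' = -grad f(z).

    f(x, y) = (x**2 - 1)**2 / 4 + y**2 / 2 (assumed toy objective), so
    grad f = (x**3 - x, y). The saddle at (0, 0) attracts only the
    line x = 0, which has measure zero in the plane.
    """
    x, y = x0, y0
    for _ in range(steps):
        gx = x ** 3 - x
        gy = y
        x, y = x - dt * gx, y - dt * gy
    return x, y
```

Starting exactly on the line x = 0 the trajectory slides into the saddle, while a start at x = 1e-6 escapes and settles at the minimum (1, 0).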

preprint, 2016, arXiv

A new analytical approach to consistency and overfitting in regularized empirical risk minimization

This work considers the problem of binary classification: given training data $x_1, \dots, x_n$ from a certain population, together with associated labels $y_1,\dots, y_n \in \left\{0,1 \right\}$, determine the best label for an element $x$ not among the training data. More specifically, this work considers a variant of the regularized empirical risk functional which is defined intrinsically to the observed data and does not depend on the underlying population. Tools from modern analysis are used to obtain a concise proof of asymptotic consistency as regularization parameters are taken to zero at rates related to the size of the sample. These analytical tools give a new framework for understanding overfitting and underfitting, and rigorously connect the notion of overfitting with a loss of compactness.

preprint, 2016, arXiv

Cutoff estimates for the Becker-Döring equations

This paper continues the authors' previous study (SIAM J. Math. Anal., 2016) of the trend toward equilibrium of the Becker-Döring equations with subcritical mass, by characterizing certain fine properties of solutions to the linearized equation. In particular, we partially characterize the spectrum of the linearized operator, showing that it contains the entire imaginary axis in polynomially weighted spaces. Moreover, we prove detailed cutoff estimates that establish upper and lower bounds on the lifetime of a class of perturbations to equilibrium.

preprint, 2016, arXiv

Trapping of solute atoms at grain boundaries in GdNi2

Lattice locations of $^{111}$In impurity probe atoms in intermetallic GdNi2 were studied as a function of alloy composition and temperature using perturbed angular correlation spectroscopy (PAC). Three nuclear quadrupole interaction signals were detected and their equilibrium site fractions were measured up to 700 °C. Two signals have well-defined electric field gradients (EFGs) and are attributed to In-probes on Gd- and Ni-sites in a well-ordered lattice. A third, inhomogeneously broadened signal was observed at low temperature. This is attributed to trapping, or segregation, of In-probes to lattice sinks such as grain boundaries (GB) that have a large multiplicity of local environments and EFGs. Changes in site fractions were reversible above 300 °C. Measurements were made on a pair of samples that were richer and poorer in Gd. Remarkably, the GB-site was populated only in the more Gd-rich sample. This is explained by the hypothesis that excess Gd segregates to the grain boundaries and provides a lower enthalpy environment for In-probe atoms. Observations are discussed in relation to a three-level quantum system. Enthalpy differences between levels were determined from measurements of temperature dependences of ratios of site fractions. The enthalpy of transfer of In-probes from the Gd- to Ni-sublattice was found to be much smaller in the Gd-rich sample. This is attributed to a large temperature-dependence in the degeneracies of levels available to In-solutes in the phase, leading to an effective transfer enthalpy that differs greatly from the difference in site-enthalpies. A possible scenario is discussed. Different segregation enthalpies were measured for In-solute transferring from GB sites to Gd- and Ni-sites, whereas only an average value can be determined through macroscopic measurements.

preprint, 2015, arXiv

Second-Order $Γ$-limit for the Cahn-Hilliard Functional

The goal of this paper is to solve a long-standing open problem, namely, the asymptotic development of order $2$ by $Γ$-convergence of the mass-constrained Cahn-Hilliard functional. This is achieved by introducing a novel rearrangement technique, which works without Dirichlet boundary conditions.

preprint, 2015, arXiv

Slow motion for the nonlocal Allen-Cahn equation in n-dimensions

The goal of this paper is to study the slow motion of solutions of the nonlocal Allen-Cahn equation in a bounded domain $Ω\subset \mathbb{R}^n$, for $n > 1$. The initial data is assumed to be close to a configuration whose interface separating the states minimizes the surface area (or perimeter); both local and global perimeter minimizers are taken into account. The evolution of interfaces on a time scale $\varepsilon^{-1}$ is deduced, where $\varepsilon$ is the interaction length parameter. The key tool is a second-order $Γ$-convergence analysis of the energy functional, which provides sharp energy estimates. New regularity results are derived for the isoperimetric function of a domain. Slow motion of solutions for the Cahn-Hilliard equation starting close to global perimeter minimizers is proved as well.
