Source author record

Mauro Maggioni

Mauro Maggioni appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Catalog footprint

What is connected

20works

22topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Conditional regression for single-index models

The single-index model is a statistical model for intrinsic regression where responses are assumed to depend on a single yet unknown linear combination of the predictors, allowing to express the regression function as $ \mathbb{E} [ Y | X ] = f ( \langle v , X \rangle ) $ for some unknown \emph{index} vector $v$ and \emph{link} function $f$. Conditional methods provide a simple and effective approach to estimate $v$ by averaging moments of $X$ conditioned on $Y$, but depend on parameters whose optimal choice is unknown and do not provide generalization bounds on $f$. In this paper we propose a new conditional method converging at $\sqrt{n}$ rate under an explicit parameter characterization. Moreover, we prove that polynomial partitioning estimates achieve the $1$-dimensional min-max rate for regression of Hölder functions when combined to any $\sqrt{n}$-convergent index estimator. Overall this yields an estimator for dimension reduction and regression of single-index models that attains statistical optimality in quasilinear time.

preprint2022arXiv

Learning Interaction Variables and Kernels from Observations of Agent-Based Systems

Dynamical systems across many disciplines are modeled as interacting particles or agents, with interaction rules that depend on a very small number of variables (e.g. pairwise distances, pairwise differences of phases, etc...), functions of the state of pairs of agents. Yet, these interaction rules can generate self-organized dynamics, with complex emergent behaviors (clustering, flocking, swarming, etc.). We propose a learning technique that, given observations of states and velocities along trajectories of the agents, yields both the variables upon which the interaction kernel depends and the interaction kernel itself, in a nonparametric fashion. This yields an effective dimension reduction which avoids the curse of dimensionality from the high-dimensional observation data (states and velocities of all the agents). We demonstrate the learning capability of our method to a variety of first-order interacting systems.

preprint2022arXiv

Unsupervised learning of observation functions in state-space models by nonparametric moment methods

We investigate the unsupervised learning of non-invertible observation functions in nonlinear state-space models. Assuming abundant data of the observation process along with the distribution of the state process, we introduce a nonparametric generalized moment method to estimate the observation function via constrained regression. The major challenge comes from the non-invertibility of the observation function and the lack of data pairs between the state and observation. We address the fundamental issue of identifiability from quadratic loss functionals and show that the function space of identifiability is the closure of a RKHS that is intrinsic to the state process. Numerical results show that the first two moments and temporal correlations, along with upper and lower bounds, can identify functions ranging from piecewise polynomials to smooth functions, leading to convergent estimators. The limitations of this method, such as non-identifiability due to symmetry and stationarity, are also discussed.

preprint2021arXiv

Anatomically-Informed Deep Learning on Contrast-Enhanced Cardiac MRI for Scar Segmentation and Clinical Feature Extraction

Visualizing disease-induced scarring and fibrosis in the heart on cardiac magnetic resonance (CMR) imaging with contrast enhancement (LGE) is paramount in characterizing disease progression and quantifying pathophysiological substrates of arrhythmias. However, segmentation and scar/fibrosis identification from LGE-CMR is an intensive manual process prone to large inter-observer variability. Here, we present a novel fully-automated anatomically-informed deep learning solution for left ventricle (LV) and scar/fibrosis segmentation and clinical feature extraction from LGE-CMR. The technology involves three cascading convolutional neural networks that segment myocardium and scar/fibrosis from raw LGE-CMR images and constrain these segmentations within anatomical guidelines, thus facilitating seamless derivation of clinically-significant parameters. In addition to available LGE-CMR images, training used "LGE-like" synthetically enhanced cine scans. Results show excellent agreement with those of trained experts in terms of segmentation (balanced accuracy of $96\%$ and $75\%$ for LV and scar segmentation), clinical features ($2\%$ difference in mean scar-to-LV wall volume fraction), and anatomical fidelity. Our segmentation technology is extendable to other computer vision medical applications and to problems requiring guidelines adherence of predicted outputs.

preprint2021arXiv

Learning Interaction Kernels for Agent Systems on Riemannian Manifolds

Interacting agent and particle systems are extensively used to model complex phenomena in science and engineering. We consider the problem of learning interaction kernels in these dynamical systems constrained to evolve on Riemannian manifolds from given trajectory data. The models we consider are based on interaction kernels depending on pairwise Riemannian distances between agents, with agents interacting locally along the direction of the shortest geodesic connecting them. We show that our estimators converge at a rate that is independent of the dimension of the state space, and derive bounds on the trajectory estimation error, on the manifold, between the observed and estimated dynamics. We demonstrate the performance of our estimator on two classical first order interacting systems: Opinion Dynamics and a Predator-Swarm system, with each system constrained on two prototypical manifolds, the $2$-dimensional sphere and the Poincaré disk model of hyperbolic space.

preprint2021arXiv

Multiscale regression on unknown manifolds

We consider the regression problem of estimating functions on $\mathbb{R}^D$ but supported on a $d$-dimensional manifold $ \mathcal{M} \subset \mathbb{R}^D $ with $ d \ll D $. Drawing ideas from multi-resolution analysis and nonlinear approximation, we construct low-dimensional coordinates on $\mathcal{M}$ at multiple scales, and perform multiscale regression by local polynomial fitting. We propose a data-driven wavelet thresholding scheme that automatically adapts to the unknown regularity of the function, allowing for efficient estimation of functions exhibiting nonuniform regularity at different locations and scales. We analyze the generalization error of our method by proving finite sample bounds in high probability on rich classes of priors. Our estimator attains optimal learning rates (up to logarithmic factors) as if the function was defined on a known Euclidean domain of dimension $d$, instead of an unknown manifold embedded in $\mathbb{R}^D$. The implemented algorithm has quasilinear complexity in the sample size, with constants linear in $D$ and exponential in $d$. Our work therefore establishes a new framework for regression on low-dimensional sets embedded in high dimensions, with fast implementation and strong theoretical guarantees.

preprint2021arXiv

Supervised Dimensionality Reduction for Big Data

To solve key biomedical problems, experimentalists now routinely measure millions or billions of features (dimensions) per sample, with the hope that data science techniques will be able to build accurate data-driven inferences. Because sample sizes are typically orders of magnitude smaller than the dimensionality of these data, valid inferences require finding a low-dimensional representation that preserves the discriminating information (e.g., whether the individual suffers from a particular disease). There is a lack of interpretable supervised dimensionality reduction methods that scale to millions of dimensions with strong statistical theoretical guarantees.We introduce an approach, XOX, to extending principal components analysis by incorporating class-conditional moment estimates into the low-dimensional projection. The simplest ver-sion, "Linear Optimal Low-rank" projection (LOL), incorporates the class-conditional means. We prove, and substantiate with both synthetic and real data benchmarks, that LOL and its generalizations in the XOX framework lead to improved data representations for subsequent classification, while maintaining computational efficiency and scalability. Using multiple brain imaging datasets consisting of >150 million features, and several genomics datasets with>500,000 features, LOL outperforms other scalable linear dimensionality reduction techniques in terms of accuracy, while only requiring a few minutes on a standard desktop computer.

preprint2020arXiv

Data-driven Discovery of Emergent Behaviors in Collective Dynamics

Particle- and agent-based systems are a ubiquitous modeling tool in many disciplines. We consider the fundamental problem of inferring interaction kernels from observations of agent-based dynamical systems given observations of trajectories, in particular for collective dynamical systems exhibiting emergent behaviors with complicated interaction kernels, in a nonparametric fashion, and for kernels which are parametrized by a single unknown parameter. We extend the estimators introduced in \cite{PNASLU}, which are based on suitably regularized least squares estimators, to these larger classes of systems. We provide extensive numerical evidence that the estimators provide faithful approximations to the interaction kernels, and provide accurate predictions for trajectories started at new initial conditions, both throughout the ``training'' time interval in which the observations were made, and often much beyond. We demonstrate these features on prototypical systems displaying collective behaviors, ranging from opinion dynamics, flocking dynamics, self-propelling particle dynamics, synchronized oscillator dynamics, and a gravitational system. Our experiments also suggest that our estimated systems can display the same emergent behaviors of the observed systems, that occur at larger timescales than those used in the training data. Finally, in the case of families of systems governed by a parameterized family of interaction kernels, we introduce novel estimators that estimate the parameterized family of kernels, splitting it into a common interaction kernel and the action of parameters. We demonstrate this in the case of gravity, by learning both the ``common component'' $1/r^2$ and the dependency on mass, without any a priori knowledge of either one, from observations of planetary motions in our solar system.

preprint2020arXiv

Learning interaction kernels in heterogeneous systems of agents from multiple trajectories

Systems of interacting particles or agents have wide applications in many disciplines such as Physics, Chemistry, Biology and Economics. These systems are governed by interaction laws, which are often unknown: estimating them from observation data is a fundamental task that can provide meaningful insights and accurate predictions of the behaviour of the agents. In this paper, we consider the inverse problem of learning interaction laws given data from multiple trajectories, in a nonparametric fashion, when the interaction kernels depend on pairwise distances. We establish a condition for learnability of interaction kernels, and construct estimators that are guaranteed to converge in a suitable $L^2$ space, at the optimal min-max rate for 1-dimensional nonparametric regression. We propose an efficient learning algorithm based on least squares, which can be implemented in parallel for multiple trajectories and is therefore well-suited for the high dimensional, big data regime. Numerical simulations on a variety examples, including opinion dynamics, predator-swarm dynamics and heterogeneous particle dynamics, suggest that the learnability condition is satisfied in models used in practice, and the rate of convergence of our estimator is consistent with the theory. These simulations also suggest that our estimators are robust to noise in the observations, and produce accurate predictions of dynamics in relative large time intervals, even when they are learned from data collected in short time intervals.

preprint2020arXiv

Learning interaction kernels in stochastic systems of interacting particles from multiple trajectories

We consider stochastic systems of interacting particles or agents, with dynamics determined by an interaction kernel which only depends on pairwise distances. We study the problem of inferring this interaction kernel from observations of the positions of the particles, in either continuous or discrete time, along multiple independent trajectories. We introduce a nonparametric inference approach to this inverse problem, based on a regularized maximum likelihood estimator constrained to suitable hypothesis spaces adaptive to data. We show that a coercivity condition enables us to control the condition number of this problem and prove the consistency of our estimator, and that in fact it converges at a near-optimal learning rate, equal to the min-max rate of $1$-dimensional non-parametric regression. In particular, this rate is independent of the dimension of the state space, which is typically very high. We also analyze the discretization errors in the case of discrete-time observations, showing that it is of order $1/2$ in terms of the time gaps between observations. This term, when large, dominates the sampling error and the approximation error, preventing convergence of the estimator. Finally, we exhibit an efficient parallel algorithm to construct the estimator from data, and we demonstrate the effectiveness of our algorithm with numerical tests on prototype systems including stochastic opinion dynamics and a Lennard-Jones model.

preprint2020arXiv

On the identifiability of interaction functions in systems of interacting particles

We address a fundamental issue in the nonparametric inference for systems of interacting particles: the identifiability of the interaction functions. We prove that the interaction functions are identifiable for a class of first-order stochastic systems, including linear systems with general initial laws and nonlinear systems with stationary distributions. We show that a coercivity condition is sufficient for identifiability and becomes necessary when the number of particles approaches infinity. The coercivity is equivalent to the strict positivity of related integral operators, which we prove by showing that their integral kernels are strictly positive definite by using Müntz type theorems.

preprint2019arXiv

Nonparametric inference of interaction laws in systems of agents from trajectory data

Inferring the laws of interaction between particles and agents in complex dynamical systems from observational data is a fundamental challenge in a wide variety of disciplines. We propose a non-parametric statistical learning approach to estimate the governing laws of distance-based interactions, with no reference or assumption about their analytical form, from data consisting trajectories of interacting agents. We demonstrate the effectiveness of our learning approach both by providing theoretical guarantees, and by testing the approach on a variety of prototypical systems in various disciplines. These systems include homogeneous and heterogeneous agents systems, ranging from particle systems in fundamental physics to agent-based systems modeling opinion dynamics under the social influence, prey-predator dynamics, flocking and swarming, and phototaxis in cell dynamics.

preprint2019arXiv

Spectral-Spatial Diffusion Geometry for Hyperspectral Image Clustering

An unsupervised learning algorithm to cluster hyperspectral image (HSI) data is proposed that exploits spatially-regularized random walks. Markov diffusions are defined on the space of HSI spectra with transitions constrained to near spatial neighbors. The explicit incorporation of spatial regularity into the diffusion construction leads to smoother random processes that are more adapted for unsupervised machine learning than those based on spectra alone. The regularized diffusion process is subsequently used to embed the high-dimensional HSI into a lower dimensional space through diffusion distances. Cluster modes are computed using density estimation and diffusion distances, and all other points are labeled according to these modes. The proposed method has low computational complexity and performs competitively against state-of-the-art HSI clustering algorithms on real data. In particular, the proposed spatial regularization confers an empirical advantage over non-regularized methods.

preprint2016arXiv

High Dimensional Data Modeling Techniques for Detection of Chemical Plumes and Anomalies in Hyperspectral Images and Movies

We briefly review recent progress in techniques for modeling and analyzing hyperspectral images and movies, in particular for detecting plumes of both known and unknown chemicals. For detecting chemicals of known spectrum, we extend the technique of using a single subspace for modeling the background to a "mixture of subspaces" model to tackle more complicated background. Furthermore, we use partial least squares regression on a resampled training set to boost performance. For the detection of unknown chemicals we view the problem as an anomaly detection problem, and use novel estimators with low-sampled complexity for intrinsically low-dimensional data in high-dimensions that enable us to model the "normal" spectra and detect anomalies. We apply these algorithms to benchmark data sets made available by the Automated Target Detection program co-funded by NSF, DTRA and NGA, and compare, when applicable, to current state-of-the-art algorithms, with favorable results.

preprint2016arXiv

Inferring Interaction Rules From Observations of Evolutive Systems I: The Variational Approach

In this paper we are concerned with the learnability of nonlocal interaction kernels for first order systems modeling certain social interactions, from observations of realizations of their dynamics. This paper is the first of a series on learnability of nonlocal interaction kernels and presents a variational approach to the problem. In particular, we assume here that the kernel to be learned is bounded and locally Lipschitz continuous and that the initial conditions of the systems are drawn identically and independently at random according to a given initial probability distribution. Then the minimization over a rather arbitrary sequence of (finite dimensional) subspaces of a least square functional measuring the discrepancy from observed trajectories produces uniform approximations to the kernel on compact sets. The convergence result is obtained by combining mean-field limits, transport methods, and a $Γ$-convergence argument. A crucial condition for the learnability is a certain coercivity property of the least square functional, majoring an $L_2$-norm discrepancy to the kernel with respect to a probability measure, depending on the given initial probability distribution by suitable push forwards and transport maps. We illustrate the convergence result by means of several numerical experiments.

preprint2015arXiv

ATLAS: A geometric approach to learning high-dimensional stochastic systems near manifolds

When simulating multiscale stochastic differential equations (SDEs) in high-dimensions, separation of timescales, stochastic noise and high-dimensionality can make simulations prohibitively expensive. The computational cost is dictated by microscale properties and interactions of many variables, while the behavior of interest often occurs at the macroscale level and at large time scales, often characterized by few important, but unknown, degrees of freedom. For many problems bridging the gap between the microscale and macroscale by direct simulation is computationally infeasible. In this work we propose a novel approach to automatically learn a reduced model with an associated fast macroscale simulator. Our unsupervised learning algorithm uses short parallelizable microscale simulations to learn provably accurate macroscale SDE models, which are continuous in space and time. The learning algorithm takes as input: the microscale simulator, a local distance function, and a homogenization spatial or temporal scale, which is the smallest time scale of interest in the reduced system. The learned macroscale model can then be used for fast computation and storage of long simulations. We prove guarantees that related the number of short paths requested from the microscale simulator to the accuracy of the learned macroscale simulator. We discuss various examples, both low- and high-dimensional, as well as results about the accuracy of the fast simulators we construct, and its dependency on the number of short paths requested from the microscale simulator.

preprint2015arXiv

Multiscale Dictionary Learning: Non-Asymptotic Bounds and Robustness

High-dimensional datasets are well-approximated by low-dimensional structures. Over the past decade, this empirical observation motivated the investigation of detection, measurement, and modeling techniques to exploit these low-dimensional intrinsic structures, yielding numerous implications for high-dimensional statistics, machine learning, and signal processing. Manifold learning (where the low-dimensional structure is a manifold) and dictionary learning (where the low-dimensional structure is the set of sparse linear combinations of vectors from a finite dictionary) are two prominent theoretical and computational frameworks in this area. Despite their ostensible distinction, the recently-introduced Geometric Multi-Resolution Analysis (GMRA) provides a robust, computationally efficient, multiscale procedure for simultaneously learning manifolds and dictionaries. In this work, we prove non-asymptotic probabilistic bounds on the approximation error of GMRA for a rich class of data-generating statistical models that includes "noisy" manifolds, thereby establishing the theoretical robustness of the procedure and confirming empirical observations. In particular, if a dataset aggregates near a low-dimensional manifold, our results show that the approximation error of the GMRA is completely independent of the ambient dimension. Our work therefore establishes GMRA as a provably fast algorithm for dictionary learning with approximation and sparsity guarantees. We include several numerical experiments confirming these theoretical results, and our theoretical framework provides new tools for assessing the behavior of manifold learning and dictionary learning procedures on a large class of interesting models.

preprint2012arXiv

Approximation of Points on Low-Dimensional Manifolds Via Random Linear Projections

This paper considers the approximate reconstruction of points, x \in R^D, which are close to a given compact d-dimensional submanifold, M, of R^D using a small number of linear measurements of x. In particular, it is shown that a number of measurements of x which is independent of the extrinsic dimension D suffices for highly accurate reconstruction of a given x with high probability. Furthermore, it is also proven that all vectors, x, which are sufficiently close to M can be reconstructed with uniform approximation guarantees when the number of linear measurements of x depends logarithmically on D. Finally, the proofs of these facts are constructive: A practical algorithm for manifold-based signal recovery is presented in the process of proving the two main results mentioned above.

preprint2012arXiv

Multiscale Markov Decision Problems: Compression, Solution, and Transfer Learning

Many problems in sequential decision making and stochastic control often have natural multiscale structure: sub-tasks are assembled together to accomplish complex goals. Systematically inferring and leveraging hierarchical structure, particularly beyond a single level of abstraction, has remained a longstanding challenge. We describe a fast multiscale procedure for repeatedly compressing, or homogenizing, Markov decision processes (MDPs), wherein a hierarchy of sub-problems at different scales is automatically determined. Coarsened MDPs are themselves independent, deterministic MDPs, and may be solved using existing algorithms. The multiscale representation delivered by this procedure decouples sub-tasks from each other and can lead to substantial improvements in convergence rates both locally within sub-problems and globally across sub-problems, yielding significant computational savings. A second fundamental aspect of this work is that these multiscale decompositions yield new transfer opportunities across different problems, where solutions of sub-tasks at different levels of the hierarchy may be amenable to transfer to new problems. Localized transfer of policies and potential operators at arbitrary scales is emphasized. Finally, we demonstrate compression and transfer in a collection of illustrative domains, including examples involving discrete and continuous statespaces.

preprint2011arXiv

Multiscale Geometric Methods for Data Sets II: Geometric Multi-Resolution Analysis

Data sets are often modeled as point clouds in $R^D$, for $D$ large. It is often assumed that the data has some interesting low-dimensional structure, for example that of a $d$-dimensional manifold $M$, with $d$ much smaller than $D$. When $M$ is simply a linear subspace, one may exploit this assumption for encoding efficiently the data by projecting onto a dictionary of $d$ vectors in $R^D$ (for example found by SVD), at a cost $(n+D)d$ for $n$ data points. When $M$ is nonlinear, there are no "explicit" constructions of dictionaries that achieve a similar efficiency: typically one uses either random dictionaries, or dictionaries obtained by black-box optimization. In this paper we construct data-dependent multi-scale dictionaries that aim at efficient encoding and manipulating of the data. Their construction is fast, and so are the algorithms that map data points to dictionary coefficients and vice versa. In addition, data points are guaranteed to have a sparse representation in terms of the dictionary. We think of dictionaries as the analogue of wavelets, but for approximating point clouds rather than functions.

Mauro Maggioni

What is connected

Connect this record

See the researcher in context

Building this map preview

20 published item(s)

Conditional regression for single-index models

Learning Interaction Variables and Kernels from Observations of Agent-Based Systems

Unsupervised learning of observation functions in state-space models by nonparametric moment methods

Anatomically-Informed Deep Learning on Contrast-Enhanced Cardiac MRI for Scar Segmentation and Clinical Feature Extraction

Learning Interaction Kernels for Agent Systems on Riemannian Manifolds

Multiscale regression on unknown manifolds

Supervised Dimensionality Reduction for Big Data

Data-driven Discovery of Emergent Behaviors in Collective Dynamics

Learning interaction kernels in heterogeneous systems of agents from multiple trajectories

Learning interaction kernels in stochastic systems of interacting particles from multiple trajectories

On the identifiability of interaction functions in systems of interacting particles

Nonparametric inference of interaction laws in systems of agents from trajectory data

Spectral-Spatial Diffusion Geometry for Hyperspectral Image Clustering

High Dimensional Data Modeling Techniques for Detection of Chemical Plumes and Anomalies in Hyperspectral Images and Movies

Inferring Interaction Rules From Observations of Evolutive Systems I: The Variational Approach

ATLAS: A geometric approach to learning high-dimensional stochastic systems near manifolds

Multiscale Dictionary Learning: Non-Asymptotic Bounds and Robustness

Approximation of Points on Low-Dimensional Manifolds Via Random Linear Projections

Multiscale Markov Decision Problems: Compression, Solution, and Transfer Learning

Multiscale Geometric Methods for Data Sets II: Geometric Multi-Resolution Analysis