Source author record

Fabrizio Lecci

Fabrizio Lecci appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computational Geometry, math.AT, math.ST, Statistics Theory, Applications, Machine Learning, Computation, Mathematical Software

Catalog footprint

What is connected

8 works

8 topics

4 close collaborators

preprint · 2015 · arXiv

High-Dimensional Longitudinal Classification with the Multinomial Fused Lasso

We study regularized estimation in high-dimensional longitudinal classification problems, using the lasso and fused lasso regularizers. The constructed coefficient estimates are piecewise constant across the time dimension in the longitudinal problem, with adaptively selected change points (break points). We present an efficient algorithm for computing such estimates, based on proximal gradient descent. We apply our proposed technique to a longitudinal data set on Alzheimer's disease from the Cardiovascular Health Study Cognition Study, and use this data set to motivate and demonstrate several practical considerations such as the selection of tuning parameters, and the assessment of model stability.
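The abstract above names two ingredients, a lasso/fused lasso penalty and proximal gradient descent. The following is a minimal sketch of both, not the paper's algorithm: it assumes a binary logistic loss rather than the multinomial one, and the proximal step handles only the lasso term (the fused term needs a more involved proximal operator that is not reproduced here). All function names are hypothetical.

```python
import math

def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def fused_lasso_penalty(beta, lam1, lam2):
    """lam1 * sum_t |beta_t| + lam2 * sum_t |beta_t - beta_{t-1}|.

    beta holds one coefficient per time point; the second term is zero
    exactly when beta is constant across time.
    """
    sparsity = sum(abs(b) for b in beta)
    fusion = sum(abs(beta[t] - beta[t - 1]) for t in range(1, len(beta)))
    return lam1 * sparsity + lam2 * fusion

def lasso_logistic_prox_grad(X, y, lam, step=0.1, iters=500):
    """Proximal gradient descent for lasso-penalized binary logistic regression.

    Assumption: simplified stand-in for the multinomial fused lasso problem.
    X is a list of feature rows, y a list of 0/1 labels.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        # Gradient of the average logistic loss at the current beta.
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            z = sum(b * v for b, v in zip(beta, xi))
            prob = 1.0 / (1.0 + math.exp(-z))
            for j in range(p):
                grad[j] += (prob - yi) * xi[j] / n
        # Gradient step on the smooth loss, then prox of the lasso penalty.
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return beta
```

With a large enough `lam`, every coefficient is thresholded to exactly zero, which is the sparsity behaviour the lasso term is meant to produce.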

preprint · 2015 · arXiv

Introduction to the R package TDA

We present a short tutorial and introduction to using the R package TDA, which provides some tools for Topological Data Analysis. In particular, it includes implementations of functions that, given some data, provide topological information about the underlying space, such as the distance function, the distance to a measure, the kNN density estimator, the kernel density estimator, and the kernel distance. The salient topological features of the sublevel sets (or superlevel sets) of these functions can be quantified with persistent homology. We provide an R interface for the efficient algorithms of the C++ libraries GUDHI, Dionysus and PHAT, including a function for the persistent homology of the Rips filtration, and one for the persistent homology of sublevel sets (or superlevel sets) of arbitrary functions evaluated over a grid of points. The significance of the features in the resulting persistence diagrams can be analyzed with functions that implement recently developed statistical methods. The R package TDA also includes the implementation of an algorithm for density clustering, which allows us to identify the spatial organization of the probability mass associated to a density function and visualize it by means of a dendrogram, the cluster tree.

preprint · 2014 · arXiv

Confidence sets for persistence diagrams

Persistent homology is a method for probing topological properties of point clouds and functions. The method involves tracking the birth and death of topological features as one varies a tuning parameter. Features with short lifetimes are informally considered to be "topological noise," and those with a long lifetime are considered to be "topological signal." In this paper, we bring some statistical ideas to persistent homology. In particular, we derive confidence sets that allow us to separate topological signal from topological noise.
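One way to picture how a confidence set separates signal from noise is sketched here. It assumes the confidence set is a sup-norm ball of radius `c` around the estimated diagram, so that a point whose box of side `2c` touches the diagonal cannot be distinguished from noise. The radius `c` is taken as given (the abstract does not say how it is computed), and the function name is hypothetical.

```python
def split_signal_noise(diagram, c):
    """Split (birth, death) pairs by whether their lifetime exceeds 2 * c.

    Assumption: c is the radius of a sup-norm confidence ball around the
    diagram, supplied by the caller. A feature is kept as signal only if
    its box [birth - c, birth + c] x [death - c, death + c] stays off the
    diagonal, i.e. death - birth > 2 * c.
    """
    signal, noise = [], []
    for birth, death in diagram:
        if death - birth > 2 * c:
            signal.append((birth, death))
        else:
            noise.append((birth, death))
    return signal, noise
```

For example, with `c = 0.1` a feature born at 0.0 that dies at 1.0 is kept as signal, while one born at 0.2 that dies at 0.3 is treated as noise.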

preprint · 2014 · arXiv

On the Bootstrap for Persistence Diagrams and Landscapes

Persistent homology probes topological properties from point clouds and functions. By looking at multiple scales simultaneously, one can record the births and deaths of topological features as the scale varies. In this paper we use a statistical technique, the empirical bootstrap, to separate topological signal from topological noise. In particular, we derive confidence sets for persistence diagrams and confidence bands for persistence landscapes.
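The empirical bootstrap mentioned above can be illustrated with a generic sketch: given several curves sampled on a common grid (for instance one landscape per observed diagram), resample the curves with replacement and use the quantile of the sup-norm deviation of the resampled mean as the half-width of a uniform band. This is an assumption-laden illustration, not the paper's exact procedure; the function names and defaults are hypothetical.

```python
import math
import random

def mean_curve(curves):
    """Pointwise average of equally long curves sampled on a common grid."""
    n = len(curves)
    return [sum(curve[i] for curve in curves) / n for i in range(len(curves[0]))]

def bootstrap_band(curves, alpha=0.05, n_boot=1000, seed=0):
    """Uniform (1 - alpha) band around the mean curve via the empirical bootstrap.

    Returns (lower, upper, half_width). Each bootstrap replicate resamples
    whole curves with replacement and records the sup-norm distance between
    the resampled mean and the original mean.
    """
    rng = random.Random(seed)
    n = len(curves)
    center = mean_curve(curves)
    sup_deviations = []
    for _ in range(n_boot):
        resample = [curves[rng.randrange(n)] for _ in range(n)]
        boot_mean = mean_curve(resample)
        sup_deviations.append(max(abs(b - c) for b, c in zip(boot_mean, center)))
    sup_deviations.sort()
    # Index of the empirical (1 - alpha) quantile of the sup deviations.
    k = min(n_boot - 1, max(0, math.ceil((1 - alpha) * n_boot) - 1))
    half_width = sup_deviations[k]
    lower = [c - half_width for c in center]
    upper = [c + half_width for c in center]
    return lower, upper, half_width
```

Fixing `seed` keeps the band reproducible; when all input curves are identical the band collapses to the mean curve itself.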

preprint · 2014 · arXiv

Robust Topological Inference: Distance To a Measure and Kernel Distance

Let P be a distribution with support S. The salient features of S can be quantified with persistent homology, which summarizes topological features of the sublevel sets of the distance function (the distance of any point x to S). Given a sample from P we can infer the persistent homology using an empirical version of the distance function. However, the empirical distance function is highly non-robust to noise and outliers. Even one outlier is deadly. The distance-to-a-measure (DTM), introduced by Chazal et al. (2011), and the kernel distance, introduced by Phillips et al. (2014), are smooth functions that provide useful topological information but are robust to noise and outliers. Chazal et al. (2014) derived concentration bounds for DTM. Building on these results, we derive limiting distributions and confidence sets, and we propose a method for choosing tuning parameters.
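The contrast between the empirical distance function and the DTM can be seen in a small sketch. It assumes the usual empirical DTM with mass parameter `m`: the root mean squared distance from `x` to its `k = ceil(m * n)` nearest sample points. The function names and the toy data are hypothetical.

```python
import math

def distance_function(x, sample):
    """Empirical distance function: distance from x to the nearest sample point."""
    return min(math.dist(x, p) for p in sample)

def dtm(x, sample, m=0.5):
    """Empirical distance to a measure with mass parameter m in (0, 1].

    Averages the squared distances to the k = ceil(m * n) nearest sample
    points and takes the square root, so a single point cannot drive the
    value to zero unless k == 1.
    """
    k = max(1, math.ceil(m * len(sample)))
    squared = sorted(math.dist(x, p) ** 2 for p in sample)
    return math.sqrt(sum(squared[:k]) / k)

# Four points on the unit square plus one far-away outlier.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]
at_outlier = (10.0, 10.0)
print(distance_function(at_outlier, sample))  # 0.0: the outlier looks like part of S
print(dtm(at_outlier, sample, m=0.4))         # about 9: the outlier is far from the bulk
```

At the outlier, the distance function is exactly zero, while the DTM stays large because it also averages in the distance to the next nearest sample point.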

preprint · 2014 · arXiv

Statistical Analysis of Metric Graph Reconstruction

A metric graph is a 1-dimensional stratified metric space consisting of vertices and edges or loops glued together. Metric graphs can be naturally used to represent and model data that take the form of noisy filamentary structures, such as street maps, neurons, networks of rivers and galaxies. We consider the statistical problem of reconstructing the topology of a metric graph embedded in R^D from a random sample. We derive lower and upper bounds on the minimax risk for the noiseless case and tubular noise case. The upper bound is based on the reconstruction algorithm given in Aanjaneya et al. (2012).

preprint · 2014 · arXiv

Subsampling Methods for Persistent Homology

Persistent homology is a multiscale method for analyzing the shape of sets and functions from point cloud data arising from an unknown distribution supported on those sets. When the size of the sample is large, direct computation of the persistent homology is prohibitive due to the combinatorial nature of the existing algorithms. We propose to compute the persistent homology of several subsamples of the data and then combine the resulting estimates. We study the risk of two estimators and we prove that the subsampling approach carries stable topological information while achieving a great reduction in computational complexity.

preprint · 2013 · arXiv

Stochastic Convergence of Persistence Landscapes and Silhouettes

Persistent homology is a widely used tool in Topological Data Analysis that encodes multiscale topological information as a multi-set of points in the plane called a persistence diagram. It is difficult to apply statistical theory directly to a random sample of diagrams. Instead, we can summarize the persistent homology with the persistence landscape, introduced by Bubenik, which converts a diagram into a well-behaved real-valued function. We investigate the statistical properties of landscapes, such as weak convergence of the average landscapes and convergence of the bootstrap. In addition, we introduce an alternate functional summary of persistent homology, which we call the silhouette, and derive an analogous statistical theory.
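To make the two functional summaries concrete, here is a minimal sketch that evaluates a persistence landscape and a power-weighted silhouette at a single point `t`. It assumes diagrams are given as (birth, death) pairs, that each pair contributes the tent function max(0, min(t - birth, death - t)), and that silhouette weights are lifetimes raised to a power. The function names are hypothetical.

```python
def tent(birth, death, t):
    """Triangle function of one diagram point, peaking at the midpoint of its lifetime."""
    return max(0.0, min(t - birth, death - t))

def landscape(diagram, k, t):
    """k-th landscape function at t: the k-th largest tent value (k starts at 1)."""
    values = sorted((tent(b, d, t) for b, d in diagram), reverse=True)
    return values[k - 1] if k <= len(values) else 0.0

def silhouette(diagram, t, power=1.0):
    """Weighted average of tent functions with weights |death - birth| ** power."""
    weights = [abs(d - b) ** power for b, d in diagram]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * tent(b, d, t) for w, (b, d) in zip(weights, diagram)) / total

diagram = [(0.0, 4.0), (1.0, 3.0)]
print(landscape(diagram, 1, 2.0))  # 2.0, the tallest tent at t = 2
print(landscape(diagram, 2, 2.0))  # 1.0, the second tallest
print(silhouette(diagram, 2.0))    # (4 * 2 + 2 * 1) / 6, about 1.667
```

Both summaries are ordinary real-valued functions of `t`, which is what allows averaging across a sample of diagrams.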
