Researcher profile

Sean Skwerer

Sean Skwerer contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 (Unverified) | Verification L1 | Unclaimed author
5 works
0 followers
8 topics
4 close collaborators

Published work

5 published items

preprint | 2016 | arXiv

Computing Affine Combinations, Distances, and Correlations for Recursive Partition Functions

Recursive partitioning is the core of several statistical methods, including CART, random forests, and boosted trees. Despite the popularity of tree-based methods, there have so far been no methods for combining multiple trees into a single tree, or for systematically quantifying the discrepancy between two trees. Taking advantage of the recursive structure of trees, we formulated fast algorithms for computing affine combinations, distances, and correlations in a vector subspace of recursive partition functions.

preprint | 2015 | arXiv

Dynamic Geodesics in Treespace via Parametric Maximum Flow

Shortest paths in treespace, which represent minimal deformations between trees, are unique and can be computed in polynomial time. The ability to quickly compute shortest paths has enabled new approaches for statistical analysis of populations of trees and phylogenetic inference. This paper gives a new algorithm for updating geodesic paths when the end points are dynamic. Such algorithms will be especially useful when optimizing objectives that are functions of distances from a search point to other points, e.g., for finding a tree that has the minimum average distance to a collection of trees. Our method for updating treespace shortest paths is based on parametric sensitivity analysis of the maximum flow subproblems that are optimized when solving for a treespace geodesic.

preprint | 2014 | arXiv

Persistent homology analysis of brain artery trees

New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persistence diagrams that quantify branching and looping of vessels at multiple scales. Novel approaches to the statistical analysis, through various summaries of the persistence diagrams, lead to heightened correlations with covariates such as age and sex, relative to earlier analyses of this data set. The correlation with age continues to be significant even after controlling for correlations from earlier significant summaries.

preprint | 2014 | arXiv

Tree Oriented Data Analysis

Complex data objects arise in many areas of modern science, including evolutionary biology, neuroscience, dynamics of gene expression, and medical imaging. Object-oriented data analysis (OODA) is the statistical analysis of datasets of complex objects. Data analysis of tree data objects is an exciting research area with interesting questions and challenging problems. This thesis focuses on tree-oriented statistical methodologies and on algorithms for solving related mathematical optimization problems. This research is motivated by the goal of analyzing a data set of images of human brain arteries. The approach we take here is to use a novel representation of brain artery systems as points in phylogenetic treespace. The treespace property of unique global geodesics leads to a notion of geometric center called a Fréchet mean. For a sample of data points, the Fréchet function is the sum of squared distances from a point to the data points, and the Fréchet mean is the minimizer of the Fréchet function. In this thesis we use properties of the Fréchet function to develop an algorithmic system for computing Fréchet means. Properties of the Fréchet function are also used to show a sticky law of large numbers, which describes a surprising stability of the topological tree structure of sample Fréchet means at that of the population Fréchet mean. We also introduce non-parametric regression of brain artery tree structure as a response variable to age, based on weighted Fréchet means.
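
The verbal definition of the Fréchet function and Fréchet mean in this abstract can be written compactly. The notation here is an assumption made for illustration and does not come from the thesis: $\mathcal{T}$ stands for treespace, $d$ for its geodesic distance, and $x_1, \dots, x_n$ for the sample.

```latex
% Fréchet function: sum of squared geodesic distances to the data points
F(x) = \sum_{i=1}^{n} d(x, x_i)^2 , \qquad x \in \mathcal{T}

% Fréchet mean: the minimizer of the Fréchet function
\bar{x} = \operatorname*{arg\,min}_{x \in \mathcal{T}} F(x)
```

Because treespace has unique global geodesics, as the abstract notes, the minimizer is well defined.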

preprint | 2013 | arXiv

Sticky central limit theorems on open books

Given a probability distribution on an open book (a metric space obtained by gluing a disjoint union of copies of a half-space along their boundary hyperplanes), we define a precise concept of when the Fréchet mean (barycenter) is sticky. This nonclassical phenomenon is quantified by a law of large numbers (LLN) stating that the empirical mean eventually almost surely lies on the (codimension $1$ and hence measure $0$) spine that is the glued hyperplane, and a central limit theorem (CLT) stating that the limiting distribution is Gaussian and supported on the spine. We also state versions of the LLN and CLT for the cases where the mean is nonsticky (i.e., not lying on the spine) and partly sticky (i.e., is on the spine but not sticky).
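
To make the idea of a mean sitting on the spine concrete, here is a minimal sketch for the simplest case one might consider: a "spider" of half-lines glued at a single origin, which plays the role of the spine when each page is one-dimensional. This is an illustration under stated assumptions, not the paper's method. The function name `spider_frechet_mean` and the `(leg, distance)` encoding of points are hypothetical. The sketch relies on the elementary fact that, restricted to leg `i`, the sum of squared distances is minimized at `(S_i - S_others) / n`, where `S_i` is the total distance of the sample points on leg `i` and `S_others` is the total on all other legs. If that value is not positive for any leg, the minimizer is the origin.

```python
from collections import defaultdict


def spider_frechet_mean(points):
    """Fréchet mean of points on a spider (half-lines glued at the origin).

    `points` is a list of (leg, r) pairs: `leg` is any hashable label for a
    half-line and `r >= 0` is the distance from the origin (hypothetical
    encoding chosen for this sketch).

    Returns (leg, r) if the mean lies strictly inside a leg, or (None, 0.0)
    if the mean is the origin.
    """
    if not points:
        raise ValueError("need at least one point")

    n = len(points)
    leg_sums = defaultdict(float)
    for leg, r in points:
        if r < 0:
            raise ValueError("distances from the origin must be non-negative")
        leg_sums[leg] += r

    total = sum(leg_sums.values())
    for leg, s in leg_sums.items():
        # Stationary point of the Fréchet function restricted to this leg:
        # same-leg points pull outward, points on other legs pull inward.
        m = (s - (total - s)) / n
        if m > 0:
            # At most one leg can have a positive value, so return it.
            return leg, m

    # No leg pulls hard enough: the mean is at the origin (the spine).
    return None, 0.0
```

For three legs with one point at distance 1 on each, the mean is the origin, and it stays there if one point is moved slightly outward (say to distance 1.2). That insensitivity to small perturbations is the flavor of stickiness the abstract describes.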