Source author record

Vadim Zipunnikov

Vadim Zipunnikov appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Applications, Methodology, Computation, math.ST, Neurons and Cognition, Statistics Theory

Catalog footprint

8 works

6 topics

4 close collaborators

Preprint, 2026, arXiv

Scalar-on-distribution regression via generalized odds with applications to accelerometry-assessed disability in multiple sclerosis

Distributional representations of data collected using digital health technologies (DHTs) have been shown to outperform scalar summaries for clinical prediction, with carefully quantified tail behavior often driving the gains. Motivated by these findings, we propose a unified generalized odds (GO) framework that represents subject-specific distributions through ratios of probabilities over arbitrary regions of the sample space, subsuming hazard, survival, and residual life representations as special cases. We develop a scalar-on-odds regression model using spline-based functional representations with penalization for efficient estimation. Applied to wrist-worn accelerometry data from the HEAL-MS study, generalized odds models yield improved prediction of Expanded Disability Status Scale (EDSS) scores compared to classical scalar and survival-based approaches, demonstrating the value of odds-based distributional covariates for modeling DHT data.
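As a rough illustration of the "ratios of probabilities over arbitrary regions" idea, the sketch below computes an empirical odds-type quantity P(X in A) / P(X in B) from one subject's samples. The function names, the toy data, and the specific choices of regions are assumptions made for illustration; the abstract does not give the paper's exact definition or estimator.

```python
from typing import Callable, Sequence

Region = Callable[[float], bool]  # indicator function of a region of the sample space


def empirical_prob(samples: Sequence[float], region: Region) -> float:
    """Fraction of samples that fall inside the region."""
    return sum(1 for x in samples if region(x)) / len(samples)


def generalized_odds(samples: Sequence[float], region_a: Region, region_b: Region) -> float:
    """Empirical ratio P(X in A) / P(X in B); NaN when B is empty in the sample."""
    p_b = empirical_prob(samples, region_b)
    if p_b == 0:
        return float("nan")
    return empirical_prob(samples, region_a) / p_b


# Hypothetical activity intensities for one subject (illustrative values only).
samples = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
t, width = 2.0, 1.0

# A = (t, inf), B = (-inf, t]: odds of exceeding t, a survival-type summary.
survival_odds = generalized_odds(samples, lambda x: x > t, lambda x: x <= t)

# A = [t, t + width), B = [t, inf): a discrete hazard-type summary.
discrete_hazard = generalized_odds(
    samples, lambda x: t <= x < t + width, lambda x: x >= t
)

print(survival_odds, discrete_hazard)
```

Evaluating such ratios over a grid of thresholds `t` would give a curve per subject, which is the kind of object a spline-based functional regression could take as a covariate.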

Preprint, 2022, arXiv

Shape-constrained Estimation in Functional Regression with Bernstein Polynomials

Shape restrictions on functional regression coefficients, such as non-negativity, monotonicity, convexity, or concavity, are often available in the form of prior knowledge or are required to maintain structural consistency in functional regression models. A new estimation method is developed in shape-constrained functional regression models using Bernstein polynomials. Specifically, estimation approaches from nonparametric regression are extended to functional data, properly accounting for shape constraints in a large class of functional regression models such as scalar-on-function regression (SOFR), function-on-scalar regression (FOSR), and function-on-function regression (FOFR). Theoretical results establish the asymptotic consistency of the constrained estimators under standard regularity conditions. A projection-based approach provides point-wise asymptotic confidence intervals for the constrained estimators. A bootstrap test is developed to facilitate testing of the shape constraints. Numerical analysis using simulations illustrates the improvement in efficiency of the estimators from the use of the proposed method under shape constraints. Two applications include i) modeling a drug effect in a mental health study via shape-restricted FOSR and ii) modeling subject-specific quantile functions of accelerometry-estimated physical activity in the Baltimore Longitudinal Study of Aging (BLSA) as outcomes via shape-restricted quantile-function-on-scalar regression (QFOSR). An R software implementation and illustration of the proposed estimation method and the test are provided.
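One reason Bernstein polynomials suit shape constraints is that simple conditions on the coefficients imply the shape of the whole curve: non-decreasing coefficients give a non-decreasing function on [0, 1], and non-negative coefficients give a non-negative one. The sketch below illustrates only this property; the helper names and coefficient values are hypothetical, and it is not the paper's estimation procedure.

```python
from math import comb
from typing import List, Sequence


def bernstein_basis(k: int, n: int, t: float) -> float:
    """k-th Bernstein basis polynomial of degree n, evaluated at t in [0, 1]."""
    return comb(n, k) * t**k * (1 - t) ** (n - k)


def bernstein_poly(coefs: Sequence[float], t: float) -> float:
    """Evaluate sum_k coefs[k] * B_{k,n}(t) with n = len(coefs) - 1."""
    n = len(coefs) - 1
    return sum(c * bernstein_basis(k, n, t) for k, c in enumerate(coefs))


def monotone_coefs(start: float, increments: Sequence[float]) -> List[float]:
    """Build non-decreasing coefficients as cumulative sums of non-negative increments."""
    if any(d < 0 for d in increments):
        raise ValueError("increments must be non-negative for a monotone curve")
    coefs = [start]
    for d in increments:
        coefs.append(coefs[-1] + d)
    return coefs


# Hypothetical coefficient function beta(t), constrained to be non-decreasing.
coefs = monotone_coefs(-1.0, [0.0, 0.5, 2.0, 0.25])
grid = [i / 50 for i in range(51)]
values = [bernstein_poly(coefs, t) for t in grid]

is_monotone = all(b >= a - 1e-12 for a, b in zip(values, values[1:]))
print(is_monotone)
```

Because the constraint reduces to linear inequalities on the coefficients, a constrained fit can in principle be posed as a least-squares problem with linear inequality constraints.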

Preprint, 2016, arXiv

Stride variability measures derived from wrist- and hip-worn accelerometers

Many epidemiological and clinical studies use accelerometry to objectively measure physical activity using activity counts, vector magnitude, or number of steps. These measures use just a fraction of the information in the raw accelerometry data, as they are typically summarized at the minute level. To address this problem, we define and estimate two gait measures of temporal stride-to-stride variability based on raw accelerometry data: Amplitude Deviation (AD) and Phase Deviation (PD). We explore the sensitivity of our approach to on-body placement of the accelerometer by comparing hip, left wrist, and right wrist placements. We illustrate the approach by estimating AD and PD in 46 elderly participants in the Developmental Epidemiologic Cohort Study (DECOS) who wore accelerometers during a 400-meter walk test. We also show that AD and PD have a statistically significant association with gait speed and sit-to-stand test performance.

Preprint, 2015, arXiv

Longitudinal high-dimensional principal components analysis with application to diffusion tensor imaging of multiple sclerosis

We develop a flexible framework for modeling high-dimensional imaging data observed longitudinally. The approach decomposes the observed variability of repeatedly measured high-dimensional observations into three additive components: a subject-specific imaging random intercept that quantifies the cross-sectional variability, a subject-specific imaging slope that quantifies the dynamic irreversible deformation over multiple realizations, and a subject-visit-specific imaging deviation that quantifies exchangeable effects between visits. The proposed method is very fast, scalable to studies including ultrahigh-dimensional data, and can easily be adapted to and executed on modest computing infrastructures. The method is applied to the longitudinal analysis of diffusion tensor imaging (DTI) data of the corpus callosum of multiple sclerosis (MS) subjects. The study includes $176$ subjects observed at $466$ visits. For each subject and visit the study contains a registered DTI scan of the corpus callosum at roughly 30,000 voxels.
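The three-component decomposition described above can be written schematically as follows. The notation is an assumption introduced here for illustration (subject $i$, visit $j$, voxel $v$, visit time $T_{ij}$, fixed mean $\eta$) and is not taken from this record.

```latex
Y_{ij}(v) = \eta(v, T_{ij})
          + \underbrace{X_{i,0}(v)}_{\text{random intercept}}
          + \underbrace{X_{i,1}(v)\, T_{ij}}_{\text{random slope}}
          + \underbrace{W_{ij}(v)}_{\text{visit-specific deviation}}
```

Under this reading, $X_{i,0}$ captures cross-sectional variability between subjects, $X_{i,1}$ captures subject-specific change over time, and $W_{ij}$ captures exchangeable visit-to-visit effects.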

Preprint, 2014, arXiv

Fast Covariance Estimation for High-dimensional Functional Data

For smoothing covariance functions, we propose two fast algorithms that scale linearly with the number of observations per function. Most available methods and software cannot smooth covariance matrices of dimension $J \times J$ with $J>500$; the recently introduced sandwich smoother is an exception, but it is not adapted to smooth covariance matrices of large dimensions such as $J \ge 10,000$. Covariance matrices of order $J=10,000$, and even $J=100,000$, are becoming increasingly common, e.g., in 2- and 3-dimensional medical imaging and high-density wearable sensor data. We introduce two new algorithms that can handle very large covariance matrices: 1) FACE, a fast implementation of the sandwich smoother, and 2) SVDS, a two-step procedure that first applies singular value decomposition to the data matrix and then smooths the eigenvectors. Compared to existing techniques, these new algorithms are at least an order of magnitude faster in high dimensions and drastically reduce memory requirements. The new algorithms provide near-instantaneous (a few seconds) smoothing for matrices of dimension $J=10,000$ and very fast ($<10$ minutes) smoothing for $J=100,000$. Although SVDS is simpler than FACE, we provide ready-to-use, scalable R software for FACE. When incorporated into the R package refund, FACE improves the speed of penalized functional regression by an order of magnitude, even for data of normal size ($J<500$). We recommend that FACE be used in practice for the analysis of noisy and high-dimensional functional data.
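The two-step SVDS idea (SVD of the data matrix, then smoothing of the eigenvectors) can be sketched as below. This is a minimal illustration, not the authors' implementation: the moving-average smoother is a stand-in chosen for brevity, and the function names, the `rank` and `window` parameters, and the simulated data are all assumptions.

```python
import numpy as np


def moving_average(v: np.ndarray, window: int) -> np.ndarray:
    """Simple stand-in smoother with edge padding; `window` must be odd."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    kernel = np.ones(window) / window
    padded = np.pad(v, window // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")  # same length as v


def svds_covariance(Y: np.ndarray, rank: int, window: int = 5) -> np.ndarray:
    """Smoothed covariance estimate from an (n subjects) x (J grid points) matrix."""
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)                       # remove the mean function
    _, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    # Step 2: smooth each leading right singular vector (an eigenvector estimate).
    V_smooth = np.stack(
        [moving_average(Vt[k], window) for k in range(rank)], axis=1
    )                                             # shape (J, rank)
    eigvals = s[:rank] ** 2 / n                   # eigenvalues of the sample covariance
    return (V_smooth * eigvals) @ V_smooth.T      # shape (J, J)


# Simulated noisy functional data: one smooth component plus white noise.
rng = np.random.default_rng(0)
n, J = 40, 200
grid = np.linspace(0.0, 1.0, J)
Y = rng.normal(size=(n, 1)) * np.sin(2 * np.pi * grid) + 0.5 * rng.normal(size=(n, J))

cov_smooth = svds_covariance(Y, rank=2, window=5)
print(cov_smooth.shape)
```

The sketch forms the full $J \times J$ matrix only for demonstration; for large $J$ one would keep the low-rank factors `V_smooth` and `eigvals` instead. Note also that smoothing can leave the vectors slightly non-orthonormal.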

Preprint, 2014, arXiv

Fast, Exact Bootstrap Principal Component Analysis for p>1 million

Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject ($p$) is much larger than the number of subjects ($n$), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same $n$-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same $n$-dimensional subspace and can be efficiently represented by their low-dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low-dimensional coordinates, without calculating or storing the $p$-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram (EEG) recordings ($p=900$, $n=392$), and to a dataset of brain magnetic resonance images (MRIs) ($p\approx$ 3 million, $n=352$). For the brain MRI dataset, our method allows standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods.
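The subspace argument above can be sketched in a few lines: after one SVD of the original data, each bootstrap PCA needs only an SVD of an $n \times n$ coordinate matrix. The code below is an illustration under stated assumptions (subjects stored as rows, $p \ge n$, hypothetical function and variable names, simulated data), not the authors' software.

```python
import numpy as np


def bootstrap_pca_lowdim(Y, n_boot, n_components, rng):
    """Bootstrap PCA of an (n subjects) x (p measurements) matrix, assuming p >= n.

    Returns the low-dimensional coordinates of each bootstrap component,
    the shared basis Vt (n x p), and the resampled subject indices.
    """
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)
    U, d, Vt = np.linalg.svd(Yc, full_matrices=False)   # one SVD of the full data
    C = U * d                                            # (n, n) subject coordinates
    coords = np.empty((n_boot, n_components, n))
    indices = np.empty((n_boot, n), dtype=int)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)                 # resample subjects
        Cb = C[idx] - C[idx].mean(axis=0)                # recenter in low dimension
        _, _, At = np.linalg.svd(Cb, full_matrices=False)  # only an n x n SVD
        coords[b] = At[:n_components]
        indices[b] = idx
    return coords, Vt, indices


# Simulated data with one dominant component (illustrative values only).
rng = np.random.default_rng(0)
n, p = 20, 500
loading = np.sin(np.linspace(0.0, np.pi, p))
Y = 3.0 * rng.normal(size=(n, 1)) * loading + 0.1 * rng.normal(size=(n, p))

coords, Vt, indices = bootstrap_pca_lowdim(Y, n_boot=5, n_components=2, rng=rng)

# A p-dimensional bootstrap component is recovered on demand as coords @ Vt.
pc_fast = coords[0, 0] @ Vt
print(coords.shape, pc_fast.shape)
```

Only the `coords` array grows with the number of bootstrap samples, so storage per sample is of order $n$ per component rather than $p$. Principal components are defined only up to sign, so signs would need to be aligned across bootstrap samples before computing summaries such as standard errors.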

Preprint, 2013, arXiv

Parametrization of white matter manifold-like structures using principal surfaces

In this manuscript, we are concerned with data generated from a diffusion tensor imaging (DTI) experiment. The goal is to parameterize manifold-like white matter tracts, such as the corpus callosum, using principal surfaces. We approach the problem by finding a geometrically motivated surface-based representation of the corpus callosum and visualize the fractional anisotropy (FA) values projected onto the surface; the method applies to any other diffusion summary as well as to other white matter tracts. We provide an algorithm that 1) constructs the principal surface of a corpus callosum; 2) flattens the surface into a parametric 2D map; 3) projects associated FA values on the map. The algorithm was applied to a longitudinal study containing 466 diffusion tensor images of 176 multiple sclerosis (MS) patients observed at multiple visits. For each subject and visit the study contains a registered DTI scan of the corpus callosum at roughly 20,000 voxels. Extensive simulation studies demonstrate fast convergence and robust performance of the algorithm under a variety of challenging scenarios.

Preprint, 2013, arXiv

Structured Functional Principal Component Analysis

Motivated by modern observational studies, we introduce a class of functional models that expands nested and crossed designs. These models account for the natural inheritance of correlation structure from sampling design in studies where the fundamental sampling unit is a function or image. Inference is based on functional quadratics and their relationship with the underlying covariance structure of the latent processes. A computationally fast and scalable estimation procedure is developed for ultra-high dimensional data. Methods are illustrated in three examples: high-frequency accelerometer data for daily activity, pitch linguistic data for phonetic analysis, and EEG data for studying electrical brain activity during sleep.
