Source author record

Zhenhua Lin

Zhenhua Lin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Methodology · math.ST · Statistics Theory · math.DG

Catalog footprint

What is connected

9 works

4 topics

4 close collaborators

preprint · 2026 · arXiv

Multi-transport Distributional Regression

We study distribution-on-distribution regression problems in which a response distribution depends on multiple distributional predictors. Such settings arise naturally in applications where the outcome distribution is driven by several heterogeneous distributional sources, yet remain challenging due to the nonlinear geometry of the Wasserstein space. We propose an intrinsic regression framework that aggregates predictor-specific transported distributions through a weighted Fréchet mean in the Wasserstein space. The resulting model admits multiple distributional predictors, assigns interpretable weights quantifying their relative contributions, and defines a flexible regression operator that is invariant to auxiliary construction choices, such as the selection of a reference distribution. From a theoretical perspective, we establish identifiability of the induced regression operator and derive asymptotic guarantees for its estimation under a predictive Wasserstein semi-norm, which directly characterizes convergence of the composite prediction map. Extensive simulation studies and a real data application demonstrate the improved predictive performance and interpretability of the proposed approach compared with existing Wasserstein regression methods.
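The aggregation step can be illustrated in the simplest setting. The following is a minimal sketch, assuming univariate distributions represented by samples, where the weighted Fréchet mean under the 2-Wasserstein distance is known to have a quantile function equal to the weighted average of the individual quantile functions. The function name `weighted_wasserstein_mean_1d`, the probability grid, and the toy data are illustrative assumptions; the sketch does not reproduce the paper's transport maps or its weight estimation.

```python
import numpy as np

def weighted_wasserstein_mean_1d(samples, weights, grid_size=99):
    """Quantile function of the weighted 2-Wasserstein Frechet mean (1D case only).

    samples: list of 1D arrays, one empirical sample per distribution.
    weights: nonnegative weights summing to one (hypothetical contribution weights).
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to one")
    # Evaluate every empirical quantile function on a common probability grid.
    probs = np.linspace(0.01, 0.99, grid_size)
    quantiles = np.stack(
        [np.quantile(np.asarray(s, dtype=float), probs) for s in samples]
    )
    # In one dimension the barycenter's quantile function is the weighted average.
    return probs, weights @ quantiles

# Toy data (assumed): two Gaussian samples standing in for transported predictors.
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=5000)
b = rng.normal(4.0, 1.0, size=5000)
probs, q_mean = weighted_wasserstein_mean_1d([a, b], [0.25, 0.75])
```

In this toy setup the larger weight pulls the aggregated distribution toward the second sample, which is the sense in which the weights can be read as relative contributions.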

preprint · 2022 · arXiv

Intrinsic Riemannian Functional Data Analysis for Sparse Longitudinal Observations

A new framework is developed to intrinsically analyze sparsely observed Riemannian functional data. It features four innovative components: a frame-independent covariance function, a smooth vector bundle termed the covariance vector bundle, a parallel transport and a smooth bundle metric on the covariance vector bundle. The introduced intrinsic covariance function links estimation of covariance structure to smoothing problems that involve raw covariance observations derived from sparsely observed Riemannian functional data, while the covariance vector bundle provides a rigorous mathematical foundation for formulating such smoothing problems. The parallel transport and the bundle metric together make it possible to measure fidelity of fit to the covariance function. They also play a critical role in quantifying the quality of estimators for the covariance function. As an illustration, based on the proposed framework, we develop a local linear smoothing estimator for the covariance function, analyze its theoretical properties, and provide numerical demonstration via simulated and real datasets. The intrinsic feature of the framework makes it applicable not only to Euclidean submanifolds but also to manifolds without a canonical ambient space.

preprint · 2022 · arXiv

Optimal One-pass Nonparametric Estimation Under Memory Constraint

For nonparametric regression in the streaming setting, where data constantly flow in and require real-time analysis, a main challenge is that data are cleared from the computer system once processed due to limited computer memory and storage. We tackle the challenge by proposing a novel one-pass estimator based on penalized orthogonal basis expansions and developing a general framework to study the interplay between statistical efficiency and memory consumption of estimators. We show that the proposed estimator is statistically optimal under memory constraint, and has an asymptotically minimal memory footprint among all one-pass estimators of the same estimation quality. Numerical studies demonstrate that the proposed one-pass estimator is nearly as efficient as its non-streaming counterpart that has access to all historical data.
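The one-pass idea can be sketched with a generic example. The code below is not the paper's estimator: it is a simple ridge-penalized cosine-basis regression that keeps only running sufficient statistics, so each batch can be discarded after it is processed. The class name `OnePassBasisRegressor`, the cosine basis on [0, 1], the fixed number of basis functions, and the ridge penalty are all assumptions made for illustration.

```python
import numpy as np

def cosine_basis(x, n_basis):
    """Orthonormal cosine basis on [0, 1]: 1, sqrt(2) cos(pi j x) for j >= 1."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    j = np.arange(n_basis)
    phi = np.sqrt(2.0) * np.cos(np.pi * j * x[:, None])
    phi[:, 0] = 1.0  # constant function for j = 0
    return phi

class OnePassBasisRegressor:
    """Streaming basis regression that stores O(n_basis^2) numbers, not the data."""

    def __init__(self, n_basis=8, ridge=1e-8):
        self.n_basis = n_basis
        self.ridge = ridge
        self.gram = np.zeros((n_basis, n_basis))  # running sum of phi^T phi
        self.moment = np.zeros(n_basis)           # running sum of phi^T y
        self.n_seen = 0

    def update(self, x, y):
        """Fold one batch into the running statistics; the batch can then be dropped."""
        phi = cosine_basis(x, self.n_basis)
        y = np.atleast_1d(np.asarray(y, dtype=float))
        self.gram += phi.T @ phi
        self.moment += phi.T @ y
        self.n_seen += len(y)

    def predict(self, x):
        """Solve the ridge-penalized normal equations (requires at least one update)."""
        penalty = self.ridge * self.n_seen * np.eye(self.n_basis)
        coef = np.linalg.solve(self.gram + penalty, self.moment)
        return cosine_basis(x, self.n_basis) @ coef

# Toy stream (assumed): noise-free responses from a function inside the basis span.
rng = np.random.default_rng(1)
model = OnePassBasisRegressor()
for _ in range(50):
    x_batch = rng.uniform(0.0, 1.0, size=20)
    model.update(x_batch, np.cos(2.0 * np.pi * x_batch))
```

Memory use here depends only on the number of basis functions, which is the kind of trade-off between estimation quality and memory that the abstract refers to.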

preprint · 2020 · arXiv

Additive Models for Symmetric Positive-Definite Matrices, Riemannian Manifolds and Lie groups

In this paper, an additive regression model for a symmetric positive-definite matrix-valued response and multiple scalar predictors is proposed. The model exploits the abelian group structure inherited from either the Log-Cholesky metric or the Log-Euclidean framework that turns the space of symmetric positive-definite matrices into a Riemannian manifold and further a bi-invariant Lie group. The additive model for responses in the space of symmetric positive-definite matrices with either of these metrics is shown to connect to an additive model on a tangent space. This connection not only entails an efficient algorithm to estimate the component functions but also allows the proposed additive model to be generalized to general Riemannian manifolds that might not have a Lie group structure. Optimal asymptotic convergence rates and normality of the estimated component functions are also established. Numerical studies show that the proposed model enjoys superior numerical performance, especially when there are multiple predictors. The practical merits of the proposed model are demonstrated by analyzing diffusion tensor brain imaging data.

preprint · 2020 · arXiv

Basis Expansions for Functional Snippets

Estimation of mean and covariance functions is fundamental for functional data analysis. While this topic has been studied extensively in the literature, a key assumption is that there are enough data in the domain of interest to estimate both the mean and covariance functions. In this paper, we investigate mean and covariance estimation for functional snippets in which observations from a subject are available only in an interval of length strictly (and often much) shorter than the length of the whole interval of interest. For such a sampling plan, no data is available for direct estimation of the off-diagonal region of the covariance function. We tackle this challenge via a basis representation of the covariance function. The proposed approach allows one to consistently estimate an infinite-rank covariance function from functional snippets. We establish the convergence rates for the proposed estimators and illustrate their finite-sample performance via simulation studies and two data applications.

preprint · 2020 · arXiv

Functional Regression on Manifold with Contamination

We propose a new method for functional nonparametric regression with a predictor that resides on a finite-dimensional manifold but is only observable in an infinite-dimensional space. Contamination of the predictor due to discrete/noisy measurements is also accounted for. By using functional local linear manifold smoothing, the proposed estimator enjoys a polynomial rate of convergence that adapts to the intrinsic manifold dimension and the contamination level. This is in contrast to the logarithmic convergence rate in the literature of functional nonparametric regression. We also observe a phase transition phenomenon regarding the interplay of the manifold dimension and the contamination level. We demonstrate that the proposed method has favorable numerical performance relative to commonly used methods via simulated and real data examples.

preprint · 2020 · arXiv

Mean and Covariance Estimation for Functional Snippets

We consider estimation of mean and covariance functions of functional snippets, which are short segments of functions possibly observed irregularly on an individual-specific subinterval that is much shorter than the entire study interval. Estimation of the covariance function for functional snippets is challenging since information for the far off-diagonal regions of the covariance structure is completely missing. We address this difficulty by decomposing the covariance function into a variance function component and a correlation function component. The variance function can be effectively estimated nonparametrically, while the correlation part is modeled parametrically, possibly with an increasing number of parameters, to handle the missing information in the far off-diagonal regions. Both theoretical analysis and numerical simulations suggest that this hybrid strategy is effective. In addition, we propose a new estimator for the variance of measurement errors and analyze its asymptotic properties. This estimator is required for the estimation of the variance function from noisy measurements.
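The decomposition described above can be written out as follows. The symbols are assumptions introduced for illustration and are not taken from the source: $C$ is the covariance function, $\sigma^2$ the variance function, and $\rho_\theta$ a correlation function indexed by a parameter $\theta$.

```latex
C(s, t) = \sigma(s)\,\sigma(t)\,\rho_\theta(s, t),
\qquad \sigma^2(t) = C(t, t),
\qquad \rho_\theta(t, t) = 1 .
```

Since $\sigma^2(t)$ depends only on the diagonal, it can be estimated from data observed within each short snippet, while the parametric form of $\rho_\theta$ supplies the values far from the diagonal where no pairs of observations exist.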

preprint · 2019 · arXiv

Riemannian Geometry of Symmetric Positive Definite Matrices via Cholesky Decomposition

We present a new Riemannian metric, termed the Log-Cholesky metric, on the manifold of symmetric positive definite (SPD) matrices via Cholesky decomposition. We first construct a Lie group structure and a bi-invariant metric on Cholesky space, the collection of lower triangular matrices whose diagonal elements are all positive. Such group structure and metric are then pushed forward to the space of SPD matrices via the inverse of Cholesky decomposition, which is a bijective map between Cholesky space and SPD matrix space. This new Riemannian metric and Lie group structure fully circumvent the swelling effect, in the sense that the determinant of the Fréchet average of a set of SPD matrices under the presented metric, called the Log-Cholesky average, is between the minimum and the maximum of the determinants of the original SPD matrices. Compared to existing metrics such as the affine-invariant metric and the Log-Euclidean metric, the presented metric is simpler, more computationally efficient and more numerically stable. In particular, parallel transport along geodesics under the Log-Cholesky metric is given in a closed and easy-to-compute form.
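The Log-Cholesky average can be sketched in a few lines. The sketch below assumes the closed form in which the strictly lower-triangular parts of the Cholesky factors are averaged arithmetically and the diagonals are averaged on the log scale, after which the result is mapped back to an SPD matrix. The function name `log_cholesky_mean` and the random test matrices are illustrative assumptions.

```python
import numpy as np

def log_cholesky_mean(spd_matrices):
    """Frechet average of SPD matrices under the Log-Cholesky metric (sketch)."""
    # Lower-triangular Cholesky factors with positive diagonals, stacked as (n, d, d).
    factors = np.linalg.cholesky(np.stack(spd_matrices))
    # Arithmetic mean of the strictly lower-triangular parts.
    strict_lower = np.tril(factors, k=-1).mean(axis=0)
    # Geometric mean of the diagonals, computed via logarithms.
    diagonals = np.diagonal(factors, axis1=1, axis2=2)
    mean_diag = np.exp(np.log(diagonals).mean(axis=0))
    mean_factor = strict_lower + np.diag(mean_diag)
    # Map back from Cholesky space to the space of SPD matrices.
    return mean_factor @ mean_factor.T

# Toy data (assumed): five random, well-conditioned 3x3 SPD matrices.
rng = np.random.default_rng(2)
mats = []
for _ in range(5):
    a = rng.normal(size=(3, 3))
    mats.append(a @ a.T + 3.0 * np.eye(3))

mean = log_cholesky_mean(mats)
dets = np.array([np.linalg.det(m) for m in mats])
mean_det = np.linalg.det(mean)
```

Because the determinant of an SPD matrix is the squared product of its Cholesky diagonal, the determinant of this average is the geometric mean of the input determinants, which is why it stays between their minimum and maximum.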

preprint · 2015 · arXiv

A Smooth and Locally Sparse Estimator for Functional Linear Regression via Functional SCAD Penalty

In this paper, we propose a new regularization technique called "functional SCAD". We then combine this technique with the smoothing spline method to develop a smooth and locally sparse (i.e., zero on some sub-regions) estimator for the coefficient function in functional linear regression. The functional SCAD has a nice shrinkage property that enables our estimating procedure to identify the null subregions of the coefficient function without overshrinking the non-zero values of the coefficient function. Additionally, the smoothness of our estimated coefficient function is regularized by a roughness penalty rather than by controlling the number of knots. Our method is more theoretically sound and is computationally simpler than the other available methods. An asymptotic analysis shows that our estimator is consistent and can identify the null region with probability tending to one. Furthermore, simulation studies show that our estimator has superior numerical performance. Finally, the practical merit of our method is demonstrated on two real applications.
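The shrinkage behaviour can be seen from the standard scalar SCAD penalty, sketched below with the conventional default a = 3.7. The helper `integrated_scad`, which applies the scalar penalty pointwise to a coefficient function on a grid and integrates by the trapezoidal rule, is only an assumed illustration of how such a penalty might act on a function; the paper's exact definition and normalization of the functional SCAD are not given in this record.

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """Standard scalar SCAD penalty evaluated at |t| (requires lam > 0 and a > 2)."""
    t = np.abs(np.asarray(t, dtype=float))
    linear = lam * t                                          # |t| <= lam
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))  # lam < |t| <= a*lam
    constant = lam**2 * (a + 1) / 2                           # |t| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quadratic, constant))

def integrated_scad(beta_values, grid, lam, a=3.7):
    """Hypothetical helper: trapezoidal integral of the pointwise SCAD penalty."""
    values = scad_penalty(beta_values, lam, a)
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

# Toy coefficient function (assumed): zero on [0, 0.5], then a large bump.
grid = np.linspace(0.0, 1.0, 201)
beta = np.where(grid <= 0.5, 0.0, 5.0 * np.sin(2.0 * np.pi * (grid - 0.5)))
total = integrated_scad(beta, grid, lam=0.1)
```

The penalty grows linearly near zero, which pushes small values to exactly zero, and is constant for large values, so large coefficients are not penalized further as they grow.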
