Source author record

Falong Tan

Falong Tan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Methodology, math.ST, Statistics Theory, Artificial Intelligence, Computer Vision, Machine Learning

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators

Preprint · 2026 · arXiv

Asymptotic Distribution-Free Tests for Ultra-high Dimensional Parametric Regressions via Projected Empirical Processes and $p$-value Combination

This paper develops a novel methodology for testing the goodness-of-fit of sparse parametric regression models based on projected empirical processes and p-value combination, where the covariate dimension may substantially exceed the sample size. In such ultra-high dimensional settings, traditional empirical process-based tests often fail due to the curse of dimensionality or their reliance on the asymptotic linearity and normality of parameter estimators, properties that may not hold in ultra-high dimensional scenarios. To overcome these challenges, we first extend the classic martingale transformation to ultra-high dimensional settings under mild conditions and construct a Cramér-von Mises type test based on a martingale-transformed, projected residual-marked empirical process for any projection on the unit sphere. The martingale transformation renders this projected test asymptotically distribution-free and enables us to derive its limiting distribution using only standard convergence rates of parameter estimators. While the projected test is consistent for almost all projections on the unit sphere under mild conditions, it may still suffer from power loss for specific projections. Therefore, we further employ powerful p-value combination procedures, such as the Cauchy combination, to aggregate p-values across multiple projections, thereby enhancing overall robustness. Furthermore, recognizing that empirical process-based tests excel at detecting low-frequency signals while local smoothing tests are generally superior for high-frequency alternatives, we propose a novel hybrid test that aggregates both approaches using the Cauchy combination. The resulting hybrid test is powerful against both low-frequency and high-frequency alternatives. $\cdots$
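
The Cauchy combination mentioned in this abstract can be illustrated with a minimal sketch. It assumes equal weights by default and uses the standard form of the rule (each p-value is mapped to a Cauchy variate, the variates are averaged, and the Cauchy upper tail gives the combined p-value). The function name `cauchy_combination` and the clipping constant are hypothetical choices for illustration, not taken from the paper.

```python
import math

def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Hypothetical helper: weights default to equal and are assumed to sum to 1.
    """
    # Clip away from 0 and 1 so tan() stays finite (assumed safeguard).
    p = [min(max(float(v), 1e-15), 1.0 - 1e-15) for v in p_values]
    if not p:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0 / len(p)] * len(p)
    # Under the null, tan((0.5 - p) * pi) is a standard Cauchy variate.
    stat = sum(w * math.tan((0.5 - v) * math.pi) for w, v in zip(weights, p))
    # The upper tail of the standard Cauchy gives the combined p-value.
    return 0.5 - math.atan(stat) / math.pi
```

For instance, `cauchy_combination([0.001, 0.8, 0.6])` gives roughly 0.003, so one strongly significant projection is not washed out by the uninformative ones.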

Preprint · 2023 · arXiv

Boosting Out-of-Distribution Detection with Multiple Pre-trained Models

Out-of-Distribution (OOD) detection, i.e., identifying whether an input is sampled from a novel distribution other than the training distribution, is a critical task for safely deploying machine learning systems in the open world. Recently, post hoc detection utilizing pre-trained models has shown promising performance and can be scaled to large-scale problems. This advance raises a natural question: Can we leverage the diversity of multiple pre-trained models to improve the performance of post hoc detection methods? In this work, we propose a detection enhancement method by ensembling multiple detection decisions derived from a zoo of pre-trained models. Our approach uses the p-value instead of the commonly used hard threshold and leverages a fundamental framework of multiple hypothesis testing to control the true positive rate of In-Distribution (ID) data. We focus on the usage of model zoos and provide systematic empirical comparisons with current state-of-the-art methods on various OOD detection benchmarks. The proposed ensemble scheme shows consistent improvement compared to single-model detectors and significantly outperforms the current competitive methods. Our method substantially improves the relative performance by 65.40% and 26.96% on the CIFAR10 and ImageNet benchmarks.
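
One way to picture a p-value based ensemble is sketched below. It is an assumption-laden illustration, not the paper's procedure: each model's score is turned into an empirical p-value against held-out ID scores (higher score assumed to mean "more ID-like"), and a Bonferroni correction is swapped in as the multiple-testing rule because the abstract does not name the one actually used. The names `empirical_p_value` and `flag_ood` are hypothetical.

```python
def empirical_p_value(id_scores, score):
    """Smoothed fraction of held-out ID scores at or below `score`."""
    below = sum(1 for s in id_scores if s <= score)
    return (1 + below) / (len(id_scores) + 1)

def flag_ood(id_scores_per_model, test_scores, alpha=0.05):
    """Flag an input as OOD if any model rejects after Bonferroni correction.

    Assumption: Bonferroni stands in for the paper's unspecified procedure.
    It keeps the chance of wrongly rejecting an ID input at or below alpha.
    """
    p_values = [
        empirical_p_value(ref, s)
        for ref, s in zip(id_scores_per_model, test_scores)
    ]
    threshold = alpha / len(p_values)
    return min(p_values) <= threshold, p_values
```

Working with p-values rather than raw scores puts detectors from different models on a common scale, which is what makes combining their decisions straightforward.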

Preprint · 2022 · arXiv

Testing the parametric form of the conditional variance in regressions based on distance covariance

In this paper, we propose a new test for checking the parametric form of the conditional variance based on distance covariance in nonlinear and nonparametric regression models. Inheriting the nice properties of distance covariance, our test is very easy to implement in practice and less affected by the dimensionality of the covariates. The asymptotic properties of the test statistic are investigated under the null and alternative hypotheses. We show that the proposed test is consistent against any alternative and can detect local alternatives converging to the null hypothesis at the parametric rate $1/\sqrt{n}$ in both the nonlinear and nonparametric settings. As the limiting null distribution of the test statistic is intractable, we propose a residual bootstrap to approximate the limiting null distribution. Simulation studies are presented to assess the finite sample performance of the proposed test. We also apply the proposed test to a real data set for illustration.
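
The building block named in this abstract, sample distance covariance, can be sketched as follows. This is the standard V-statistic form (double-centered pairwise distance matrices, then an average of their elementwise product); how the paper plugs residuals and covariates into it is not stated here, so the helper names and the input layout (sequences of coordinate tuples) are assumptions.

```python
import math

def _centered_distances(points):
    """Double-centered Euclidean distance matrix of a list of tuples."""
    n = len(points)
    d = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in d]  # equals the column means by symmetry
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]

def distance_covariance_sq(x, y):
    """Squared sample distance covariance (V-statistic) between x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of observations")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
```

Because only pairwise distances enter the computation, the cost depends on the covariate dimension only through the distance evaluations, which fits the abstract's remark about dimensionality.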

Preprint · 2020 · arXiv

Integrated conditional moment test and beyond: when the number of covariates is divergent

The classic integrated conditional moment test is a promising method for testing regression model misspecification. However, it suffers severely from the curse of dimensionality. To extend it to the testing problem for parametric multi-index models with a diverging number of covariates, we investigate three inference issues in this paper. First, we study the consistency and asymptotically linear representation of the least squares estimator of the parameter matrix at faster rates of divergence than those in the literature for nonlinear models. Second, we propose, via sufficient dimension reduction techniques, an adaptive-to-model version of the integrated conditional moment test. We study the asymptotic properties of the new test under both the null and alternative hypotheses to examine its ability to maintain the significance level and its sensitivity to global alternatives and to local alternatives that are distinct from the null at the fastest possible rate in hypothesis testing. Third, we derive the consistency of the bootstrap approximation for the new test in the diverging dimension setting. The numerical studies show that the new test can greatly enhance the performance of the original ICM test in high-dimensional scenarios. We also apply the test to a real data set for illustration.
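
For orientation, the classic ICM statistic that this abstract starts from can be sketched in a few lines. With a standard Gaussian weight function, the integrated squared residual-marked process reduces to a double sum over residual pairs with a Gaussian kernel in the covariate differences. The sketch below computes only that classic form; the adaptive-to-model version with sufficient dimension reduction is not implemented, and the name `icm_statistic` and the unit kernel bandwidth are assumptions.

```python
import math

def icm_statistic(residuals, covariates):
    """Classic ICM-type statistic with a Gaussian weight (assumed choice).

    residuals:  list of fitted-model residuals e_i
    covariates: list of coordinate tuples x_i, one per residual
    """
    n = len(residuals)
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Squared Euclidean distance between x_i and x_j.
            sq = sum((a - b) ** 2 for a, b in zip(covariates[i], covariates[j]))
            total += residuals[i] * residuals[j] * math.exp(-0.5 * sq)
    return total / n
```

The kernel term decays with the full-dimensional distance between covariates, so in high dimensions most pairs contribute almost nothing, which is one way to see the curse of dimensionality the abstract refers to.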

Preprint · 2016 · arXiv

A projection-based adaptive-to-model test for regressions

A longstanding problem of existing empirical process-based tests for regressions is that when the number of covariates is greater than one, they either have no tractable limiting null distributions or are not omnibus. To attack this problem, we propose in this paper a projection-based adaptive-to-model approach. When the hypothetical model is parametric single-index, the method can fully utilize the dimension reduction model structure under the null hypothesis as if the covariate were one-dimensional, such that the martingale transformation-based test can be asymptotically distribution-free. Further, the test can automatically adapt to the underlying model structure such that the test can be omnibus and thus detect alternative models distinct from the hypothetical model at the fastest possible rate in hypothesis testing. The method is examined through simulation studies and is illustrated by a real data analysis.
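
The idea of treating the covariate "as if it were one-dimensional" can be illustrated with a projected residual-marked empirical process and a Cramér-von Mises type functional of it. The sketch below is a simplified illustration under stated assumptions: the projection direction is supplied by the caller, and the martingale transformation is omitted, so this raw statistic is not distribution-free. The name `projected_cvm` is hypothetical.

```python
def projected_cvm(residuals, covariates, direction):
    """Cramér-von Mises functional of a projected residual-marked process.

    R_n(u) = n^{-1/2} * sum_i e_i * 1{direction'x_i <= u}, evaluated at the
    observed projections and averaged after squaring. No martingale
    transformation is applied (simplifying assumption).
    """
    n = len(residuals)
    # One-dimensional projections of the covariates.
    proj = [sum(b * v for b, v in zip(direction, x)) for x in covariates]
    total = 0.0
    for u in proj:
        partial = sum(e for e, t in zip(residuals, proj) if t <= u)
        total += partial ** 2
    # (1/n) * sum_j R_n(u_j)^2 with R_n carrying the n^{-1/2} factor.
    return total / n**2
```

Large values indicate that residuals accumulate systematically along the chosen direction, which is the kind of departure from the hypothesized model that such tests are designed to pick up.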