Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
20 works
0 followers
21 topics
4 close collaborators

Published work

20 published items

preprint, 2026, arXiv

IntentVLA: Short-Horizon Intent Modeling for Aliased Robot Manipulation

Robot imitation data are often multimodal: similar visual-language observations may be followed by different action chunks because human demonstrators act with different short-horizon intents, task phases, or recent context. Existing frame-conditioned VLA policies infer each chunk from the current observation and instruction alone, so under partial observability they may resample different intents across adjacent replanning steps, leading to inter-chunk conflict and unstable execution. We introduce IntentVLA, a history-conditioned VLA framework that encodes recent visual observations into a compact short-horizon intent representation and uses it to condition chunk generation. We further introduce AliasBench, a 12-task ambiguity-aware benchmark on RoboTwin2 with matched training data and evaluation environments that isolate short-horizon observation aliasing. Across AliasBench, SimplerEnv, LIBERO, and RoboCasa, IntentVLA improves rollout stability and outperforms strong VLA baselines.

preprint, 2026, arXiv

MR-Align: Meta-Reasoning Informed Factuality Alignment for Large Reasoning Models

Large reasoning models (LRMs) show strong capabilities in complex reasoning, yet their marginal gains on evidence-dependent factual questions are limited. We find this limitation is partially attributable to a reasoning-answer hit gap, where the model identifies the correct facts during reasoning but fails to incorporate them into the final response, thereby reducing factual fidelity. To address this issue, we propose MR-ALIGN, a Meta-Reasoning informed alignment framework that enhances factuality without relying on external verifiers. MR-ALIGN quantifies state transition probabilities along the model's thinking process and constructs a transition-aware implicit reward that reinforces beneficial reasoning patterns while suppressing defective ones at the atomic thinking segments. This re-weighting reshapes token-level signals into probability-aware segment scores, encouraging coherent reasoning trajectories that are more conducive to factual correctness. Empirical evaluations across four factual QA datasets and one long-form factuality benchmark show that MR-ALIGN consistently improves accuracy and truthfulness while reducing misleading reasoning. These results highlight that aligning the reasoning process itself, rather than merely the outputs, is pivotal for advancing factuality in LRMs.

preprint, 2026, arXiv

PhysBrain 1.0 Technical Report

Vision-language-action models have advanced rapidly, but robot trajectories alone provide limited coverage for learning broad physical understanding. PhysBrain 1.0 studies a complementary route: converting large-scale human egocentric video into structured physical commonsense supervision before robot adaptation. Our data engine extracts scene elements, spatial dynamics, action execution, and depth-aware relations, then turns them into question-answer supervision for training PhysBrain VLMs. The resulting physical priors are further transferred to VLA policies through a capability-preserving and language-sensitive adaptation design. Across multimodal QA benchmarks and embodied control benchmarks, including ERQA, PhysBench, SimplerEnv-WidowX, LIBERO, and RoboCasa, PhysBrain 1.0 achieves SOTA results and shows especially strong out-of-domain performance on SimplerEnv. These results suggest that scaling physical commonsense from human interaction video can provide an effective bridge from multimodal understanding to robot action.

preprint, 2022, arXiv

Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods

Tree-based models such as decision trees and random forests (RF) are a cornerstone of modern machine-learning practice. To mitigate overfitting, trees are typically regularized by a variety of techniques that modify their structure (e.g. pruning). We introduce Hierarchical Shrinkage (HS), a post-hoc algorithm that does not modify the tree structure, and instead regularizes the tree by shrinking the prediction over each node towards the sample means of its ancestors. The amount of shrinkage is controlled by a single regularization parameter and the number of data points in each ancestor. Since HS is a post-hoc method, it is extremely fast, compatible with any tree growing algorithm, and can be used synergistically with other regularization techniques. Extensive experiments over a wide variety of real-world datasets show that HS substantially increases the predictive performance of decision trees, even when used in conjunction with other regularization techniques. Moreover, we find that applying HS to each tree in an RF often improves accuracy, as well as its interpretability by simplifying and stabilizing its decision boundaries and SHAP values. We further explain the success of HS in improving prediction performance by showing its equivalence to ridge regression on a (supervised) basis constructed of decision stumps associated with the internal nodes of a tree. All code and models are released in a full-fledged package available on GitHub (github.com/csinva/imodels).
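
The shrinkage step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the imodels implementation: the `Node` class, the function `hs_predict`, and the parameter name `lam` are hypothetical, and the damping factor 1 / (1 + lam / n_parent) is an assumed form of "a single regularization parameter and the number of data points in each ancestor".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A fitted regression-tree node (hypothetical minimal structure)."""
    mean: float                      # sample mean of y over training points in this node
    n: int                           # number of training points in this node
    feature: Optional[int] = None    # split feature index; None marks a leaf
    threshold: float = 0.0           # go left when x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def hs_predict(root: Node, x: list, lam: float) -> float:
    """Telescope node means along the root-to-leaf path, damping each
    parent-to-child change by 1 / (1 + lam / n_parent) (assumed form)."""
    pred = root.mean
    node = root
    while node.feature is not None:
        child = node.left if x[node.feature] <= node.threshold else node.right
        pred += (child.mean - node.mean) / (1.0 + lam / node.n)
        node = child
    return pred


# A toy stump: 10 training points with mean 5.0, split into two leaves.
root = Node(mean=5.0, n=10, feature=0, threshold=0.5,
            left=Node(mean=2.0, n=4), right=Node(mean=7.0, n=6))
unshrunk = hs_predict(root, [0.0], lam=0.0)    # plain leaf mean
shrunk = hs_predict(root, [0.0], lam=10.0)     # pulled towards the root mean
```

With `lam = 0` the sketch returns the ordinary leaf mean, and larger values of `lam` pull the prediction towards the means of ancestor nodes without touching the tree structure, which is the post-hoc property the abstract emphasizes.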

preprint, 2022, arXiv

Instability, Computational Efficiency and Statistical Accuracy

Many statistical estimators are defined as the fixed point of a data-dependent operator, with estimators based on minimizing a cost function being an important special case. The limiting performance of such estimators depends on the properties of the population-level operator in the idealized limit of infinitely many samples. We develop a general framework that yields bounds on statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level, and its degree of (in)stability when applied to an empirical object based on $n$ samples. Using this framework, we analyze both stable forms of gradient descent and some higher-order and unstable algorithms, including Newton's method and its cubic-regularized variant, as well as the EM algorithm. We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models. We exhibit cases in which an unstable algorithm can achieve the same statistical accuracy as a stable algorithm in exponentially fewer steps -- namely, with the number of iterations being reduced from polynomial to logarithmic in sample size $n$.

preprint, 2022, arXiv

Learning Using Privileged Information for Zero-Shot Action Recognition

Zero-Shot Action Recognition (ZSAR) aims to recognize video actions that have never been seen during training. Most existing methods assume a shared semantic space between seen and unseen actions and intend to directly learn a mapping from a visual space to the semantic space. This approach has been challenged by the semantic gap between the visual space and semantic space. This paper presents a novel method that uses object semantics as privileged information to narrow the semantic gap and, hence, effectively, assist the learning. In particular, a simple hallucination network is proposed to implicitly extract object semantics during testing without explicitly extracting objects and a cross-attention module is developed to augment visual feature with the object semantics. Experiments on the Olympic Sports, HMDB51 and UCF101 datasets have shown that the proposed method outperforms the state-of-the-art methods by a large margin.

preprint, 2022, arXiv

Provable Boolean Interaction Recovery from Tree Ensemble obtained via Random Forests

Random Forests (RF) are at the cutting edge of supervised machine learning in terms of prediction performance, especially in genomics. Iterative Random Forests (iRF) use a tree ensemble from iteratively modified RF to obtain predictive and stable non-linear or Boolean interactions of features. They have shown great promise for Boolean biological interaction discovery that is central to advancing functional genomics and precision medicine. However, theoretical studies into how tree-based methods discover Boolean feature interactions are missing. Inspired by the thresholding behavior in many biological processes, we first introduce a novel discontinuous nonlinear regression model, called the Locally Spiky Sparse (LSS) model. Specifically, the LSS model assumes that the regression function is a linear combination of piecewise constant Boolean interaction terms. Given an RF tree ensemble, we define a quantity called Depth-Weighted Prevalence (DWP) for a set of signed features S. Intuitively speaking, DWP(S) measures how frequently features in S appear together in an RF tree ensemble. We prove that, with high probability, DWP(S) attains a universal upper bound that does not involve any model coefficients, if and only if S corresponds to a union of Boolean interactions under the LSS model. Consequentially, we show that a theoretically tractable version of the iRF procedure, called LSSFind, yields consistent interaction discovery under the LSS model as the sample size goes to infinity. Finally, simulation results show that LSSFind recovers the interactions under the LSS model even when some assumptions are violated.

preprint, 2022, arXiv

Seven Principles for Rapid-Response Data Science: Lessons Learned from Covid-19 Forecasting

In this article, we take a step back to distill seven principles out of our experience in the spring of 2020, when our 12-person rapid-response team used skills of data science and beyond to help distribute Covid PPE. This process included tapping into domain knowledge of epidemiology and medical logistics chains, curating a relevant data repository, developing models for short-term county-level death forecasting in the US, and building a website for sharing visualization (an automated AI machine). The principles are described in the context of working with Response4Life, a then-new nonprofit organization, to illustrate their necessity. Many of these principles overlap with those in standard data-science teams, but an emphasis is put on dealing with problems that require rapid response, often resembling agile software development.

preprint, 2022, arXiv

SOFFLFM: Super-resolution optical fluctuation Fourier light-field microscopy

Fourier light-field microscopy (FLFM) uses a micro-lens array (MLA) to segment the Fourier plane of the microscope objective lens and generate multiple two-dimensional perspective views, from which the three-dimensional (3D) structure of the sample is reconstructed by 3D deconvolution without scanning. However, the resolution of FLFM is still limited by diffraction and, furthermore, depends on the aperture division. To improve its resolution, super-resolution optical fluctuation Fourier light-field microscopy (SOFFLFM) is proposed here, in which the SOFI method, which provides super-resolution, is introduced into FLFM. SOFFLFM applies higher-order cumulant statistical analysis to an image sequence collected by FLFM, and then carries out 3D deconvolution to reconstruct the 3D structure of the sample. The theoretical basis for the resolution improvement of SOFFLFM is explained and then verified with simulations. Simulation results demonstrate that, compared with FLFM, SOFFLFM improves lateral and axial resolution by factors of more than $\sqrt{2}$ and 2 for the 2nd- and 4th-order cumulants, respectively.

preprint, 2022, arXiv

Towards Robust Waveform-Based Acoustic Models

We study the problem of learning robust acoustic models in adverse environments, characterized by a significant mismatch between training and test conditions. This problem is of paramount importance for the deployment of speech recognition systems that need to perform well in unseen environments. First, we characterize data augmentation theoretically as an instance of vicinal risk minimization, which aims at improving risk estimates during training by replacing the delta functions that define the empirical density over the input space with an approximation of the marginal population density in the vicinity of the training samples. More specifically, we assume that local neighborhoods centered at training samples can be approximated using a mixture of Gaussians, and demonstrate theoretically that this can incorporate robust inductive bias into the learning process. We then specify the individual mixture components implicitly via data augmentation schemes, designed to address common sources of spurious correlations in acoustic models. To avoid potential confounding effects on robustness due to information loss, which has been associated with standard feature extraction techniques (e.g., FBANK and MFCC features), we focus on the waveform-based setting. Our empirical results show that the approach can generalize to unseen noise conditions, with 150% relative improvement in out-of-distribution generalization compared to training using the standard risk minimization principle. Moreover, the results demonstrate competitive performance relative to models learned using a training sample designed to match the acoustic conditions characteristic of test utterances.

preprint, 2021, arXiv

Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients

Hamiltonian Monte Carlo (HMC) is a state-of-the-art Markov chain Monte Carlo sampling algorithm for drawing samples from smooth probability densities over continuous spaces. We study the variant most widely used in practice, Metropolized HMC with the Störmer-Verlet or leapfrog integrator, and make two primary contributions. First, we provide a non-asymptotic upper bound on the mixing time of the Metropolized HMC with explicit choices of step-size and number of leapfrog steps. This bound gives a precise quantification of the faster convergence of Metropolized HMC relative to simpler MCMC algorithms such as the Metropolized random walk, or Metropolized Langevin algorithm. Second, we provide a general framework for sharpening mixing time bounds of Markov chains initialized at a substantial distance from the target distribution over continuous spaces. We apply this sharpening device to the Metropolized random walk and Langevin algorithms, thereby obtaining improved mixing time bounds from a non-warm initial distribution.
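
For readers unfamiliar with the algorithm being analyzed, a single Metropolized HMC transition with the leapfrog integrator can be sketched as follows. This is a generic textbook sketch in one dimension with unit mass, not code from the paper; the function names and the choice of a standard Gaussian target in the usage lines are assumptions made for illustration.

```python
import math
import random


def leapfrog(q, p, grad_u, step, n_steps):
    """Stormer-Verlet (leapfrog) integration of Hamiltonian dynamics in 1-D."""
    p = p - 0.5 * step * grad_u(q)          # initial half step for momentum
    for i in range(n_steps):
        q = q + step * p                    # full step for position
        if i < n_steps - 1:
            p = p - step * grad_u(q)        # full step for momentum
    p = p - 0.5 * step * grad_u(q)          # final half step for momentum
    return q, p


def hmc_step(q, u, grad_u, step, n_steps, rng):
    """One Metropolized HMC transition for the potential u(q) = -log density."""
    p0 = rng.gauss(0.0, 1.0)                # resample momentum
    q_new, p_new = leapfrog(q, p0, grad_u, step, n_steps)
    h_old = u(q) + 0.5 * p0 * p0
    h_new = u(q_new) + 0.5 * p_new * p_new
    # Accept with probability min(1, exp(h_old - h_new)).
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return q_new
    return q


# Usage on an assumed standard Gaussian target, u(q) = q^2 / 2.
rng = random.Random(0)
q = 3.0
samples = []
for _ in range(1000):
    q = hmc_step(q, lambda z: 0.5 * z * z, lambda z: z, 0.2, 10, rng)
    samples.append(q)
```

The step size and number of leapfrog steps (`step`, `n_steps`) are the two tuning parameters for which the abstract says the paper gives explicit choices; the values used here are arbitrary.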

preprint, 2020, arXiv

A Survey on Dynamic Network Embedding

Real-world networks are composed of diverse interacting and evolving entities, while most existing research simply characterizes them as particular static networks, without considering the evolution trend in dynamic networks. Recently, significant progress has been made in tracking the properties of dynamic networks, exploiting changes of entities and links in the network to devise network embedding techniques. Compared to widely proposed static network embedding methods, dynamic network embedding endeavors to encode nodes as low-dimensional dense representations that effectively preserve the network structures and the temporal dynamics, which is beneficial to multifarious downstream machine learning tasks. In this paper, we conduct a systematic survey on dynamic network embedding. Specifically, basic concepts of dynamic network embedding are described. Notably, we propose a novel taxonomy of existing dynamic network embedding techniques for the first time, including matrix factorization based, Skip-Gram based, autoencoder based, neural networks based and other embedding methods. Additionally, we carefully summarize the commonly used datasets and a wide variety of subsequent tasks that dynamic network embedding can benefit. Finally, we suggest several challenges that the existing algorithms face and outline possible directions to facilitate future research, such as dynamic embedding models, large-scale dynamic networks, heterogeneous dynamic networks, dynamic attributed networks, task-oriented dynamic network embedding and more embedding spaces.

preprint, 2020, arXiv

A Systematic Study of the dust of Galactic Supernova Remnants I. The Distance and the Extinction

By combining the photometric, spectroscopic, and astrometric information of the stars in the sightline of SNRs, the distances to and the extinctions of 32 Galactic supernova remnants (SNRs) are investigated. The stellar atmospheric parameters are from the SDSS-DR14/APOGEE and LAMOST-DR5/LEGUE spectroscopic surveys. The multi-band photometry, from optical to infrared, is collected from the Gaia, APASS, Pan-STARRS1, 2MASS, and WISE surveys. With the calibrated Gaia distances of individual stars, the distances to 15 of 32 SNRs are well determined from their produced extinction and association with molecular clouds. The upper limits of distance are derived for 3 SNRs. The color excess ratios $E(g_{\mathrm{P1}}-\lambda) / E(g_{\mathrm{P1}}-r_{\mathrm{P1}})$ of 32 SNRs are calculated, and their variation with wavebands is fitted by a simple dust model. The inferred dust grain size distribution bifurcates: while the graphite grains have comparable size to the average ISM dust, the silicate grains are generally larger. Along the way, the average extinction law from optical to near-infrared of the Milky Way is derived from the 1.3 million star sample and found to agree with the CCM89 law with $R_{\mathrm{V}}=3.15$.

preprint, 2020, arXiv

Classifying expanding attractors on figure eight knot complement space and non-transitive Anosov flows on Franks-Williams manifold

The path closure of the figure-eight knot complement space, $N_0$, supports a natural DA (derived from Anosov) expanding attractor. Using this attractor, Franks and Williams constructed the first example of a non-transitive Anosov flow, on the manifold $M_0$ obtained by gluing two copies of $N_0$ through the identity map along their boundaries, called the Franks-Williams manifold. In this paper, our main goal is to classify expanding attractors on $N_0$ and non-transitive Anosov flows on $M_0$. We prove that, up to orbit-equivalence, the DA expanding attractor is the unique expanding attractor supported by $N_0$, and the non-transitive Anosov flow constructed by Franks and Williams is the unique non-transitive Anosov flow admitted by $M_0$. Moreover, more general cases are also discussed. In particular, we completely classify non-transitive Anosov flows on a family of infinitely many toroidal $3$-manifolds with two hyperbolic pieces, obtained by gluing two copies of $N_0$ through any gluing homeomorphism.

preprint, 2020, arXiv

Incremental causal effects

Causal evidence is needed to act and it is often enough for the evidence to point towards a direction of the effect of an action. For example, policymakers might be interested in estimating the effect of slightly increasing taxes on private spending across the whole population. We study identifiability and estimation of causal effects, where a continuous treatment is slightly shifted across the whole population (termed average partial effect or incremental causal effect). We show that incremental effects are identified under local ignorability and local overlap assumptions, where exchangeability and positivity only hold in a neighborhood of units. Average treatment effects are not identified under these assumptions. In this case, and under a smoothness condition, the incremental effect can be estimated via the average derivative. Moreover, we prove that in certain finite-sample observational settings, estimating the incremental effect is easier than estimating the average treatment effect in terms of asymptotic variance. For high-dimensional settings, we develop a simple feature transformation that allows for doubly-robust estimation and inference of incremental causal effects. Finally, we compare the behaviour of estimators of the incremental treatment effect and average treatment effect in experiments including data-inspired simulations.
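
One common way to write the "average derivative" mentioned in this abstract is given below. The notation is an assumption introduced here for illustration and does not come from the abstract: $Y$ is the outcome, $T$ the continuous treatment, $X$ the covariates, and $\mu$ the conditional mean of the outcome.

```latex
\theta \;=\; \mathbb{E}\!\left[\left.\frac{\partial}{\partial t}\,\mu(t, X)\right|_{t = T}\right],
\qquad
\mu(t, x) \;=\; \mathbb{E}\,[\,Y \mid T = t,\; X = x\,].
```

Under this reading, only the behaviour of $\mu$ in a neighbourhood of each unit's observed treatment enters $\theta$, which matches the abstract's point that local ignorability and local overlap suffice, whereas an average treatment effect compares treatment levels that may lie far from the observed ones.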

preprint, 2020, arXiv

Singularity, Misspecification, and the Convergence Rate of EM

A line of recent work has analyzed the behavior of the Expectation-Maximization (EM) algorithm in the well-specified setting, in which the population likelihood is locally strongly concave around its maximizing argument. Examples include suitably separated Gaussian mixture models and mixtures of linear regressions. We consider over-specified settings in which the number of fitted components is larger than the number of components in the true distribution. Such misspecified settings can lead to singularity in the Fisher information matrix, and moreover, the maximum likelihood estimator based on $n$ i.i.d. samples in $d$ dimensions can have a non-standard $\mathcal{O}((d/n)^{\frac{1}{4}})$ rate of convergence. Focusing on the simple setting of two-component mixtures fit to a $d$-dimensional Gaussian distribution, we study the behavior of the EM algorithm both when the mixture weights are different (unbalanced case), and are equal (balanced case). Our analysis reveals a sharp distinction between these two cases: in the former, the EM algorithm converges geometrically to a point at Euclidean distance of $\mathcal{O}((d/n)^{\frac{1}{2}})$ from the true parameter, whereas in the latter case, the convergence rate is exponentially slower, and the fixed point has a much lower $\mathcal{O}((d/n)^{\frac{1}{4}})$ accuracy. Analysis of this singular case requires the introduction of some novel techniques: in particular, we make use of a careful form of localization in the associated empirical process, and develop a recursive argument to progressively sharpen the statistical rate.
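
The balanced over-specified case can be explored numerically with a small sketch. The simplifications here are assumptions made for illustration, not the paper's setup in full: one dimension, known unit variance, and a symmetric fit 0.5 N(theta, 1) + 0.5 N(-theta, 1) to data drawn from a single standard Gaussian. Under those assumptions the E-step and M-step combine into the closed-form update theta <- mean(x * tanh(theta * x)).

```python
import math
import random


def em_update(theta, xs):
    """One EM step for the fit 0.5*N(theta, 1) + 0.5*N(-theta, 1).

    The E-step weight of the '+' component is w = 1 / (1 + exp(-2*theta*x)),
    and the M-step gives theta = mean((2*w - 1) * x) = mean(x * tanh(theta*x)).
    """
    return sum(x * math.tanh(theta * x) for x in xs) / len(xs)


def run_em(theta0, xs, n_iters):
    """Return the EM iterates theta_0, ..., theta_{n_iters}."""
    path = [theta0]
    for _ in range(n_iters):
        path.append(em_update(path[-1], xs))
    return path


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # truth: a single Gaussian
path = run_em(1.0, xs, 50)
```

Printing `path` shows how slowly the iterates shrink towards zero once they are small, which is the qualitative behaviour the abstract describes for the balanced case; the sketch makes no claim about matching the paper's rates.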

preprint, 2020, arXiv

Structural Compression of Convolutional Neural Networks

Deep convolutional neural networks (CNNs) have been successful in many tasks in machine vision; however, the millions of weights in the form of thousands of convolutional filters in CNNs make them difficult for human interpretation or understanding in science. In this article, we introduce CAR, a greedy structural compression scheme to obtain smaller and more interpretable CNNs, while achieving close to original accuracy. The compression is based on pruning filters with the least contribution to the classification accuracy. We demonstrate the interpretability of CAR-compressed CNNs by showing that our algorithm prunes filters with visually redundant functionalities such as color filters. These compressed networks are easier to interpret because they retain the filter diversity of uncompressed networks with an order of magnitude fewer filters. Finally, a variant of CAR is introduced to quantify the importance of each image category to each CNN filter. Specifically, the most and the least important class labels are shown to be meaningful interpretations of each filter.

preprint, 2019, arXiv

Veridical Data Science

Building and expanding on principles of statistics, machine learning, and scientific inquiry, we propose the predictability, computability, and stability (PCS) framework for veridical data science. Our framework, comprised of both a workflow and documentation, aims to provide responsible, reliable, reproducible, and transparent results across the entire data science life cycle. The PCS workflow uses predictability as a reality check and considers the importance of computation in data collection/storage and algorithm design. It augments predictability and computability with an overarching stability principle for the data science life cycle. Stability expands on statistical uncertainty considerations to assess how human judgment calls impact data results through data and model/algorithm perturbations. Moreover, we develop inference procedures that build on PCS, namely PCS perturbation intervals and PCS hypothesis testing, to investigate the stability of data results relative to problem formulation, data cleaning, modeling decisions, and interpretations. We illustrate PCS inference through neuroscience and genomics projects of our own and others and compare it to existing methods in high dimensional, sparse linear model simulations. Over a wide range of misspecified simulation models, PCS inference demonstrates favorable performance in terms of ROC curves. Finally, we propose PCS documentation based on R Markdown or Jupyter Notebook, with publicly available, reproducible codes and narratives to back up human choices made throughout an analysis. The PCS workflow and documentation are demonstrated in a genomics case study available on Zenodo.

preprint, 2017, arXiv

Iterative Random Forests to detect predictive and stable high-order interactions

Genomics has revolutionized biology, enabling the interrogation of whole transcriptomes, genome-wide binding sites for proteins, and many other molecular processes. However, individual genomic assays measure elements that interact in vivo as components of larger molecular machines. Understanding how these high-order interactions drive gene expression presents a substantial statistical challenge. Building on Random Forests (RF), Random Intersection Trees (RITs), and through extensive, biologically inspired simulations, we developed the iterative Random Forest algorithm (iRF). iRF trains a feature-weighted ensemble of decision trees to detect stable, high-order interactions with same order of computational cost as RF. We demonstrate the utility of iRF for high-order interaction discovery in two prediction problems: enhancer activity in the early Drosophila embryo and alternative splicing of primary transcripts in human derived cell lines. In Drosophila, among the 20 pairwise transcription factor interactions iRF identifies as stable (returned in more than half of bootstrap replicates), 80% have been previously reported as physical interactions. Moreover, novel third-order interactions, e.g. between Zelda (Zld), Giant (Gt), and Twist (Twi), suggest high-order relationships that are candidates for follow-up experiments. In human-derived cells, iRF re-discovered a central role of H3K36me3 in chromatin-mediated splicing regulation, and identified novel 5th and 6th order interactions, indicative of multi-valent nucleosomes with specific roles in splicing regulation. By decoupling the order of interactions from the computational cost of identification, iRF opens new avenues of inquiry into the molecular mechanisms underlying genome biology.

preprint, 2015, arXiv

Lasso adjustments of treatment effect estimates in randomized experiments

We provide a principled way for investigators to analyze randomized experiments when the number of covariates is large. Investigators often use linear multivariate regression to analyze randomized experiments instead of simply reporting the difference of means between treatment and control groups. Their aim is to reduce the variance of the estimated treatment effect by adjusting for covariates. If there are a large number of covariates relative to the number of observations, regression may perform poorly because of overfitting. In such cases, the Lasso may be helpful. We study the resulting Lasso-based treatment effect estimator under the Neyman-Rubin model of randomized experiments. We present theoretical conditions that guarantee that the estimator is more efficient than the simple difference-of-means estimator, and we provide a conservative estimator of the asymptotic variance, which can yield tighter confidence intervals than the difference-of-means estimator. Simulation and data examples show that Lasso-based adjustment can be advantageous even when the number of covariates is less than the number of observations. Specifically, a variant using Lasso for selection and OLS for estimation performs particularly well, and it chooses a smoothing parameter based on combined performance of Lasso and OLS.