Researcher profile

Hao Yan

Hao Yan contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
16works
0followers
9topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

16 published item(s)

preprint2026arXiv

Task Vector Geometry Underlies Dual Modes of Task Inference in Transformers

Transformers are effective at inferring the latent task from context via two inference modes: recognizing a task seen during training, and adapting to a novel one. Recent interpretability studies have identified from middle-layer representations task-specific directions, or task vectors, that steer model behavior. However, a lack of rigorous foundations hinders connecting internal representations to external model behavior: existing work fails to explain how task-vector geometry is shaped by the training distribution, and what geometry enables out-of-distribution (OOD) generalization. In this paper, we study these questions in a controlled synthetic setting by training small transformers from scratch on latent-task sequence distributions, which allows a principled mathematical characterization. We show that two inference modes can coexist within a single model. In-distribution behavior is governed by Bayesian task retrieval, implemented internally through convex combinations of learned task vectors. OOD behavior, by contrast, arises through extrapolative task learning, whose representations occupy a subspace nearly orthogonal to the task-vector subspace. Taken together, our results suggest that task-vector geometry, training distributions, and generalization behaviors are closely related.

preprint2023arXiv

Quasi-monolithic Compact Interferometric Sensor Head Design with Laser Auto-alignment

Interferometers play a crucial role in high-precision displacement measurement such as gravitational-wave detection. Conventional interferometer designs require accurate laser alignment, including the laser pointing and the waist position, to maintain high interference contrast during motion. Although the corner reflector returns the reflected beam in parallel, there is still a problem of lateral beam shift which reduces the interference contrast. This paper presents a new compact interferometric sensor head design for measuring translations with auto-alignment. It works without laser beam alignment adjustment and maintains high interferometric contrast during arbitrary motion (tilts as well as lateral translation). Automatic alignment of the measuring beam with the reference beam is possible by means of a secondary reflection design with a corner reflector. A 20*10*10mm^3 all-glass quasi-monolithic sensor head is built based on UV adhesive bonding and tested by a piezoelectric (PZT) positioning stage. Our sensor head achieved a displacement sensitivity of 1 pm/Hz^1/2 at 1Hz with a tilt dynamic range over +/_200 mrad. This optical design can be widely used for high-precision displacement measurement over a large tilt dynamic range, such as torsion balances and seismometers.

preprint2022arXiv

Adaptive Partially-Observed Sequential Change Detection and Isolation

High-dimensional data has become popular due to the easy accessibility of sensors in modern industrial applications. However, one specific challenge is that it is often not easy to obtain complete measurements due to limited sensing powers and resource constraints. Furthermore, distinct failure patterns may exist in the systems, and it is necessary to identify the true failure pattern. This work focuses on the online adaptive monitoring of high-dimensional data in resource-constrained environments with multiple potential failure modes. To achieve this, we propose to apply the Shiryaev-Roberts procedure on the failure mode level and utilize the multi-arm bandit to balance the exploration and exploitation. We further discuss the theoretical property of the proposed algorithm to show that the proposed method can correctly isolate the failure mode. Finally, extensive simulations and two case studies demonstrate that the change point detection performance and the failure mode isolation accuracy can be greatly improved.

preprint2022arXiv

Adaptive Resources Allocation CUSUM for Binomial Count Data Monitoring with Application to COVID-19 Hotspot Detection

In this paper, we present an efficient statistical method (denoted as "Adaptive Resources Allocation CUSUM") to robustly and efficiently detect the hotspot with limited sampling resources. Our main idea is to combine the multi-arm bandit (MAB) and change-point detection methods to balance the exploration and exploitation of resource allocation for hotspot detection. Further, a Bayesian weighted update is used to update the posterior distribution of the infection rate. Then, the upper confidence bound (UCB) is used for resource allocation and planning. Finally, CUSUM monitoring statistics to detect the change point as well as the change location. For performance evaluation, we compare the performance of the proposed method with several benchmark methods in the literature and showed the proposed algorithm is able to achieve a lower detection delay and higher detection precision. Finally, this method is applied to hotspot detection in a real case study of county-level daily positive COVID-19 cases in Washington State WA) and demonstrates the effectiveness with very limited distributed samples.

preprint2022arXiv

ANTLER: Bayesian Nonlinear Tensor Learning and Modeler for Unstructured, Varying-Size Point Cloud Data

Unstructured point clouds with varying sizes are increasingly acquired in a variety of environments through laser triangulation or Light Detection and Ranging (LiDAR). Predicting a scalar response based on unstructured point clouds is a common problem that arises in a wide variety of applications. The current literature relies on several pre-processing steps such as structured subsampling and feature extraction to analyze the point cloud data. Those techniques lead to quantization artifacts and do not consider the relationship between the regression response and the point cloud during pre-processing. Therefore, we propose a general and holistic "Bayesian Nonlinear Tensor Learning and Modeler" (ANTLER) to model the relationship of unstructured, varying-size point cloud data with a scalar or multivariate response. The proposed ANTLER simultaneously optimizes a nonlinear tensor dimensionality reduction and a nonlinear regression model with a 3D point cloud input and a scalar or multivariate response. ANTLER has the ability to consider the complex data representation, high-dimensionality,and inconsistent size of the 3D point cloud data.

preprint2022arXiv

Filters for ISI Suppression in Molecular Communication via Diffusion

Molecular communication via diffusion (MCvD) is considered as one of the most feasible communication paradigms for nanonetworks, especially for bio-nanonetworks which are usually in water-rich biological environments. Two effects that deteriorates the signal in MCvD are noise and inter-symbol interference (ISI). The expected channel impulse response of MCvD has a long and slow attenuating tail due to molecular diffusion which causes ISI and further limits the slow data rate of MCvD. The extent that ISI and noise are suppressed in an MCvD system determines its effectiveness, especially at a high data rate. Although ISI-suppression approaches have been investigated, most of them are addressed as non-essential parts in other topics, such as signal detection or modulation. Furthermore, most of the state-of-the-art ISI-suppression approaches are performed by subtracting the estimated ISI from the total signal. In this work, we investigate ISI-suppression from a new perspective of filters to filter ISI out without any ISI estimation. The principles for a good design of ISI-suppression filters in MCvD are investigated. Based on the principles, an ISI-suppression filter with good anti-noise capability and an associated signal detection scheme is proposed for MCvD scenarios with both ISI and noise. We compare the proposed scheme with the state-of-the-art ISI-suppression approaches. The result manifests that the proposed ISI-suppression scheme could recover signals deteriorated severely by both ISI and noise, which could not be effectively detected by the state-of-the-art ISI-suppression approaches.

preprint2022arXiv

Toward a Better Monitoring Statistic for Profile Monitoring via Variational Autoencoders

Wide accessibility of imaging and profile sensors in modern industrial systems created an abundance of high-dimensional sensing variables. This led to a a growing interest in the research of high-dimensional process monitoring. However, most of the approaches in the literature assume the in-control population to lie on a linear manifold with a given basis (i.e., spline, wavelet, kernel, etc) or an unknown basis (i.e., principal component analysis and its variants), which cannot be used to efficiently model profiles with a nonlinear manifold which is common in many real-life cases. We propose deep probabilistic autoencoders as a viable unsupervised learning approach to model such manifolds. To do so, we formulate nonlinear and probabilistic extensions of the monitoring statistics from classical approaches as the expected reconstruction error (ERE) and the KL-divergence (KLD) based monitoring statistics. Through extensive simulation study, we provide insights on why latent-space based statistics are unreliable and why residual-space based ones typically perform much better for deep learning based approaches. Finally, we demonstrate the superiority of deep probabilistic models via both simulation study and a real-life case study involving images of defects from a hot steel rolling process.

preprint2021arXiv

Adaptive Change Point Monitoring for High-Dimensional Data

In this paper, we propose a class of monitoring statistics for a mean shift in a sequence of high-dimensional observations. Inspired by the recent U-statistic based retrospective tests developed by Wang et al.(2019) and Zhang et al.(2020), we advance the U-statistic based approach to the sequential monitoring problem by developing a new adaptive monitoring procedure that can detect both dense and sparse changes in real-time. Unlike Wang et al.(2019) and Zhang et al.(2020), where self-normalization was used in their tests, we instead introduce a class of estimators for $q$-norm of the covariance matrix and prove their ratio consistency. To facilitate fast computation, we further develop recursive algorithms to improve the computational efficiency of the monitoring procedure. The advantage of the proposed methodology is demonstrated via simulation studies and real data illustrations.

preprint2020arXiv

Automatic Storage Structure Selection for hybrid Workload

In the use of database systems, the design of the storage engine and data model directly affects the performance of the database when performing queries. Therefore, the users of the database need to select the storage engine and design data model according to the workload encountered. However, in a hybrid workload, the query set of the database is dynamically changing, and the design of its optimal storage structure is also changing. Motivated by this, we propose an automatic storage structure selection system based on learning cost, which is used to dynamically select the optimal storage structure of the database under hybrid workloads. In the system, we introduce a machine learning method to build a cost model for the storage engine, and a column-oriented data layout generation algorithm. Experimental results show that the proposed system can choose the optimal combination of storage engine and data model according to the current workload, which greatly improves the performance of the default storage structure. And the system is designed to be compatible with different storage engines for easy use in practical applications.

preprint2020arXiv

Long-Short Term Spatiotemporal Tensor Prediction for Passenger Flow Profile

Spatiotemporal data is very common in many applications, such as manufacturing systems and transportation systems. It is typically difficult to be accurately predicted given intrinsic complex spatial and temporal correlations. Most of the existing methods based on various statistical models and regularization terms, fail to preserve innate features in data alongside their complex correlations. In this paper, we focus on a tensor-based prediction and propose several practical techniques to improve prediction. For long-term prediction specifically, we propose the "Tensor Decomposition + 2-Dimensional Auto-Regressive Moving Average (2D-ARMA)" model, and an effective way to update prediction real-time; For short-term prediction, we propose to conduct tensor completion based on tensor clustering to avoid oversimplifying and ensure accuracy. A case study based on the metro passenger flow data is conducted to demonstrate the improved performance.

preprint2020arXiv

Partially Observable Online Change Detection via Smooth-Sparse Decomposition

We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities. On the one hand, the detection scheme should be able to deal with partially observable data and meanwhile have efficient detection power for sparse changes. On the other, the scheme should be able to adaptively and actively select the most important variables to observe to maximize the detection power. To address these two points, in this paper, we propose a novel detection scheme called CDSSD. In particular, it describes the structure of high dimensional data with sparse changes by smooth-sparse decomposition, whose parameters can be learned via spike-slab variational Bayesian inference. Then the posterior Bayes factor, which incorporates the learned parameters and sparse change information, is formulated as a detection statistic. Finally, by formulating the statistic as the reward of a combinatorial multi-armed bandit problem, an adaptive sampling strategy based on Thompson sampling is proposed. The efficacy and applicability of our method in practice are demonstrated with numerical studies and a real case study.

preprint2020arXiv

Rapid Detection of Hot-spot by Tensor Decomposition with Application to Weekly Gonorrhea Data

In many bio-surveillance and healthcare applications, data sources are measured from many spatial locations repeatedly over time, say, daily/weekly/monthly. In these applications, we are typically interested in detecting hot-spots, which are defined as some structured outliers that are sparse over the spatial domain but persistent over time. In this paper, we propose a tensor decomposition method to detect when and where the hot-spots occur. Our proposed methods represent the observed raw data as a three-dimensional tensor including a circular time dimension for daily/weekly/monthly patterns, and then decompose the tensor into three components: smooth global trend, local hot-spots, and residuals. A combination of LASSO and fused LASSO is used to estimate the model parameters, and a CUSUM procedure is applied to detect when and where the hot-spots might occur. The usefulness of our proposed methodology is validated through numerical simulation and a real-world dataset in the weekly number of gonorrhea cases from $2006$ to $2018$ for $50$ states in the United States.

preprint2020arXiv

Rapid Detection of Hot-spots via Tensor Decomposition with applications to Crime Rate Data

We propose an efficient statistical method (denoted as SSR-Tensor) to robustly and quickly detect hot-spots that are sparse and temporal-consistent in a spatial-temporal dataset through the tensor decomposition. Our main idea is first to build an SSR model to decompose the tensor data into a Smooth global trend mean, Sparse local hot-spots, and Residuals. Next, tensor decomposition is utilized as follows: bases are introduced to describe within-dimension correlation, and tensor products are used for between-dimension interaction. Then, a combination of LASSO and fused LASSO is used to estimate the model parameters, where an efficient recursive estimation procedure is developed based on the large-scale convex optimization, where we first transform the general LASSO optimization into regular LASSO optimization and apply FISTA to solve it with the fastest convergence rate. Finally, a CUSUM procedure is applied to detect when and where the hot-spot event occurs. We compare the performance of the proposed method in a numerical simulation study and a real-world case study, which contains a dataset including a collection of three types of crime rates for U.S. mainland states during the year 1965-2014. In both cases, the proposed SSR-Tensor is able to achieve the fast detection and accurate localization of the hot-spots.

preprint2020arXiv

Real-time Detection of Clustered Events in Video-imaging data with Applications to Additive Manufacturing

The use of video-imaging data for in-line process monitoring applications has become more and more popular in the industry. In this framework, spatio-temporal statistical process monitoring methods are needed to capture the relevant information content and signal possible out-of-control states. Video-imaging data are characterized by a spatio-temporal variability structure that depends on the underlying phenomenon, and typical out-of-control patterns are related to the events that are localized both in time and space. In this paper, we propose an integrated spatio-temporal decomposition and regression approach for anomaly detection in video-imaging data. Out-of-control events are typically sparse spatially clustered and temporally consistent. Therefore, the goal is to not only detect the anomaly as quickly as possible ("when") but also locate it ("where"). The proposed approach works by decomposing the original spatio-temporal data into random natural events, sparse spatially clustered and temporally consistent anomalous events, and random noise. Recursive estimation procedures for spatio-temporal regression are presented to enable the real-time implementation of the proposed methodology. Finally, a likelihood ratio test procedure is proposed to detect when and where the hotspot happens. The proposed approach was applied to the analysis of video-imaging data to detect and locate local over-heating phenomena ("hotspots") during the layer-wise process in a metal additive manufacturing process.

preprint2018arXiv

A novel approach for fusion of heterogeneous sources of data

With advancements in sensor technology, a heterogeneous set of data, containing samples of scalar, waveform signal, image, or even structured point cloud are becoming increasingly popular. Developing a statistical model, representing the behavior of the underlying system based upon such a heterogeneous set of data can be used in monitoring, control, and optimization of the system. Unfortunately, available methods only focus on the scalar and curve data and do not provide a general framework that can integrate different sources of data to construct a model. This paper poses the problem of estimating a process output, measured by a scalar, curve, an image, or a point cloud by a set of heterogeneous process variables such as scalar process setting, sensor readings, and images. We introduce a general approach in which each set of input data (predictor) as well as the output measurements are represented by tensors. We formulate a linear regression model between the input and output tensors and estimate the parameters by minimizing a least square loss function. In order to avoid overfitting and to reduce the number of parameters to be estimated, we decompose the model parameters using several bases, spanning the input and output spaces. Next, we learn both the bases and their spanning coefficients when minimizing the loss function using an alternating least square (ALS) algorithm. We show that such a minimization has a closed-form solution in each iteration and can be computed very efficiently. Through several simulation and case studies, we evaluate the performance of the proposed method. The results reveal the advantage of the proposed method over some benchmarks in the literature in terms of the mean square prediction error.

preprint2018arXiv

Structured Point Cloud Data Analysis via Regularized Tensor Regression for Process Modeling and Optimization

Advanced 3D metrology technologies such as Coordinate Measuring Machine (CMM) and laser 3D scanners have facilitated the collection of massive point cloud data, beneficial for process monitoring, control and optimization. However, due to their high dimensionality and structure complexity, modeling and analysis of point clouds are still a challenge. In this paper, we utilize multilinear algebra techniques and propose a set of tensor regression approaches to model the variational patterns of point clouds and to link them to process variables. The performance of the proposed methods is evaluated through simulations and a real case study of turning process optimization.