Researcher profile

Liyan Xie

Liyan Xie contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
6 topics
4 close collaborators

Published work

8 published item(s)

preprint, 2026, arXiv

Sequential Change Detection with Differential Privacy

Sequential change detection is a fundamental problem in statistics and signal processing, with the CUSUM procedure widely used to achieve minimax detection delay under a prescribed false-alarm rate when pre- and post-change distributions are fully known. However, releasing CUSUM statistics and the corresponding stopping time directly can compromise individual data privacy. We therefore introduce a differentially private (DP) variant, called DP-CUSUM, that injects calibrated Laplace noise into both the vanilla CUSUM statistics and the detection threshold, preserving the recursive simplicity of the classical CUSUM statistics while ensuring per-sample differential privacy. We derive closed-form bounds on the average run length to false alarm and on the worst-case average detection delay, explicitly characterizing the trade-off among privacy level, false-alarm rate, and detection efficiency. Our theoretical results imply that under a weak privacy constraint, our proposed DP-CUSUM procedure achieves the same first-order asymptotic optimality as the classical, non-private CUSUM procedure. Numerical simulations are conducted to demonstrate the detection efficiency of our proposed DP-CUSUM under different privacy constraints, and the results are consistent with our theoretical findings.
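
The recursion described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `laplace` and `dp_cusum`, the single shared `noise_scale`, and the choice of log-likelihood ratio passed in by the caller are assumptions made for the example. In particular, `noise_scale` is a free parameter here, not the calibration to a privacy level that the paper derives.

```python
import random


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_cusum(xs, llr, threshold, noise_scale, rng):
    """Return the 1-based stopping time of a noisy CUSUM test, or None.

    xs          -- observations, processed one at a time
    llr         -- log-likelihood ratio log(f1(x) / f0(x)), supplied by the caller
    threshold   -- detection threshold before noise is added
    noise_scale -- Laplace scale (hypothetical parameter, not the paper's calibration)
    rng         -- a random.Random instance
    """
    # Noise is added to the threshold once, before monitoring starts.
    noisy_threshold = threshold + laplace(noise_scale, rng)
    s = 0.0
    for t, x in enumerate(xs, start=1):
        # Classical CUSUM recursion: accumulate evidence, reset at zero.
        s = max(0.0, s + llr(x))
        # Fresh noise is added to the statistic at every comparison.
        if s + laplace(noise_scale, rng) >= noisy_threshold:
            return t
    return None
```

With `noise_scale=0` the sketch reduces to the classical CUSUM test, which makes it easy to compare the private and non-private stopping times on the same data stream.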

preprint, 2022, arXiv

Distributionally Robust Weighted $k$-Nearest Neighbors

Learning a robust classifier from a few samples remains a key challenge in machine learning. A major thrust of research has focused on developing $k$-nearest neighbor ($k$-NN) based algorithms combined with metric learning that captures similarities between samples. When samples are limited, robustness is especially crucial to ensure the generalization capability of the classifier. In this paper, we study a minimax distributionally robust formulation of weighted $k$-nearest neighbors, which aims to find the optimal weighted $k$-NN classifiers that hedge against feature uncertainties. We develop an algorithm, Dr.k-NN, that efficiently solves this functional optimization problem and assigns minimax optimal weights to training samples when performing classification. These weights are class-dependent and are determined by the similarities of sample features under the least favorable scenarios. When the size of the uncertainty set is properly tuned, the robust classifier has a smaller Lipschitz norm than the vanilla $k$-NN, and thus improves the generalization capability. We also couple our framework with neural-network-based feature embedding. We demonstrate the competitive performance of our algorithm compared to the state-of-the-art in the few-training-sample setting with various real-data experiments.

preprint, 2022, arXiv

Minimax Robust Quickest Change Detection using Wasserstein Ambiguity Sets

We study the robust quickest change detection under unknown pre- and post-change distributions. To deal with uncertainties in the data-generating distributions, we formulate two data-driven ambiguity sets based on the Wasserstein distance, without any parametric assumptions. The minimax robust test is constructed as the CUSUM test under least favorable distributions, a representative pair of distributions in the ambiguity sets. We show that the minimax robust test can be obtained in a tractable way and is asymptotically optimal. We investigate the effectiveness of the proposed robust test over existing methods, including the generalized likelihood ratio test and the robust test under KL divergence based ambiguity sets.

preprint, 2022, arXiv

PERCEPT: a new online change-point detection method using topological data analysis

Topological data analysis (TDA) provides a set of data analysis tools for extracting embedded topological structures from complex high-dimensional datasets. In recent years, TDA has been a rapidly growing field which has found success in a wide range of applications, including signal processing, neuroscience and network analysis. In these applications, the online detection of changes is of crucial importance, but this can be highly challenging since such changes often occur in a low-dimensional embedding within high-dimensional data streams. We thus propose a new method, called PERsistence diagram-based ChangE-PoinT detection (PERCEPT), which leverages the learned topological structure from TDA to sequentially detect changes. PERCEPT follows two key steps: it first learns the embedded topology as a point cloud via persistence diagrams, then applies a non-parametric monitoring approach for detecting changes in the resulting point cloud distributions. This yields a non-parametric, topology-aware framework which can efficiently detect online changes from high-dimensional data streams. We investigate the effectiveness of PERCEPT over existing methods in a suite of numerical experiments where the data streams have an embedded topological structure. We then demonstrate the usefulness of PERCEPT in two applications in solar flare monitoring and human gesture detection.

preprint, 2022, arXiv

Sequential change-point detection for mutually exciting point processes over networks

We present a new CUSUM procedure for sequentially detecting change-points in self- and mutually exciting processes, a.k.a. Hawkes networks, using discrete event data. Hawkes networks have become a popular model in statistics and machine learning due to their capability in modeling irregularly observed data where the timing between events carries a lot of information. The problem of detecting abrupt changes in Hawkes networks arises in various applications, including neuronal imaging, sensor networks, and social network monitoring. Despite this, there has not been a computationally and memory-efficient online algorithm for detecting such changes from sequential data. We present an efficient online recursive implementation of the CUSUM statistic for Hawkes processes, both decentralized and memory-efficient, and establish the theoretical properties of this new CUSUM procedure. We then show that the proposed CUSUM method achieves better performance than existing methods, including the Shewhart procedure based on count data, the generalized likelihood ratio (GLR) in the existing literature, and the standard score statistic. We demonstrate this via a simulated example and an application to population code change-detection in neuronal networks.

preprint, 2021, arXiv

Early Detection of COVID-19 Hotspots Using Spatio-Temporal Data

Recently, the Centers for Disease Control and Prevention (CDC) has worked with other federal agencies to identify counties with increasing coronavirus disease 2019 (COVID-19) incidence (hotspots) and offers support to local health departments to limit the spread of the disease. Understanding the spatio-temporal dynamics of hotspot events is of great importance to support policy decisions and prevent large-scale outbreaks. This paper presents a spatio-temporal Bayesian framework for early detection of COVID-19 hotspots (at the county level) in the United States. We assume both the observed number of cases and hotspots depend on a class of latent random variables, which encode the underlying spatio-temporal dynamics of the transmission of COVID-19. Such latent variables follow a zero-mean Gaussian process, whose covariance is specified by a non-stationary kernel function. The most salient feature of our kernel function is that deep neural networks are introduced to enhance the model's representative power while still enjoying the interpretability of the kernel. We derive a sparse model and fit the model using a variational learning strategy to circumvent the computational intractability for large data sets. Our model demonstrates better interpretability and superior hotspot-detection performance compared to other baseline methods.

preprint, 2021, arXiv

Optimality of Graph Scanning Statistic for Online Community Detection

Sequential change-point detection for graphs is a fundamental problem for streaming network data types and has wide applications in social networks and power systems. Given fixed vertices and a sequence of random graphs, the objective is to detect the change-point where the underlying distribution of the random graph changes. In particular, we focus on the local change that only affects a subgraph. We adopt the classical Erdős-Rényi model and revisit the generalized likelihood ratio (GLR) detection procedure. The scan statistic is computed by sequentially estimating the most-likely subgraph where the change happens. We provide theoretical analysis for the asymptotic optimality of the proposed procedure based on the GLR framework. We demonstrate the efficiency of our detection algorithm using simulations.

preprint, 2021, arXiv

Sequential Change Detection by Optimal Weighted $\ell_2$ Divergence

We present a new non-parametric statistic, called the weighted $\ell_2$ divergence, based on empirical distributions for sequential change detection. We start by constructing the weighted $\ell_2$ divergence as a fundamental building block for two-sample tests and change detection. The proposed statistic is proved to attain the optimal sample complexity in the offline setting. We then study the sequential change detection using the weighted $\ell_2$ divergence and characterize the fundamental performance metrics, including the average run length (ARL) and the expected detection delay (EDD). We also present practical algorithms to find the optimal projection to handle high-dimensional data and the optimal weights, which is critical to quick detection since, in such settings, there are not many post-change samples. Simulation results and real data examples are provided to validate the good performance of the proposed method.
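
As a rough illustration of the kind of statistic this abstract describes, the sketch below computes a weighted squared difference between the empirical distributions of two samples over a finite set of categories. It is one plausible reading, not the paper's definition: the function name `weighted_l2_divergence`, the restriction to discrete categories, and the caller-supplied `weights` dictionary are assumptions, and the paper's optimal weights and projection step are not reproduced here.

```python
from collections import Counter


def weighted_l2_divergence(sample_a, sample_b, weights):
    """Sum over categories of weight * (empirical freq in a - empirical freq in b)^2.

    sample_a, sample_b -- non-empty sequences of hashable category labels
    weights            -- dict mapping each category to a non-negative weight
                          (hypothetical input; the paper optimizes these)
    """
    freq_a = Counter(sample_a)
    freq_b = Counter(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    return sum(
        w * (freq_a[k] / n_a - freq_b[k] / n_b) ** 2
        for k, w in weights.items()
    )
```

For sequential use, one would typically recompute such a statistic on sliding windows of reference and recent data and raise an alarm when it exceeds a threshold; that monitoring loop is left out of this sketch.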