Source author record

Xiaohui Xie

Xiaohui Xie appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computer Vision Information Retrieval Artificial Intelligence eess.IV math.OC Computation Computation and Language Human-Computer Interaction Information Theory math.IT math.NA Methodology Multimedia Networking and Internet Architecture Numerical Analysis Populations and Evolution

Catalog footprint

What is connected

21works

17topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Bayesian BiLO: Bilevel Local Operator Learning for Efficient Uncertainty Quantification of Bayesian PDE Inverse Problems with Low-Rank Adaptation

Uncertainty quantification in PDE inverse problems is essential in many applications. Scientific machine learning and AI enable data-driven learning of model components while preserving physical structure, and provide the scalability and adaptability needed for emerging imaging technologies and clinical insights. We develop a Bilevel Local Operator Learning framework for Bayesian inference in PDEs (B-BiLO). At the upper level, we sample parameters from the posterior via Hamiltonian Monte Carlo, while at the lower level we fine-tune a neural network via low-rank adaptation (LoRA) to approximate the solution operator locally. B-BiLO enables efficient gradient-based sampling without synthetic data or adjoint equations and avoids sampling in high-dimensional weight space, as in Bayesian neural networks, by optimizing weights deterministically. We analyze errors from approximate lower-level optimization and establish their impact on posterior accuracy. Numerical experiments across PDE models, including tumor growth, demonstrate that B-BiLO achieves accurate and efficient uncertainty quantification.

preprint2026arXiv

Bias in the Shadows: Explore Shortcuts in Encrypted Network Traffic Classification

Pre-trained models operating directly on raw bytes have achieved promising performance in encrypted network traffic classification (NTC), but often suffer from shortcut learning-relying on spurious correlations that fail to generalize to real-world data. Existing solutions heavily rely on model-specific interpretation techniques, which lack adaptability and generality across different model architectures and deployment scenarios. In this paper, we propose BiasSeeker, the first semi-automated framework that is both model-agnostic and data-driven for detecting dataset-specific shortcut features in encrypted traffic. By performing statistical correlation analysis directly on raw binary traffic, BiasSeeker identifies spurious or environment-entangled features that may compromise generalization, independent of any classifier. To address the diverse nature of shortcut features, we introduce a systematic categorization and apply category-specific validation strategies that reduce bias while preserving meaningful information. We evaluate BiasSeeker on 19 public datasets across three NTC tasks. By emphasizing context-aware feature selection and dataset-specific diagnosis, BiasSeeker offers a novel perspective for understanding and addressing shortcut learning in encrypted network traffic classification, raising awareness that feature selection should be an intentional and scenario-sensitive step prior to model training.

preprint2026arXiv

BiLO: Bilevel Local Operator Learning for PDE Inverse Problems

We propose a new neural network based method for solving inverse problems for partial differential equations (PDEs) by formulating the PDE inverse problem as a bilevel optimization problem. At the upper level, we minimize the data loss with respect to the PDE parameters. At the lower level, we train a neural network to locally approximate the PDE solution operator in the neighborhood of a given set of PDE parameters, which enables an accurate approximation of the descent direction for the upper level optimization problem. The lower level loss function includes the L2 norms of both the residual and its derivative with respect to the PDE parameters. We apply gradient descent simultaneously on both the upper and lower level optimization problems, leading to an effective and fast algorithm. The method, which we refer to as BiLO (Bilevel Local Operator learning), is also able to efficiently infer unknown functions in the PDEs through the introduction of an auxiliary variable. We provide a theoretical analysis that justifies our approach. Through extensive experiments over multiple PDE systems, we demonstrate that our method enforces strong PDE constraints, is robust to sparse and noisy data, and eliminates the need to balance the residual and the data loss, which is inherent to the soft PDE constraints in many existing methods.

preprint2026arXiv

UniAlign: A Model-Agnostic Framework for Robust Network Traffic Classification under Distribution Shifts

Network traffic classification (NTC) models often suffer severe performance degradation when deployed in real-world environments due to distribution shifts caused by changing network conditions. Existing robustness-enhancing approaches are commonly coupled to specific model architectures or data settings, fail to generalize to state-of-the-art raw-byte-based NTC models, or incur significant training overhead. In this paper, we propose UniAlign, a novel model-agnostic framework that improves the robustness of deep learning-based NTC models under distribution shifts. UniAlign combines \emph{domain alignment fine-tuning}, which encourages the learning of domain-invariant traffic representations across heterogeneous network conditions, with \emph{stable model ensembling}, which enhances inference robustness by aggregating checkpoints within a flat loss region. The framework can be seamlessly integrated into existing supervised NTC models without requiring specific feature modalities or introducing non-constant additional training costs. We evaluate UniAlign on three public datasets covering diverse distribution shifts, including encryption schemes, data collection devices, and attack behaviors. Experimental results on two representative NTC models demonstrate that, compared with standard training, UniAlign improves average classification accuracy by 2.51\% and average F1 score by 2.71\%, outperforming the strongest baseline by 1.45\% in accuracy and 1.69\% in F1 score, while requiring only 12.4\%--53.9\% of the training time of all NTC-specific baselines.

preprint2022arXiv

Brain Topography Adaptive Network for Satisfaction Modeling in Interactive Information Access System

With the growth of information on the Web, most users heavily rely on information access systems (e.g., search engines, recommender systems, etc.) in their daily lives. During this procedure, modeling users' satisfaction status plays an essential part in improving their experiences with the systems. In this paper, we aim to explore the benefits of using Electroencephalography (EEG) signals for satisfaction modeling in interactive information access system design. Different from existing EEG classification tasks, the arisen of satisfaction involves multiple brain functions, such as arousal, prototypicality, and appraisals, which are related to different brain topographical areas. Thus modeling user satisfaction raises great challenges to existing solutions. To address this challenge, we propose BTA, a Brain Topography Adaptive network with a multi-centrality encoding module and a spatial attention mechanism module to capture cognitive connectivities in different spatial distances. We explore the effectiveness of BTA for satisfaction modeling in two popular information access scenarios, i.e., search and recommendation. Extensive experiments on two real-world datasets verify the effectiveness of introducing brain topography adaptive strategy in satisfaction modeling. Furthermore, we also conduct search result re-ranking task and video rating prediction task based on the satisfaction inferred from brain signals on search and recommendation scenarios, respectively. Experimental results show that brain signals extracted with BTA help improve the performance of interactive information access systems significantly.

preprint2022arXiv

Disentangled Modeling of Domain and Relevance for Adaptable Dense Retrieval

Recent advance in Dense Retrieval (DR) techniques has significantly improved the effectiveness of first-stage retrieval. Trained with large-scale supervised data, DR models can encode queries and documents into a low-dimensional dense space and conduct effective semantic matching. However, previous studies have shown that the effectiveness of DR models would drop by a large margin when the trained DR models are adopted in a target domain that is different from the domain of the labeled data. One of the possible reasons is that the DR model has never seen the target corpus and thus might be incapable of mitigating the difference between the training and target domains. In practice, unfortunately, training a DR model for each target domain to avoid domain shift is often a difficult task as it requires additional time, storage, and domain-specific data labeling, which are not always available. To address this problem, in this paper, we propose a novel DR framework named Disentangled Dense Retrieval (DDR) to support effective and flexible domain adaptation for DR models. DDR consists of a Relevance Estimation Module (REM) for modeling domain-invariant matching patterns and several Domain Adaption Modules (DAMs) for modeling domain-specific features of multiple target corpora. By making the REM and DAMs disentangled, DDR enables a flexible training paradigm in which REM is trained with supervision once and DAMs are trained with unsupervised data. Comprehensive experiments in different domains and languages show that DDR significantly improves ranking performance compared to strong DR baselines and substantially outperforms traditional retrieval methods in most scenarios.

preprint2022arXiv

Evaluating Interpolation and Extrapolation Performance of Neural Retrieval Models

A retrieval model should not only interpolate the training data but also extrapolate well to the queries that are different from the training data. While neural retrieval models have demonstrated impressive performance on ad-hoc search benchmarks, we still know little about how they perform in terms of interpolation and extrapolation. In this paper, we demonstrate the importance of separately evaluating the two capabilities of neural retrieval models. Firstly, we examine existing ad-hoc search benchmarks from the two perspectives. We investigate the distribution of training and test data and find a considerable overlap in query entities, query intent, and relevance labels. This finding implies that the evaluation on these test sets is biased toward interpolation and cannot accurately reflect the extrapolation capacity. Secondly, we propose a novel evaluation protocol to separately evaluate the interpolation and extrapolation performance on existing benchmark datasets. It resamples the training and test data based on query similarity and utilizes the resampled dataset for training and evaluation. Finally, we leverage the proposed evaluation protocol to comprehensively revisit a number of widely-adopted neural retrieval models. Results show models perform differently when moving from interpolation to extrapolation. For example, representation-based retrieval models perform almost as well as interaction-based retrieval models in terms of interpolation but not extrapolation. Therefore, it is necessary to separately evaluate both interpolation and extrapolation performance and the proposed resampling method serves as a simple yet effective evaluation tool for future IR studies.

preprint2022arXiv

Pre-training Methods in Information Retrieval

The core of information retrieval (IR) is to identify relevant information from large-scale resources and return it as a ranked list to respond to the user's information need. In recent years, the resurgence of deep learning has greatly advanced this field and leads to a hot topic named NeuIR (i.e., neural information retrieval), especially the paradigm of pre-training methods (PTMs). Owing to sophisticated pre-training objectives and huge model size, pre-trained models can learn universal language representations from massive textual data, which are beneficial to the ranking task of IR. Recently, a large number of works, which are dedicated to the application of PTMs in IR, have been introduced to promote the retrieval performance. Considering the rapid progress of this direction, this survey aims to provide a systematic review of pre-training methods in IR. To be specific, we present an overview of PTMs applied in different components of an IR system, including the retrieval component, the re-ranking component, and other components. In addition, we also introduce PTMs specifically designed for IR, and summarize available datasets as well as benchmark leaderboards. Moreover, we discuss some open challenges and highlight several promising directions, with the hope of inspiring and facilitating more works on these topics for future research.

preprint2022arXiv

Topology-Preserving Shape Reconstruction and Registration via Neural Diffeomorphic Flow

Deep Implicit Functions (DIFs) represent 3D geometry with continuous signed distance functions learned through deep neural nets. Recently DIFs-based methods have been proposed to handle shape reconstruction and dense point correspondences simultaneously, capturing semantic relationships across shapes of the same class by learning a DIFs-modeled shape template. These methods provide great flexibility and accuracy in reconstructing 3D shapes and inferring correspondences. However, the point correspondences built from these methods do not intrinsically preserve the topology of the shapes, unlike mesh-based template matching methods. This limits their applications on 3D geometries where underlying topological structures exist and matter, such as anatomical structures in medical images. In this paper, we propose a new model called Neural Diffeomorphic Flow (NDF) to learn deep implicit shape templates, representing shapes as conditional diffeomorphic deformations of templates, intrinsically preserving shape topologies. The diffeomorphic deformation is realized by an auto-decoder consisting of Neural Ordinary Differential Equation (NODE) blocks that progressively map shapes to implicit templates. We conduct extensive experiments on several medical image organ segmentation datasets to evaluate the effectiveness of NDF on reconstructing and aligning shapes. NDF achieves consistently state-of-the-art organ shape reconstruction and registration results in both accuracy and quality. The source code is publicly available at https://github.com/Siwensun/Neural_Diffeomorphic_Flow--NDF.

preprint2022arXiv

Towards a Better Understanding Human Reading Comprehension with Brain Signals

Reading comprehension is a complex cognitive process involving many human brain activities. Plenty of works have studied the patterns and attention allocations of reading comprehension in information retrieval related scenarios. However, little is known about what happens in human brain during reading comprehension and how these cognitive activities can affect information retrieval process. Additionally, with the advances in brain imaging techniques such as electroencephalogram (EEG), it is possible to collect brain signals in almost real time and explore whether it can be utilized as feedback to facilitate information acquisition performance. In this paper, we carefully design a lab-based user study to investigate brain activities during reading comprehension. Our findings show that neural responses vary with different types of reading contents, i.e., contents that can satisfy users' information needs and contents that cannot. We suggest that various cognitive activities, e.g., cognitive loading, semantic-thematic understanding, and inferential processing, underpin these neural responses at the micro-time scale during reading comprehension. From these findings, we illustrate several insights for information retrieval tasks, such as ranking models construction and interface design. Besides, we suggest the possibility of detecting reading comprehension status for a proactive real-world system. To this end, we propose a Unified framework for EEG-based Reading Comprehension Modeling (UERCM). To verify its effectiveness, we conduct extensive experiments based on EEG features for two reading comprehension tasks: answer sentence classification and answer extraction. Results show that it is feasible to improve the performance of two tasks with brain signals.

preprint2020arXiv

AttentionAnatomy: A unified framework for whole-body organs at risk segmentation using multiple partially annotated datasets

Organs-at-risk (OAR) delineation in computed tomography (CT) is an important step in Radiation Therapy (RT) planning. Recently, deep learning based methods for OAR delineation have been proposed and applied in clinical practice for separate regions of the human body (head and neck, thorax, and abdomen). However, there are few researches regarding the end-to-end whole-body OARs delineation because the existing datasets are mostly partially or incompletely annotated for such task. In this paper, our proposed end-to-end convolutional neural network model, called \textbf{AttentionAnatomy}, can be jointly trained with three partially annotated datasets, segmenting OARs from whole body. Our main contributions are: 1) an attention module implicitly guided by body region label to modulate the segmentation branch output; 2) a prediction re-calibration operation, exploiting prior information of the input images, to handle partial-annotation(HPA) problem; 3) a new hybrid loss function combining batch Dice loss and spatially balanced focal loss to alleviate the organ size imbalance problem. Experimental results of our proposed framework presented significant improvements in both Sørensen-Dice coefficient (DSC) and 95\% Hausdorff distance compared to the baseline model.

preprint2020arXiv

Dynamically Pruned Message Passing Networks for Large-Scale Knowledge Graph Reasoning

We propose Dynamically Pruned Message Passing Networks (DPMPN) for large-scale knowledge graph reasoning. In contrast to existing models, embedding-based or path-based, we learn an input-dependent subgraph to explicitly model reasoning process. Subgraphs are dynamically constructed and expanded by applying graphical attention mechanism conditioned on input queries. In this way, we not only construct graph-structured explanations but also enable message passing designed in Graph Neural Networks (GNNs) to scale with graph sizes. We take the inspiration from the consciousness prior proposed by and develop a two-GNN framework to simultaneously encode input-agnostic full graph representation and learn input-dependent local one coordinated by an attention module. Experiments demonstrate the reasoning capability of our model that is to provide clear graphical explanations as well as deliver accurate predictions, outperforming most state-of-the-art methods in knowledge base completion tasks.

preprint2020arXiv

Nonparametric Structure Regularization Machine for 2D Hand Pose Estimation

Hand pose estimation is more challenging than body pose estimation due to severe articulation, self-occlusion and high dexterity of the hand. Current approaches often rely on a popular body pose algorithm, such as the Convolutional Pose Machine (CPM), to learn 2D keypoint features. These algorithms cannot adequately address the unique challenges of hand pose estimation, because they are trained solely based on keypoint positions without seeking to explicitly model structural relationship between them. We propose a novel Nonparametric Structure Regularization Machine (NSRM) for 2D hand pose estimation, adopting a cascade multi-task architecture to learn hand structure and keypoint representations jointly. The structure learning is guided by synthetic hand mask representations, which are directly computed from keypoint positions, and is further strengthened by a novel probabilistic representation of hand limbs and an anatomically inspired composition strategy of mask synthesis. We conduct extensive studies on two public datasets - OneHand 10k and CMU Panoptic Hand. Experimental results demonstrate that explicitly enforcing structure learning consistently improves pose estimation accuracy of CPM baseline models, by 1.17% on the first dataset and 4.01% on the second one. The implementation and experiment code is freely available online. Our proposal of incorporating structural learning to hand pose estimation requires no additional training information, and can be a generic add-on module to other pose estimation models.

preprint2020arXiv

Rotation-invariant Mixed Graphical Model Network for 2D Hand Pose Estimation

In this paper, we propose a new architecture named Rotation-invariant Mixed Graphical Model Network (R-MGMN) to solve the problem of 2D hand pose estimation from a monocular RGB image. By integrating a rotation net, the R-MGMN is invariant to rotations of the hand in the image. It also has a pool of graphical models, from which a combination of graphical models could be selected, conditioning on the input image. Belief propagation is performed on each graphical model separately, generating a set of marginal distributions, which are taken as the confidence maps of hand keypoint positions. Final confidence maps are obtained by aggregating these confidence maps together. We evaluate the R-MGMN on two public hand pose datasets. Experiment results show our model outperforms the state-of-the-art algorithm which is widely used in 2D hand pose estimation by a noticeable margin.

preprint2016arXiv

Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions. Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM network for skeleton based action recognition. Inspired by the observation that the co-occurrences of the joints intrinsically characterize human actions, we take the skeleton as the input at each time slot and introduce a novel regularization scheme to learn the co-occurrence features of skeleton joints. To train the deep LSTM network effectively, we propose a new dropout algorithm which simultaneously operates on the gates, cells, and output responses of the LSTM neurons. Experimental results on three human action recognition datasets consistently demonstrate the effectiveness of the proposed model.

preprint2016arXiv

Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification

Mammogram classification is directly related to computer-aided diagnosis of breast cancer. Traditional methods requires great effort to annotate the training data by costly manual labeling and specialized computational models to detect these annotations during test. Inspired by the success of using deep convolutional features for natural image analysis and multi-instance learning for labeling a set of instances/patches, we propose end-to-end trained deep multi-instance networks for mass classification based on whole mammogram without the aforementioned costly need to annotate the training data. We explore three different schemes to construct deep multi-instance networks for whole mammogram classification. Experimental results on the INbreast dataset demonstrate the robustness of proposed deep networks compared to previous work using segmentation and detection annotations in the training.

preprint2013arXiv

Inference in Kingman's Coalescent with Particle Markov Chain Monte Carlo Method

We propose a new algorithm to do posterior sampling of Kingman's coalescent, based upon the Particle Markov Chain Monte Carlo methodology. Specifically, the algorithm is an instantiation of the Particle Gibbs Sampling method, which alternately samples coalescent times conditioned on coalescent tree structures, and tree structures conditioned on coalescent times via the conditional Sequential Monte Carlo procedure. We implement our algorithm as a C++ package, and demonstrate its utility via a parameter estimation task in population genetics on both single- and multiple-locus data. The experiment results show that the proposed algorithm performs comparable to or better than several well-developed methods.

preprint2011arXiv

Efficient Latent Variable Graphical Model Selection via Split Bregman Method

We consider the problem of covariance matrix estimation in the presence of latent variables. Under suitable conditions, it is possible to learn the marginal covariance matrix of the observed variables via a tractable convex program, where the concentration matrix of the observed variables is decomposed into a sparse matrix (representing the graphical structure of the observed variables) and a low rank matrix (representing the marginalization effect of latent variables). We present an efficient first-order method based on split Bregman to solve the convex problem. The algorithm is guaranteed to converge under mild conditions. We show that our algorithm is significantly faster than the state-of-the-art algorithm on both artificial and real-world data. Applying the algorithm to a gene expression data involving thousands of genes, we show that most of the correlation between observed variables can be explained by only a few dozen latent factors.

preprint2010arXiv

Learning sparse gradients for variable selection and dimension reduction

Variable selection and dimension reduction are two commonly adopted approaches for high-dimensional data analysis, but have traditionally been treated separately. Here we propose an integrated approach, called sparse gradient learning (SGL), for variable selection and dimension reduction via learning the gradients of the prediction function directly from samples. By imposing a sparsity constraint on the gradients, variable selection is achieved by selecting variables corresponding to non-zero partial derivatives, and effective dimensions are extracted based on the eigenvectors of the derived sparse empirical gradient covariance matrix. An error analysis is given for the convergence of the estimated gradients to the true ones in both the Euclidean and the manifold setting. We also develop an efficient forward-backward splitting algorithm to solve the SGL problem, making the framework practically scalable for medium or large datasets. The utility of SGL for variable selection and feature extraction is explicitly given and illustrated on artificial data as well as real-world examples. The main advantages of our method include variable selection for both linear and nonlinear predictions, effective dimension reduction with sparse loadings, and an efficient algorithm for large p, small n problems.

preprint2010arXiv

Split Bregman method for large scale fused Lasso

rdering of regression or classification coefficients occurs in many real-world applications. Fused Lasso exploits this ordering by explicitly regularizing the differences between neighboring coefficients through an $\ell_1$ norm regularizer. However, due to nonseparability and nonsmoothness of the regularization term, solving the fused Lasso problem is computationally demanding. Existing solvers can only deal with problems of small or medium size, or a special case of the fused Lasso problem in which the predictor matrix is identity matrix. In this paper, we propose an iterative algorithm based on split Bregman method to solve a class of large-scale fused Lasso problems, including a generalized fused Lasso and a fused Lasso support vector classifier. We derive our algorithm using augmented Lagrangian method and prove its convergence properties. The performance of our method is tested on both artificial data and real-world applications including proteomic data from mass spectrometry and genomic data from array CGH. We demonstrate that our method is many times faster than the existing solvers, and show that it is especially efficient for large p, small n problems.

preprint2010arXiv

Split Bregman Method for Sparse Inverse Covariance Estimation with Matrix Iteration Acceleration

We consider the problem of estimating the inverse covariance matrix by maximizing the likelihood function with a penalty added to encourage the sparsity of the resulting matrix. We propose a new approach based on the split Bregman method to solve the regularized maximum likelihood estimation problem. We show that our method is significantly faster than the widely used graphical lasso method, which is based on blockwise coordinate descent, on both artificial and real-world data. More importantly, different from the graphical lasso, the split Bregman based method is much more general, and can be applied to a class of regularization terms other than the $\ell_1$ norm

Xiaohui Xie

What is connected

Connect this record

See the researcher in context

Building this map preview

21 published item(s)

Bayesian BiLO: Bilevel Local Operator Learning for Efficient Uncertainty Quantification of Bayesian PDE Inverse Problems with Low-Rank Adaptation

Bias in the Shadows: Explore Shortcuts in Encrypted Network Traffic Classification

BiLO: Bilevel Local Operator Learning for PDE Inverse Problems

UniAlign: A Model-Agnostic Framework for Robust Network Traffic Classification under Distribution Shifts

Brain Topography Adaptive Network for Satisfaction Modeling in Interactive Information Access System

Disentangled Modeling of Domain and Relevance for Adaptable Dense Retrieval

Evaluating Interpolation and Extrapolation Performance of Neural Retrieval Models

Pre-training Methods in Information Retrieval

Topology-Preserving Shape Reconstruction and Registration via Neural Diffeomorphic Flow

Towards a Better Understanding Human Reading Comprehension with Brain Signals

AttentionAnatomy: A unified framework for whole-body organs at risk segmentation using multiple partially annotated datasets

Dynamically Pruned Message Passing Networks for Large-Scale Knowledge Graph Reasoning

Nonparametric Structure Regularization Machine for 2D Hand Pose Estimation

Rotation-invariant Mixed Graphical Model Network for 2D Hand Pose Estimation

Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks

Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification

Inference in Kingman's Coalescent with Particle Markov Chain Monte Carlo Method

Efficient Latent Variable Graphical Model Selection via Split Bregman Method

Learning sparse gradients for variable selection and dimension reduction

Split Bregman method for large scale fused Lasso

Split Bregman Method for Sparse Inverse Covariance Estimation with Matrix Iteration Acceleration