Researcher profile

Xiaoyu Liu

Xiaoyu Liu contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
15 works
0 followers
14 topics
4 close collaborators

Published work

15 published items

preprint · 2026 · arXiv

Delulu: A Verified Multi-Lingual Benchmark for Code Hallucination Detection in Fill-in-the-Middle Tasks

Large Language Models for code generation frequently produce hallucinations in Fill-in-the-Middle (FIM) tasks -- plausible but incorrect completions such as invented API methods, invalid parameters, undefined variables, or non-existent imports. These failures pass superficial review yet introduce runtime errors. We introduce Delulu, a verified multi-lingual benchmark of 1,951 FIM samples across 7 languages and 4 hallucination types. Samples are curated through an adversarial pipeline: a frontier LLM generates plausible hallucinations, four diverse judge models evaluate them, embedding-based clustering mines progressively harder examples, self-contained Docker containers verify that golden completions compile while hallucinated variants produce the expected runtime error, and a final human-expert review removes any remaining biased or trivially decidable samples. We evaluate 11 open-weight FIM models from five families spanning 0.5B-32B parameters: a six-point Qwen2.5-Coder scaling slate, plus a cross-family slate (CodeLlama, DeepSeek-Coder-V2, StarCoder2). The strongest model reaches only 84.5% pass@1, no family exceeds 0.77 Edit Similarity, and every family produces hallucination-aligned completions on a non-trivial share of samples, confirming that the difficulty exposed by Delulu is task-intrinsic rather than family-specific. We release the benchmark, containers, and evaluation framework at https://github.com/microsoft/delulu.

preprint · 2026 · arXiv

HUR-MACL: High-Uncertainty Region-Guided Multi-Architecture Collaborative Learning for Head and Neck Multi-Organ Segmentation

Accurate segmentation of organs at risk in the head and neck is essential for radiation therapy, yet deep learning models often fail on small, complexly shaped organs. While hybrid architectures that combine different models show promise, they typically just concatenate features without exploiting the unique strengths of each component. This results in functional overlap and limited segmentation accuracy. To address these issues, we propose a high-uncertainty region-guided multi-architecture collaborative learning (HUR-MACL) model for multi-organ segmentation in the head and neck. This model adaptively identifies high-uncertainty regions using a convolutional neural network, and for these regions, Vision Mamba and Deformable CNN are used jointly to improve segmentation accuracy. Additionally, a heterogeneous feature distillation loss is proposed to promote collaborative learning between the two architectures in high-uncertainty regions to further enhance performance. Our method achieves state-of-the-art (SOTA) results on two public datasets and one private dataset.

preprint · 2026 · arXiv

Single-index Semiparametric Transformation Cure Models with Interval-censored Data

Interval-censored data commonly arise in medical studies when the event time of interest is only known to lie within an interval. In the presence of a cure subgroup, conventional mixture cure models typically assume a logistic model for the uncure probability and a proportional hazards model for the susceptible subjects. However, in practice, the assumptions of a parametric form for the uncure probability and a proportional hazards model for the susceptible subjects may not always be satisfied. In this paper, we propose a class of flexible single-index semiparametric transformation cure models for interval-censored data, where a single-index model and a semiparametric transformation model are utilized for the uncured and conditional survival probability, respectively, encompassing both the proportional hazards cure and proportional odds cure models as specific cases. We approximate the single-index function and cumulative baseline hazard functions via the kernel technique and splines, respectively, and develop a computationally feasible expectation-maximisation (EM) algorithm, facilitated by a four-layer gamma-frailty Poisson data augmentation. Simulation studies demonstrate the satisfactory performance of our proposed method, compared to the spline-based approach and the classical logistic-based mixture cure models. The application of the proposed methodology is illustrated using the Alzheimer's dataset.

preprint · 2024 · arXiv

conv_einsum: A Framework for Representation and Fast Evaluation of Multilinear Operations in Convolutional Tensorial Neural Networks

Modern ConvNets continue to achieve state-of-the-art results over a vast array of vision and image classification tasks, but at the cost of increasing parameters. One strategy for compactifying a network without sacrificing much expressive power is to reshape it into a tensorial neural network (TNN), which is a higher-order tensorization of its layers, followed by a factorization, such as a CP-decomposition, which strips a weight down to its critical basis components. Passes through TNNs can be represented as sequences of multilinear operations (MLOs), where the evaluation path can greatly affect the number of floating point operations (FLOPs) incurred. While functions such as the popular einsum can evaluate simple MLOs such as contractions, existing implementations cannot process multi-way convolutions, resulting in scant assessments of how optimal evaluation paths through tensorized convolutional layers can improve training speed. In this paper, we develop a unifying framework for representing tensorial convolution layers as einsum-like strings and a meta-algorithm conv_einsum which is able to evaluate these strings in a FLOPs-minimizing manner. Comprehensive experiments, using our open-source implementation, over a wide range of models, tensor decompositions, and diverse tasks, demonstrate that conv_einsum significantly increases both computational and memory-efficiency of convolutional TNNs.
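
The abstract's point that the evaluation path affects the FLOP count can be illustrated with a plain three-matrix product. The following is a minimal sketch, not the conv_einsum implementation: it only counts multiply-adds for two parenthesizations of A·B·C, and the function names and shapes are hypothetical choices made for this illustration.

```python
def contraction_cost(m, k, n):
    """Multiply-add count for contracting an (m x k) matrix with a (k x n) matrix."""
    return m * k * n


def cost_left_first(a, b, c):
    """Cost of evaluating (A @ B) @ C, given shapes a=(m, k), b=(k, n), c=(n, p)."""
    m, k = a
    _, n = b
    _, p = c
    # A @ B yields an (m x n) intermediate, which is then contracted with C.
    return contraction_cost(m, k, n) + contraction_cost(m, n, p)


def cost_right_first(a, b, c):
    """Cost of evaluating A @ (B @ C) for the same shapes."""
    m, k = a
    _, n = b
    _, p = c
    # B @ C yields a (k x p) intermediate, which is then contracted with A.
    return contraction_cost(k, n, p) + contraction_cost(m, k, p)


if __name__ == "__main__":
    # Hypothetical shapes chosen so the two orders differ sharply.
    shapes = ((10, 1000), (1000, 10), (10, 1000))
    print("left first: ", cost_left_first(*shapes))
    print("right first:", cost_right_first(*shapes))
```

With these shapes the left-first order costs 200,000 multiply-adds and the right-first order costs 20,000,000, a factor of 100 for the same result. Choosing such an order automatically, and extending the choice to convolutions, is the problem the abstract describes.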

preprint · 2023 · arXiv

Improving photon number resolvability of a superconducting nanowire detector array using a level comparator circuit

Photon number resolving (PNR) capability is very important in many optical applications, including quantum information processing, fluorescence detection, and few-photon-level ranging and imaging. Superconducting nanowire single-photon detectors (SNSPDs) with a multipixel interleaved architecture give the array an excellent spatial PNR capability. However, the signal-to-noise ratio (SNR) of the photon number resolution (SNRPNR) of the array degrades as the number of elements increases, due to the electronic noise in the readout circuit, which limits the PNR resolution as well as the maximum PNR number. In this study, a 16-element interleaved SNSPD array was fabricated, and the PNR capability of the array was investigated and analyzed. By introducing a level comparator circuit (LCC), the SNRPNR of the detector array was improved by more than a factor of four. In addition, we performed a statistical analysis of the photon number on this SNSPD array with LCC, showing that the LCC method effectively enhances the PNR resolution. Moreover, the system timing jitter of the detector was reduced from 90 ps to 72 ps due to the improved electrical SNR.

preprint · 2022 · arXiv

Advanced Deep Networks for 3D Mitochondria Instance Segmentation

Mitochondria instance segmentation from electron microscopy (EM) images has seen notable progress since the introduction of deep learning methods. In this paper, we propose two advanced deep networks, named Res-UNet-R and Res-UNet-H, for 3D mitochondria instance segmentation from Rat and Human samples. Specifically, we design a simple yet effective anisotropic convolution block and deploy a multi-scale training strategy, which together boost the segmentation performance. Moreover, we enhance the generalizability of the trained models on the test set by adding a denoising operation as pre-processing. In the Large-scale 3D Mitochondria Instance Segmentation Challenge at ISBI 2021, our method ranked first. Code is available at https://github.com/Limingxing00/MitoEM2021-Challenge.

preprint · 2022 · arXiv

Learning to Reduce False Positives in Analytic Bug Detectors

Due to increasingly complex software design and rapid iterative development, code defects and security vulnerabilities are prevalent in modern software. In response, programmers rely on static analysis tools to regularly scan their codebases and find potential bugs. In order to maximize coverage, however, these tools generally tend to report a significant number of false positives, requiring developers to manually verify each warning. To address this problem, we propose a Transformer-based learning approach to identify false positive bug warnings. We demonstrate that our models can improve the precision of static analysis by 17.5%. In addition, we validated the generalizability of this approach across two major bug types: null dereference and resource leak.

preprint · 2022 · arXiv

Long-Tail Prediction Uncertainty Aware Trajectory Planning for Self-driving Vehicles

A typical trajectory planner of autonomous driving commonly relies on predicting the future behavior of surrounding obstacles. Recently, deep learning technology has been widely adopted to design prediction models due to their impressive performance. However, such models may fail in the "long-tail" driving cases where the training data is sparse or unavailable, leading to planner failures. To this end, this work proposes a trajectory planner to consider the prediction model uncertainty arising from insufficient data for safer performance. Firstly, an ensemble network structure estimates the prediction model's uncertainty due to insufficient training data. Then a trajectory planner is designed to consider the worst-case arising from prediction uncertainty. The results show that the proposed method can improve the safety of trajectory planning under the prediction uncertainty caused by insufficient data. At the same time, with sufficient data, the framework will not lead to overly conservative results. This technology helps to improve the safety and reliability of autonomous vehicles under the long-tail data distribution of the real world.
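
The idea of using an ensemble to estimate prediction uncertainty, and then planning against a worst case, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's network or planner: disagreement between ensemble members is taken as the uncertainty signal, and the names `ensemble_estimate` and `padded_margin` and the scaling factor `k` are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def ensemble_estimate(predictions):
    """Return the mean and variance of one predicted quantity across ensemble members.

    `predictions` holds each member's estimate of the same value, for example an
    obstacle's future lateral position in metres. High variance means the members
    disagree, which is assumed here to indicate sparse training data.
    """
    return mean(predictions), pvariance(predictions)


def padded_margin(base_margin, variance, k=2.0):
    """Widen a safety margin by k standard deviations of ensemble disagreement.

    The scaling factor k is a hypothetical tuning parameter for this sketch.
    """
    return base_margin + k * sqrt(variance)


if __name__ == "__main__":
    # Members agree: margin stays at its base value.
    _, agree_var = ensemble_estimate([2.0, 2.0, 2.0])
    print(padded_margin(1.0, agree_var))

    # Members disagree: margin grows with the disagreement.
    _, disagree_var = ensemble_estimate([1.0, 3.0])
    print(padded_margin(1.0, disagree_var))
```

When the members agree the margin is unchanged, which mirrors the abstract's claim that sufficient data should not make the planner overly conservative.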

preprint · 2022 · arXiv

Occluded Video Instance Segmentation: A Benchmark

Can our video understanding systems perceive objects when a heavy occlusion exists in a scene? To answer this question, we collect a large-scale dataset called OVIS for occluded video instance segmentation, that is, to simultaneously detect, segment, and track instances in occluded scenes. OVIS consists of 296k high-quality instance masks from 25 semantic categories, where object occlusions usually occur. While our human vision systems can understand those occluded instances by contextual reasoning and association, our experiments suggest that current video understanding systems cannot. On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 16.3, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario. We also present a simple plug-and-play module that performs temporal feature calibration to complement missing object cues caused by occlusion. Built upon MaskTrack R-CNN and SipMask, we obtain a remarkable AP improvement on the OVIS dataset. The OVIS dataset and project code are available at http://songbai.site/ovis .

preprint · 2022 · arXiv

Theoretical analysis of single-ion anisotropy in $d^3$ Mott insulators

An effective spin model for Mott insulators is determined by the symmetries involved among magnetic sites, electron fillings, and their interactions. Such a spin Hamiltonian offers insight into mechanisms of magnetic orders and magnetic anisotropy beyond the Heisenberg model. For a spin moment S greater than 1/2, single-ion anisotropy is in principle allowed. However, for $d^3$ Mott insulators with large cubic crystal field splitting, the single-ion anisotropy is absent within the LS coupling, despite the S = 3/2 local moment. On the other hand, preferred magnetic moment directions in $d^3$ materials have been reported, which calls for a further theoretical investigation. Here we derive the single-ion anisotropy interaction using the strong-coupling perturbation theory. The cubic crystal field splitting including $e_g$ orbitals, trigonal distortions, Hund's coupling, and spin-orbit coupling beyond the LS scheme are taken into account. For compressed distortion, the spin-orbit coupling at magnetic sites can favor either the easy-axis or the easy-plane while that of anions leads to easy-axis anisotropy. We apply the theory to $\rm{CrX}_3$ with X = Cl and I, and show the dependence of the single-ion anisotropy on the strength of the spin-orbit couplings of both magnetic and anion sites. The significance of the single-ion anisotropy in ideal two-dimensional magnets is also discussed.

preprint · 2020 · arXiv

An empirical study of Conv-TasNet

Conv-TasNet is a recently proposed waveform-based deep neural network that achieves state-of-the-art performance in speech source separation. Its architecture consists of a learnable encoder/decoder and a separator that operates on top of this learned space. Various improvements have been proposed to Conv-TasNet. However, they mostly focus on the separator, leaving its encoder/decoder as a (shallow) linear operator. In this paper, we conduct an empirical study of Conv-TasNet and propose an enhancement to the encoder/decoder that is based on a (deep) non-linear variant of it. In addition, we experiment with the larger and more diverse LibriTTS dataset and investigate the generalization capabilities of the studied models when trained on a much larger dataset. We propose cross-dataset evaluation that includes assessing separations from the WSJ0-2mix, LibriTTS and VCTK databases. Our results show that enhancements to the encoder/decoder can improve average SI-SNR performance by more than 1 dB. Furthermore, we offer insights into the generalization capabilities of Conv-TasNet and the potential value of improvements to the encoder/decoder.

preprint · 2020 · arXiv

Efficient Receive Beamformers for Secure Spatial Modulation against a Malicious Full-duplex Attacker with Eavesdropping Ability

In this paper, we consider a new secure spatial modulation scenario with a full-duplex (FD) malicious attacker, Mallory, who has eavesdropping capability: Mallory operates in FD mode and transmits malicious jamming, such as artificial noise (AN), to interfere with Bob. To suppress the malicious jamming on Bob from Mallory, a conventional maximum receive power (Max-RP) at Bob is presented first. Subsequently, to exploit the colored property of noise plus interference at Bob, a whitening-filter-based Max-RP (Max-WFRP) is proposed with an obvious performance enhancement over Max-RP. To completely remove the malicious jamming from Mallory, a Max-RP with a constraint of forcing the malicious jamming from Mallory to zero at Bob is proposed. To further improve the secrecy rate (SR) by removing the ZF constraint (ZFC), the maximum signal-to-jamming-plus-noise ratio (Max-SJNR) is proposed. Our proposed methods have closed-form expressions. From simulation results, the four receive beamforming methods have an increasing order in performance: Max-RP, Max-RP with ZFC and Max-SJNR$\approx$Max-WFRP. Additionally, the latter two achieve substantial performance gains over Max-RP and Max-RP with ZFC in the low and medium signal-to-noise ratio regions.

preprint · 2020 · arXiv

Ensemble Wrapper Subsampling for Deep Modulation Classification

Subsampling of received wireless signals is important for relaxing hardware requirements as well as the computational cost of signal processing algorithms that rely on the output samples. We propose a subsampling technique to facilitate the use of deep learning for automatic modulation classification in wireless communication systems. Unlike traditional approaches that rely on pre-designed strategies that are solely based on expert knowledge, the proposed data-driven subsampling strategy employs deep neural network architectures to simulate the effect of removing candidate combinations of samples from each training input vector, in a manner inspired by how wrapper feature selection models work. The subsampled data is then processed by another deep learning classifier that recognizes each of the considered 10 modulation types. We show that the proposed subsampling strategy not only introduces drastic reduction in the classifier training time, but can also improve the classification accuracy to higher levels than those reached before for the considered dataset. An important feature herein is exploiting the transferability property of deep neural networks to avoid retraining the wrapper models and obtain superior performance through an ensemble of wrappers over that possible through solely relying on any of them.

preprint · 2020 · arXiv

PairDiag: an exact diagonalization program for solving general pairing Hamiltonians

We present a program for solving exactly the general pairing Hamiltonian based on diagonalization. The program generates the seniority-zero shell-model-like basis vectors via the `01' inversion algorithm. The Hamiltonian matrix is constructed in that seniority-zero space. The program evaluates all non-zero elements of the Hamiltonian matrix "on the fly" using the scattering operator and the search algorithm that act on the generated basis. The matrix is diagonalized by using the iterative Lanczos algorithm. The program thus developed, PairDiag, can calculate efficiently the ground-state eigenvalue and eigenvector of any pairing Hamiltonian. The program can be easily implemented to replace the BCS approximation in standard self-consistent mean-field calculations. The code is parallelized using OpenMP. For larger systems with dimension around $10^{8}$ to $10^{9}$, the calculation can be done within a day on standard desktop computers.

preprint · 2020 · arXiv

Precoding and Transmit Antenna Subarray Selection for Secure Hybrid Spatial Modulation

Spatial modulation (SM) is a particularly important form of multiple-input-multiple-output (MIMO). Unlike traditional MIMO, it uses both modulation symbols and antenna indices to carry information. In this paper, to avoid the high cost and circuit complexity of fully-digital SM, we mainly consider the hybrid SM system with a hybrid precoding transmitter architecture, combining a digital precoder and an analog precoder. Here, the partially-connected structure is adopted with each radio frequency (RF) chain being connected to a transmit antenna subarray (TAS). In such a system, we investigate secure hybrid precoding and transmit antenna subarray selection (TASS) methods. Two hybrid precoding methods, called maximizing the approximate secrecy rate (SR) via gradient ascent (Max-ASR-GA) and maximizing the approximate SR via alternating direction method of multipliers (Max-ASR-ADMM), are proposed to improve the SR performance. As for TASS, a high-performance method of maximizing the approximate SR (Max-ASR) is first presented. To reduce its high complexity, two low-complexity TASS methods, namely maximizing the eigenvalue (Max-EV) and maximizing the product of signal-to-interference-plus-noise ratio and artificial noise-to-signal-plus-noise ratio (Max-P-SINR-ANSNR), are proposed. Simulation results demonstrate that the proposed Max-ASR-GA and Max-ASR-ADMM hybrid precoders achieve substantial SR performance gains over the existing method. For TASS, the three proposed methods, Max-ASR, Max-EV, and Max-P-SINR-ANSNR, perform better than the existing leakage method. In particular, the proposed Max-EV and Max-P-SINR-ANSNR are low-complexity at the expense of a small performance loss compared with Max-ASR.