Source author record

Yuxuan Xia

Yuxuan Xia appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, eess.SP, Applications, Artificial Intelligence, eess.SY, Machine Learning, Systems and Control, eess.AS, Sound

Catalog footprint

What is connected

11 works

9 topics

4 close collaborators


Preprint, 2026, arXiv

Parallel Prefix Verification for Speculative Generation

We introduce PARSE (PArallel pRefix Speculative Engine), a speculative generation framework that accelerates large language model (LLM) inference by parallelizing prefix verification at the semantic level. Existing speculative decoding methods are fundamentally limited by token-level equivalence: the target model must verify each token, leading to short acceptance lengths and modest speedups. Moving to semantic or segment-level verification can substantially increase acceptance granularity, but prior approaches rely on sequential verification, introducing significant overhead and limiting practical gains. PARSE introduces parallel prefix verification, enabling semantic-level verification without sequential checks. Given a full draft from a draft model, the target model evaluates correctness across multiple prefixes in a single forward pass using a custom attention mask, directly identifying the maximal valid prefix. This eliminates sequential segment verification and makes verification compute-efficient. PARSE is orthogonal to token-level speculative decoding and can be composed with it for additional gains. Across models and benchmarks, PARSE delivers $1.25\times$ to $4.3\times$ throughput gains over the target model, and $1.6\times$ to $4.5\times$ when composed with EAGLE-3, all with negligible accuracy degradation. This demonstrates parallel prefix verification as an effective, general approach to accelerating LLM inference.
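The abstract does not spell out the attention mask, so the following is only a minimal sketch of one way a single pass could score several prefixes at once. It assumes a hypothetical layout in which the draft tokens are followed by one verification slot per prefix, and each slot may attend only to the draft tokens of its own prefix. The function names, the slot layout, and the rule "accept up to the first rejected prefix" are all illustrative assumptions, not the PARSE implementation.

```python
from typing import List


def build_prefix_mask(segment_lengths: List[int]) -> List[List[bool]]:
    """Return a boolean mask; mask[q][k] is True when query q may attend to key k.

    Assumed layout: all draft tokens first, then one verification slot per
    prefix (prefix i = segments 0..i). This layout is hypothetical.
    """
    total = sum(segment_lengths)
    size = total + len(segment_lengths)
    mask = [[False] * size for _ in range(size)]

    # Draft tokens use ordinary causal attention.
    for q in range(total):
        for k in range(q + 1):
            mask[q][k] = True

    # Slot i sees only the draft tokens inside prefix i, plus itself.
    end = 0
    for i, length in enumerate(segment_lengths):
        end += length
        q = total + i
        for k in range(end):
            mask[q][k] = True
        mask[q][q] = True
    return mask


def maximal_valid_prefix(prefix_ok: List[bool]) -> int:
    """Count the leading segments accepted, stopping at the first rejection."""
    accepted = 0
    for ok in prefix_ok:
        if not ok:
            break
        accepted += 1
    return accepted
```

In this sketch the per-prefix verdicts (`prefix_ok`) would come from whatever judgment the target model produces at each verification slot; how that judgment is made is not described in the abstract and is left open here.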

Preprint, 2026, arXiv

SDCD: Structure-Disrupted Contrastive Decoding for Mitigating Hallucinations in Large Vision-Language Models

Large Vision-Language Models (LVLMs) demonstrate significant progress in multimodal understanding and reasoning, yet object hallucination remains a critical challenge. While existing research focuses on mitigating language priors or high-level statistical biases, it often overlooks the internal complexities of the visual encoding process. We identify that visual statistical bias, arising from the inherent Bag-of-Patches behavior of Vision Encoders under weak structural supervision, acts as a contributing factor to object hallucinations. Under this bias, models prioritize local texture features within individual patches over holistic geometric structures. This tendency may induce spurious visual confidence and result in hallucinations. To address this, we introduce a training-free algorithm called Structure-Disrupted Contrastive Decoding (SDCD), which performs contrastive calibration of the output distribution by introducing a shuffled structure-disrupted view. By penalizing tokens that maintain high confidence under this structure-less view, SDCD effectively suppresses the texture-driven bias. Experimental results demonstrate that SDCD significantly mitigates hallucinations across multiple benchmarks and enhances the overall multimodal capabilities of LVLMs.
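As a rough illustration of the idea, the sketch below shuffles the order of image patches to build a structure-disrupted view and then combines the two sets of logits with a generic contrastive-decoding rule, (1 + alpha) * original - alpha * disrupted. The abstract does not give SDCD's exact calibration formula, so this rule, the `alpha` parameter, and the function names are assumptions made for illustration.

```python
import random
from typing import List, Sequence


def shuffle_patches(patches: Sequence, seed: int = 0) -> list:
    """Return the patches in a random order, destroying global layout
    while leaving each patch's local texture untouched."""
    rng = random.Random(seed)
    shuffled = list(patches)
    rng.shuffle(shuffled)
    return shuffled


def contrastive_logits(
    original: Sequence[float],
    disrupted: Sequence[float],
    alpha: float = 1.0,
) -> List[float]:
    """Generic contrastive calibration (assumed form, not taken from the paper):
    tokens that stay confident under the disrupted view are pushed down."""
    return [(1.0 + alpha) * o - alpha * d for o, d in zip(original, disrupted)]
```

For example, with `original = [2.0, 1.0]` and `disrupted = [2.0, 0.0]`, the first token keeps its full confidence even without image structure, so its lead over the second token disappears after calibration.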

Preprint, 2026, arXiv

The ICASSP 2026 Automatic Song Aesthetics Evaluation Challenge

This paper summarizes the ICASSP 2026 Automatic Song Aesthetics Evaluation (ASAE) Challenge, which focuses on predicting the subjective aesthetic scores of AI-generated songs. The challenge consists of two tracks: Track 1 targets the prediction of the overall musicality score, while Track 2 focuses on predicting five fine-grained aesthetic scores. The challenge attracted strong interest from the research community and received numerous submissions from both academia and industry. Top-performing systems significantly surpassed the official baseline, demonstrating substantial progress in aligning objective metrics with human aesthetic preferences. The outcomes establish a standardized benchmark and advance human-aligned evaluation methodologies for modern music generation systems.

Preprint, 2023, arXiv

Transformer-Based Multi-Object Smoothing with Decoupled Data Association and Smoothing

Multi-object tracking (MOT) is the task of estimating the state trajectories of an unknown and time-varying number of objects over a certain time window. Several algorithms have been proposed to tackle the multi-object smoothing task, where object detections can be conditioned on all the measurements in the time window. However, the best-performing methods suffer from intractable computational complexity and require approximations, performing suboptimally in complex settings. Deep learning (DL) based algorithms are a possible avenue for tackling this issue but have not been applied extensively in settings where accurate multi-object models are available and measurements are low-dimensional. We propose a novel DL architecture specifically tailored for this setting that decouples the data association task from the smoothing task. We compare the performance of the proposed smoother to the state-of-the-art in different tasks of varying difficulty and provide, to the best of our knowledge, the first comparison between traditional Bayesian trackers and DL trackers in the smoothing problem setting.

Preprint, 2022, arXiv

A comparison between PMBM Bayesian track initiation and labelled RFS adaptive birth

This paper provides a comparative analysis between the adaptive birth model used in the labelled random finite set literature and the track initiation in the Poisson multi-Bernoulli mixture (PMBM) filter, with point-target models. The PMBM track initiation is obtained via Bayes' rule applied on the predicted PMBM density, and creates one Bernoulli component for each received measurement, representing that this measurement may be clutter or a detection from a new target. Adaptive birth mimics this procedure by creating a Bernoulli component for each measurement using a different rule to determine the probability of existence and a user-defined single-target density. This paper first provides an analysis of the differences that arise in track initiation based on isolated measurements. Then, it shows that adaptive birth underestimates the number of objects present in the surveillance area under common modelling assumptions. Finally, we provide numerical simulations to further illustrate the differences.
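To make the PMBM side of the comparison concrete, the sketch below computes the existence probability of the Bernoulli component created for one measurement, using the standard point-target form r = e / (c + e), where c is the clutter intensity at the measurement and e is the detection-weighted intensity of undetected targets. It assumes a scalar state, a constant detection probability, a linear Gaussian measurement model, and a single Gaussian component in the undetected-target intensity; those modelling choices and all function names are illustrative assumptions. The adaptive birth rule is not specified in the abstract and is not sketched.

```python
import math


def gaussian_pdf(z: float, mean: float, variance: float) -> float:
    """Density of a scalar Gaussian evaluated at z."""
    return math.exp(-0.5 * (z - mean) ** 2 / variance) / math.sqrt(
        2.0 * math.pi * variance
    )


def detection_intensity(
    z: float,
    p_detect: float,
    weight: float,
    mean: float,
    prior_var: float,
    noise_var: float,
) -> float:
    """e(z) = p_D * w * N(z; m, P + R) for one Gaussian component of the
    undetected-target Poisson intensity (assumed model: z = x + noise)."""
    return p_detect * weight * gaussian_pdf(z, mean, prior_var + noise_var)


def new_track_existence_probability(
    clutter_intensity: float, det_intensity: float
) -> float:
    """r = e / (c + e): probability that the measurement is a first detection
    of a new target rather than clutter."""
    total = clutter_intensity + det_intensity
    if total <= 0.0:
        raise ValueError("clutter and detection intensities cannot both be zero")
    return det_intensity / total
```

As a usage note, a measurement far from the undetected-target intensity gives a small e and hence an existence probability close to zero, which is how an isolated measurement in a low-birth region is explained mostly as clutter.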

Preprint, 2022, arXiv

Can Deep Learning be Applied to Model-Based Multi-Object Tracking?

Multi-object tracking (MOT) is the problem of tracking the state of an unknown and time-varying number of objects using noisy measurements, with important applications such as autonomous driving, tracking animal behavior, defense systems, and others. In recent years, deep learning (DL) has been increasingly used in MOT for improving tracking performance, but mostly in settings where the measurements are high-dimensional and there are no available models of the measurement likelihood and the object dynamics. The model-based setting instead has not attracted as much attention, and it is still unclear if DL methods can outperform traditional model-based Bayesian methods, which are the state of the art (SOTA) in this context. In this paper, we propose a Transformer-based DL tracker and evaluate its performance in the model-based setting, comparing it to SOTA model-based Bayesian methods in a variety of different tasks. Our results show that the proposed DL method can match the performance of the model-based methods in simple tasks, while outperforming them when the task gets more complicated, either due to an increase in the data association complexity, or to stronger nonlinearities of the models of the environment.

Preprint, 2022, arXiv

Multiple Object Trajectory Estimation Using Backward Simulation

This paper presents a general solution for computing the multi-object posterior for sets of trajectories from a sequence of multi-object (unlabelled) filtering densities and a multi-object dynamic model. Importantly, the proposed solution opens an avenue of trajectory estimation possibilities for multi-object filters that do not explicitly estimate trajectories. In this paper, we first derive a general multi-trajectory backward smoothing equation based on random finite sets of trajectories. Then we show how to sample sets of trajectories using backward simulation for Poisson multi-Bernoulli filtering densities, and develop a tractable implementation based on ranked assignment. The performance of the resulting multi-trajectory particle smoothers is evaluated in a simulation study, and the results demonstrate that they have superior performance in comparison to several state-of-the-art multi-object filters and smoothers.

Preprint, 2021, arXiv

Backward Simulation for Sets of Trajectories

This paper presents a solution for recovering full trajectory information, via the calculation of the posterior of the set of trajectories, from a sequence of multitarget (unlabelled) filtering densities and the multitarget dynamic model. Importantly, the proposed solution opens an avenue of trajectory estimation possibilities for multitarget filters that do not explicitly estimate trajectories. In this paper, we first derive a general multitrajectory forward-backward smoothing equation based on sets of trajectories and the random finite set framework. Then we show how to sample sets of trajectories using backward simulation when the multitarget filtering densities are multi-Bernoulli processes. The proposed approach is demonstrated in a simulation study.

Preprint, 2020, arXiv

Multi-Scan Implementation of the Trajectory Poisson Multi-Bernoulli Mixture Filter

The Poisson multi-Bernoulli mixture (PMBM) and the multi-Bernoulli mixture (MBM) are two multi-target distributions for which closed-form filtering recursions exist. The PMBM has a Poisson birth process, whereas the MBM has a multi-Bernoulli birth process. This paper considers a recently developed formulation of the multi-target tracking problem using a random finite set of trajectories, through which the track continuity is explicitly established. A multi-scan trajectory PMBM filter and a multi-scan trajectory MBM filter, with the ability to correct past data association decisions to improve current decisions, are presented. In addition, a multi-scan trajectory $\text{MBM}_{01}$ filter, in which the existence probabilities of all Bernoulli components are either 0 or 1, is presented. This paper proposes an efficient implementation that performs track-oriented $N$-scan pruning to limit computational complexity, and uses dual decomposition to solve the involved multi-frame assignment problem. The performance of the presented multi-target trackers, applied with an efficient fixed-lag smoothing method, is evaluated in a simulation study.

Preprint, 2020, arXiv

Spatiotemporal Constraints for Sets of Trajectories with Applications to PMBM Densities

In this paper we introduce spatiotemporal constraints for trajectories, i.e., restrictions that the trajectory must be in some part of the state space (spatial constraint) at some point in time (temporal constraint). Spatiotemporal constraints on trajectories can be used to answer a range of important questions, such as "where did the person who was in area A at time t go afterwards?". We discuss how multiple constraints can be combined into sets of constraints, and we then apply sets of constraints to set-of-trajectories densities, specifically Poisson multi-Bernoulli mixture (PMBM) densities. For Poisson target birth, the exact posterior density is PMBM for both point targets and extended targets. In the paper we show that if the unconstrained set-of-trajectories density is PMBM, then the constrained density is also PMBM. Examples of constrained trajectory densities motivate and illustrate the key results.
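The definition of a spatiotemporal constraint can be illustrated on concrete trajectories, as in the minimal sketch below. It only evaluates the constraint indicator on given trajectories and does not reproduce the paper's results on PMBM densities. The `Trajectory` and `Constraint` types, the 2-D state, and the choice to treat a trajectory that is not alive at the constraint time as violating the constraint are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

State = Tuple[float, float]  # assumed 2-D position state


@dataclass
class Trajectory:
    start_time: int
    states: List[State]  # one state per consecutive time step

    def state_at(self, t: int) -> Optional[State]:
        """State at time t, or None if the trajectory is not alive then."""
        idx = t - self.start_time
        if 0 <= idx < len(self.states):
            return self.states[idx]
        return None


@dataclass
class Constraint:
    time: int                         # temporal part: when
    region: Callable[[State], bool]   # spatial part: where


def satisfies(traj: Trajectory, constraints: Sequence[Constraint]) -> bool:
    """True if the trajectory meets every constraint in the set."""
    for c in constraints:
        state = traj.state_at(c.time)
        if state is None or not c.region(state):
            return False
    return True


def in_unit_square(state: State) -> bool:
    """Hypothetical 'area A': the unit square."""
    return 0.0 <= state[0] <= 1.0 and 0.0 <= state[1] <= 1.0
```

Filtering a collection of trajectories with `satisfies` and then reading off their later states mirrors the question quoted in the abstract: which objects were in area A at time t, and where did they go afterwards.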

Preprint, 2020, arXiv

Trajectory Poisson multi-Bernoulli filters

This paper presents two trajectory Poisson multi-Bernoulli (TPMB) filters for multi-target tracking: one to estimate the set of alive trajectories at each time step and another to estimate the set of all trajectories, which includes alive and dead trajectories, at each time step. The filters are based on propagating a Poisson multi-Bernoulli (PMB) density on the corresponding set of trajectories through the filtering recursion. After the update step, the posterior is a PMB mixture (PMBM) so, in order to obtain a PMB density, a Kullback-Leibler divergence minimisation on an augmented space is performed. The developed filters are computationally lighter alternatives to the trajectory PMBM filters, which provide the closed-form recursion for sets of trajectories with Poisson birth model, and are shown to outperform previous multi-target tracking algorithms.
