Source author record

Yiqun Zhang

Yiqun Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computer Vision Artificial Intelligence Computation and Language math.NA math.OC nlin.CD Numerical Analysis physics.optics Systems and Control

Catalog footprint

What is connected

12works

10topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Detecting Autism Spectrum Disorder with Deep Eye Movement Features

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by deficits in social communication and behavioral patterns. Eye movement data offers a non-invasive diagnostic tool for ASD detection, as it is inherently discrete and exhibits short-term temporal dependencies, reflecting localized gaze focus between fixation points. These characteristics enable the data to provide deeper insights into subtle behavioral markers, distinguishing ASD-related patterns from typical development. Eye movement signals mainly contain short-term and localized dependencies. However, despite the widespread application of stacked attention layers in Transformer-based models for capturing long-range dependencies, our experimental results indicate that this approach yields only limited benefits when applied to eye movement data. This may be because discrete fixation points and short-term dependencies in gaze focus reduce the utility of global attention mechanisms, making them less efficient than architectures focusing on local temporal patterns. To efficiently capture subtle and complex eye movement patterns, distinguishing ASD from typically developing (TD) individuals, a discrete short-term sequential (DSTS) modeling framework is designed with Class-aware Representation and Imbalance-aware Mechanisms. Through extensive experiments on several eye movement datasets, DSTS outperforms both traditional machine learning techniques and more sophisticated deep learning models.

preprint2026arXiv

Fine-Tuning Impairs the Balancedness of Foundation Models in Long-tailed Personalized Federated Learning

Personalized federated learning (PFL) with foundation models has emerged as a promising paradigm enabling clients to adapt to heterogeneous data distributions. However, real-world scenarios often face the co-occurrence of non-IID data and long-tailed class distributions, presenting unique challenges that remain underexplored in PFL. In this paper, we investigate this long-tailed personalized federated learning and observe that current methods suffer from two limitations: (i) fine-tuning degrades performance below zero-shot baselines due to the erosion of inherent class balance in foundation models; (ii) conventional personalization techniques further transfer this bias to local models through parameter or feature-level fusion. To address these challenges, we propose Federated Learning via Gradient Purification and Residual Learning (FedPuReL), which preserves balanced knowledge in the global model while enabling unbiased personalization. Specifically, we purify local gradients using zero-shot predictions to maintain a class-balanced global model, and model personalization as residual correction atop the frozen global model. Extensive experiments demonstrate that FedPuReL consistently outperforms state-of-the-art methods, achieving superior performance on both global and personalized models across diverse long-tailed scenarios. The code is available at https://github.com/shihaohou/FedPuReL.

preprint2026arXiv

How Many Visual Tokens Do Multimodal Language Models Need? Scaling Visual Token Pruning with F^3A

Vision-language models improve perception by feeding increasingly long visual token sequences into language backbones, but the resulting inference cost raises a basic scaling question: as multimodal models grow, how many visual tokens are actually needed, and how should they be allocated under a fixed visual token budget? Existing training-free pruning methods typically answer this with one-shot proxies such as decoder attention, visual similarity, or conditional diversity. We argue that visual token pruning is better viewed as task-conditioned evidence search, especially under aggressive compression and across model scales. We propose F^3A, a training-free router for visual token pruning that operates before the language model consumes image tokens. F^3A builds lightweight question-conditioned cues, matches them to visual-grid tokens through frozen sparse sensing heads, and allocates a fixed vision token budget via coarse evidence localization, local refinement, coverage-preserving competition, and recovery of under-covered regions. It requires no model training, no extra LLM forward pass and preserves the original multimodal prompting and decoding pipeline.

preprint2026arXiv

LLMRouterBench: A Massive Benchmark and Unified Framework for LLM Routing

Large language model (LLM) routing assigns each query to the most suitable model from an ensemble. We introduce LLMRouterBench, a large-scale benchmark and unified framework for LLM routing. It comprises over 400K instances from 21 datasets and 33 models. Moreover, it provides comprehensive metrics for both performance-oriented routing and performance-cost trade-off routing, and integrates 10 representative routing baselines. Using LLMRouterBench, we systematically re-evaluate the field. While confirming strong model complementarity-the central premise of LLM routing-we find that many routing methods exhibit similar performance under unified evaluation, and several recent approaches, including commercial routers, fail to reliably outperform a simple baseline. Meanwhile, a substantial gap remains to the Oracle, driven primarily by persistent model-recall failures. We further show that backbone embedding models have limited impact, that larger ensembles exhibit diminishing returns compared to careful model curation, and that the benchmark also enables latency-aware analysis. All code and data are available at https://github.com/ynulihao/LLMRouterBench.

preprint2026arXiv

One-Shot Hierarchical Federated Clustering

Driven by the growth of Web-scale decentralized services, Federated Clustering (FC) aims to extract knowledge from heterogeneous clients in an unsupervised manner while preserving the clients' privacy, which has emerged as a significant challenge due to the lack of label guidance and the Non-Independent and Identically Distributed (non-IID) nature of clients. In real scenarios such as personalized recommendation and cross-device user profiling, the global cluster may be fragmented and distributed among different clients, and the clusters may exist at different granularities or even nested. Although Hierarchical Clustering (HC) is considered promising for exploring such distributions, the sophisticated recursive clustering process makes it more computationally expensive and vulnerable to privacy exposure, thus relatively unexplored under the federated learning scenario. This paper introduces an efficient one-shot hierarchical FC framework that performs client-end distribution exploration and server-end distribution aggregation through one-way prototype-level communication from clients to the server. A fine partition mechanism is developed to generate successive clusterlets to describe the complex landscape of the clients' clusters. Then, a multi-granular learning mechanism on the server is proposed to fuse the clusterlets, even when they have inconsistent granularities generated from different clients. It turns out that the complex cluster distributions across clients can be efficiently explored, and extensive experiments comparing state-of-the-art methods on ten public datasets demonstrate the superiority of the proposed method.

preprint2026arXiv

Programmable ultra-broadband photonic chaos platform enabled by microwave-chaos-driven electro-optic frequency combs

Optical chaos holds great promise for secure communication, LiDAR, and reinforcement learning. However, its scalability has long been constrained by an intrinsic trade-off between bandwidth and the number of parallel chaotic channels. Here, we introduce a programmable "chaos-on-comb" architecture that overcomes this limitation using standard electro-optic components. By heterodyning a delayed-feedback chaotic laser with a continuous-wave reference, a broadband chaotic microwave signal is generated to simultaneously drive a cascaded electro-optic comb, imprinting chaotic dynamics across all comb lines and merging them into an ultra-broadband chaotic continuum. Then, incorporating spectrum slicing enables flexible extraction of parallel chaotic channels with preserved statistical independence and per-channel programmability. As a result, we demonstrate a single-channel ultra-broadband optical chaos with an effective bandwidth of 543.8 GHz, and a broadband terahertz noise source with an excess noise ratio of 52.99 \pm 2.85 dB to validate its flatness. Furthermore, we employ the uncorrelated parallel chaos for ultrafast photonic decision-making in a 256-armed bandit problem, achieving a favourable power-law scaling exponent of 0.86. Our work paves the way toward programmable, reconfigurable, and application-ready photonic chaos systems.

preprint2026arXiv

SAD: A Large-Scale Strategic Argumentative Dialogue Dataset

Argumentation generation has attracted substantial research interest due to its central role in human reasoning and decision-making. However, most existing argumentative corpora focus on non-interactive, single-turn settings, either generating arguments from a given topic or refuting an existing argument. In practice, however, argumentation is often realized as multi-turn dialogue, where speakers defend their stances and employ diverse argumentative strategies to strengthen persuasiveness. To support deeper modeling of argumentation dialogue, we present the first large-scale \textbf{S}trategic \textbf{A}rgumentative \textbf{D}ialogue dataset, SAD, consisting of 392,822 examples. Grounded in argumentation theories, we annotate each utterance with five strategy types, allowing multiple strategies per utterance. Unlike prior datasets, SAD requires models to generate contextually appropriate arguments conditioned on the dialogue history, a specified stance on the topic, and targeted argumentation strategies. We further benchmark a range of pretrained generative models on SAD and present in-depth analysis of strategy usage patterns in argumentation.

preprint2026arXiv

SECOS: Semantic Capture for Rigorous Classification in Open-World Semi-Supervised Learning

In open-world semi-supervised learning (OWSSL), a model learns from labeled data and unlabeled data containing both known and novel classes. In practical OWSSL applications, models are expected to perform rigorous classification by directly selecting the most semantically relevant label from a candidate set for each sample. Existing OWSSL methods fail to achieve this because novel samples are trained without explicit supervision, and these methods lack mechanisms to extract latent semantic information, resulting in predicted labels that have no semantic correspondence to candidate textual labels. To address this, we introduce SEmantic Capture for Open-world Semi-supervised learning (SECOS), which directly predicts textual labels from the candidate set without post-processing, meeting the requirements of practical OWSSL applications. SECOS leverages external knowledge to extract and align semantic representations across modalities for both known and novel classes, providing explicit supervisory signals for training novel classes. Extensive experiments demonstrate that even when existing OWSSL methods are evaluated under the more lenient post-hoc matching setting, SECOS still surpasses them by up to 5.4\% without such assistance, highlighting its superior effectiveness. Code is available at https://github.com/ganchi-huanggua/OSSL-Classification.

preprint2026arXiv

TFEC: Multivariate Time-Series Clustering via Temporal-Frequency Enhanced Contrastive Learning

Multivariate Time-Series (MTS) clustering is crucial for signal processing and data analysis. Although deep learning approaches, particularly those leveraging Contrastive Learning (CL), are prominent for MTS representation, existing CL-based models face two key limitations: 1) neglecting clustering information during positive/negative sample pair construction, and 2) introducing unreasonable inductive biases, e.g., destroying time dependence and periodicity through augmentation strategies, compromising representation quality. This paper, therefore, proposes a Temporal-Frequency Enhanced Contrastive (TFEC) learning framework. To preserve temporal structure while generating low-distortion representations, a temporal-frequency Co-EnHancement (CoEH) mechanism is introduced. Accordingly, a synergistic dual-path representation and cluster distribution learning framework is designed to jointly optimize cluster structure and representation fidelity. Experiments on six real-world benchmark datasets demonstrate TFEC's superiority, achieving 4.48% average NMI gains over SOTA methods, with ablation studies validating the design. The code of the paper is available at: https://github.com/yueliangy/TFEC.

preprint2022arXiv

A new distance measurement and its application in K-Means Algorithm

K-Means clustering algorithm is one of the most commonly used clustering algorithms because of its simplicity and efficiency. K-Means clustering algorithm based on Euclidean distance only pays attention to the linear distance between samples, but ignores the overall distribution structure of the dataset (i.e. the fluid structure of dataset). Since it is difficult to describe the internal structure of two data points by Euclidean distance in high-dimensional data space, we propose a new distance measurement, namely, view-distance, and apply it to the K-Means algorithm. On the classical manifold learning datasets, S-curve and Swiss roll datasets, not only this new distance can cluster the data according to the structure of the data itself, but also the boundaries between categories are neat dividing lines. Moreover, we also tested the classification accuracy and clustering effect of the K-Means algorithm based on view-distance on some real-world datasets. The experimental results show that, on most datasets, the K-Means algorithm based on view-distance has a certain degree of improvement in classification accuracy and clustering effect.

preprint2022arXiv

Log-based Sparse Nonnegative Matrix Factorization for Data Representation

Nonnegative matrix factorization (NMF) has been widely studied in recent years due to its effectiveness in representing nonnegative data with parts-based representations. For NMF, a sparser solution implies better parts-based representation.However, current NMF methods do not always generate sparse solutions.In this paper, we propose a new NMF method with log-norm imposed on the factor matrices to enhance the sparseness.Moreover, we propose a novel column-wisely sparse norm, named $\ell_{2,\log}$-(pseudo) norm to enhance the robustness of the proposed method.The $\ell_{2,\log}$-(pseudo) norm is invariant, continuous, and differentiable.For the $\ell_{2,\log}$ regularized shrinkage problem, we derive a closed-form solution, which can be used for other general problems.Efficient multiplicative updating rules are developed for the optimization, which theoretically guarantees the convergence of the objective value sequence.Extensive experimental results confirm the effectiveness of the proposed method, as well as the enhanced sparseness and robustness.

preprint2018arXiv

Optimal Two-impulse Space Interception with Multi-constraints

We consider optimal two-impulse space interception problems with multi-constraints. The multi-constraints are imposed on the terminal position of an interceptor, impulse and impact instants, and the component-wise magnitudes of velocity impulses. We formulate these optimization problems as multi-point boundary value problems and the calculus of variations is used to solve them. All inequality constraints are converted into equality constraints by using slackness variable methods in order to use Lagrange multiplier method. A new dynamic slackness variable method is presented. As a result, an indirect optimization method is established for two-impulse space interception problems with multi-constraints. Subsequently, our method is used to solve the two-impulse space interception problems of free-flight ballistic missiles. A number of conclusions have been established based on highly accurate numerical solutions. Specifically, by numerical examples, we show that when time and velocity impulse constraints are imposed, optimal two-impulse solutions may occur, and also if two impulse instants are free, then two-impulse space interception problems with velocity impulse constraints may degenerate to the one-impulse case.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint