Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
20 works
0 followers
14 topics
4 close collaborators

Published work

20 published items

preprint · 2026 · arXiv

Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning

LLM-powered coding agents spend the majority of their token budget reading repository files, yet much of the retrieved code is irrelevant to the task at hand. Existing learned pruners compress this context with a single-objective sequence labeler, collapsing all facets of code relevance into one score and one transition matrix. We show that this formulation creates a modeling bottleneck: a single CRF transition prior must serve heterogeneous retention patterns, including contiguous semantic spans and sparse structural support lines. We propose LaMR (Latent Multi-Rubric), a structured pruning framework that decomposes code relevance into two interpretable quality dimensions, semantic evidence and dependency support, each modeled by a dedicated CRF with dimension-specific transition dynamics. A mixture-of-experts gating network dynamically weights the per-rubric emissions conditioned on the query, and a final CRF layer on the fused emissions produces the aggregate keep-or-prune decision. To supervise each dimension without additional annotation cost, we derive multi-rubric labels from the existing training corpus via AST-based program analysis, simultaneously denoising the teacher's binary labels. By effectively filtering distracting noise, LaMR frequently matches or even outperforms unpruned full-context baselines. Experiments on four benchmarks (SWE-Bench Verified, SWE-QA, LCC, LongCodeQA) show that LaMR wins 12 of 16 head-to-head multi-turn comparisons. It saves up to 31% more tokens on multi-turn agent tasks and improves Exact Match by up to +3.5 on single-turn tasks, while performance is frequently enhanced by denoising the context, and any remaining drops are marginal.

preprint · 2026 · arXiv

Exponentially Consistent Low Complexity Tests for Outlier Hypothesis Testing

We revisit outlier hypothesis testing, propose exponentially consistent low-complexity fixed-length and sequential tests, and show that our tests achieve a better tradeoff between detection performance and computational complexity than existing tests that use exhaustive search. Specifically, in outlier hypothesis testing, one is given a list of observed sequences, most of which are generated i.i.d. from a nominal distribution, while the remaining sequences, called outliers, are generated i.i.d. from another anomalous distribution. The task is to identify all outliers when both the nominal and anomalous distributions are unknown. There are two basic settings: fixed-length and sequential. In the fixed-length setting, the sample size of each observed sequence is fixed a priori, while in the sequential setting, the sample size is a random number that can be determined by the test designer to ensure reliable decisions. For the fixed-length setting, we strengthen the results of Bu et al. (TSP 2019) by i) allowing for scoring functions beyond KL divergence and further simplifying the test design when the number of outliers is known and ii) proposing a new test, explicitly bounding the detection performance of the test and characterizing the tradeoff among exponential decay rates of three error probabilities when the number of outliers is unknown. For the sequential setting, our tests for both cases are novel and enable us to reveal the benefit of sequentiality. Finally, for both fixed-length and sequential settings, we demonstrate the penalty of not knowing the number of outliers on the detection performance.

preprint · 2026 · arXiv

Learning Fingerprints for Medical Time Series with Redundancy-Constrained Information Maximization

Learning meaningful representations from medical time series (MedTS) such as ECG or EEG signals is a critical challenge. These signals are often high-dimensional, variable-length and rife with noise. Existing self-supervised approaches, such as Masked Autoencoders (MAEs), are highly effective for pre-training general-purpose encoders. However, they do not explicitly learn compact and semantically interpretable latent representations, typically relying on heuristic aggregation strategies such as global average pooling or a designated [CLS] token. We propose a novel framework that compresses a variable-length MedTS into a fixed-size set of k latent Fingerprint Tokens. Our architecture employs a cross-attention bottleneck to generate these tokens and is trained with a dual-objective function. The first objective is a reconstruction loss, which ensures the tokens are sufficient statistics for the original data. The second, a diversity penalty based on the Total Coding Rate (TCR), explicitly minimizes the redundancy between tokens, encouraging them to become statistically disentangled representations. We present the theoretical justification for our method, framing it as a novel Disentangled Rate-Distortion problem. This approach produces a low-dimensional, interpretable, and sample-efficient representation, where each token is encouraged to capture an independent factor of variation, paving the way for more robust digital biomarkers.
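The abstract does not spell out the exact form of the TCR penalty. As a minimal sketch, assuming the commonly used coding-rate expression R(Z) = 1/2 log det(I + d / (k eps^2) Z Z^T) for k tokens of dimension d, the example below shows why rewarding a high rate discourages redundancy: mutually orthogonal tokens score higher than duplicated ones. The function name `coding_rate` and the value of `eps` are illustrative choices, not taken from the paper.

```python
import math

def coding_rate(tokens, eps=0.5):
    """0.5 * logdet(I + d / (k * eps^2) * Z Z^T) for k tokens of dimension d."""
    k, d = len(tokens), len(tokens[0])
    scale = d / (k * eps ** 2)
    # Scaled Gram matrix of the tokens plus identity: symmetric positive definite.
    m = [[scale * sum(a * b for a, b in zip(tokens[i], tokens[j])) + (i == j)
          for j in range(k)] for i in range(k)]
    # In-place Cholesky factorisation; logdet is twice the sum of log(diagonal).
    logdet = 0.0
    for i in range(k):
        for j in range(i + 1):
            s = m[i][j] - sum(m[i][t] * m[j][t] for t in range(j))
            if i == j:
                m[i][i] = math.sqrt(s)
                logdet += 2.0 * math.log(m[i][i])
            else:
                m[i][j] = s / m[j][j]
    return 0.5 * logdet

# Hypothetical token sets: three orthogonal tokens vs. one token repeated.
diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
redundant = [[1.0, 0.0, 0.0]] * 3
print(coding_rate(diverse) > coding_rate(redundant))
```

In a training loop, a term of this kind would typically enter the objective with a negative sign, so that minimizing the total loss pushes the tokens apart; how the paper weights it against the reconstruction loss is not stated in the abstract.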

preprint · 2026 · arXiv

Nodule-DETR: A Novel DETR Architecture with Frequency-Channel Attention for Ultrasound Thyroid Nodule Detection

Thyroid cancer is the most common endocrine malignancy, and its incidence is rising globally. While ultrasound is the preferred imaging modality for detecting thyroid nodules, its diagnostic accuracy is often limited by challenges such as low image contrast and blurred nodule boundaries. To address these issues, we propose Nodule-DETR, a novel detection transformer (DETR) architecture designed for robust thyroid nodule detection in ultrasound images. Nodule-DETR introduces three key innovations: a Multi-Spectral Frequency-domain Channel Attention (MSFCA) module that leverages frequency analysis to enhance features of low-contrast nodules; a Hierarchical Feature Fusion (HFF) module for efficient multi-scale integration; and Multi-Scale Deformable Attention (MSDA) to flexibly capture small and irregularly shaped nodules. We conducted extensive experiments on a clinical dataset of real-world thyroid ultrasound images. The results demonstrate that Nodule-DETR achieves state-of-the-art performance, outperforming the baseline model by a significant margin of 0.149 in mAP@0.5:0.95. The superior accuracy of Nodule-DETR highlights its significant potential for clinical application as an effective tool in computer-aided thyroid diagnosis. The code for this work is available at https://github.com/wjj1wjj/Nodule-DETR.

preprint · 2026 · arXiv

Prior-Guided DETR for Ultrasound Nodule Detection

Accurate detection of ultrasound nodules is essential for the early diagnosis and treatment of thyroid and breast cancers. However, this task remains challenging due to irregular nodule shapes, indistinct boundaries, substantial scale variations, and the presence of speckle noise that degrades structural visibility. To address these challenges, we propose a prior-guided DETR framework specifically designed for ultrasound nodule detection. Instead of relying on purely data-driven feature learning, the proposed framework progressively incorporates different prior knowledge at multiple stages of the network. First, a Spatially-adaptive Deformable FFN with Prior Regularization (SDFPR) is embedded into the CNN backbone to inject geometric priors into deformable sampling, stabilizing feature extraction for irregular and blurred nodules. Second, a Multi-scale Spatial-Frequency Feature Mixer (MSFFM) is designed to extract multi-scale structural priors, where spatial-domain processing emphasizes contour continuity and boundary cues, while frequency-domain modeling captures global morphology and suppresses speckle noise. Furthermore, a Dense Feature Interaction (DFI) mechanism propagates and exploits these prior-modulated features across all encoder layers, enabling the decoder to enhance query refinement under consistent geometric and structural guidance. Experiments conducted on two clinically collected thyroid ultrasound datasets (Thyroid I and Thyroid II) and two public benchmarks (TN3K and BUSI) for thyroid and breast nodules demonstrate that the proposed method achieves superior accuracy compared with 18 detection methods, particularly in detecting morphologically complex nodules. The source code is publicly available at https://github.com/wjj1wjj/Ultrasound-DETR.

preprint · 2022 · arXiv

Few-shot One-class Domain Adaptation Based on Frequency for Iris Presentation Attack Detection

Iris presentation attack detection (PAD) has achieved remarkable success in ensuring the reliability and security of iris recognition systems. Most existing methods exploit discriminative features in the spatial domain and report outstanding performance under intra-dataset settings. However, performance inevitably degrades under cross-dataset settings because of domain shift. In real-world applications, a small number of bonafide samples are easily accessible. We thus define a new domain adaptation setting called Few-shot One-class Domain Adaptation (FODA), where adaptation only relies on a limited number of target bonafide samples. To address this problem, we propose a novel FODA framework based on the expressive power of frequency information. Specifically, our method integrates frequency-related information through two proposed modules. The Frequency-based Attention Module (FAM) aggregates frequency information into spatial attention and explicitly emphasizes high-frequency fine-grained features. The Frequency Mixing Module (FMM) mixes certain frequency components to generate large-scale target-style samples for adaptation with limited target bonafide samples. Extensive experiments on the LivDet-Iris 2017 dataset demonstrate that the proposed method achieves state-of-the-art or competitive performance under both cross-dataset and intra-dataset settings.

preprint · 2022 · arXiv

NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation

Gaze estimation is the fundamental basis for many visual tasks. Yet, the high cost of acquiring gaze datasets with 3D annotations hinders the optimization and application of gaze estimation models. In this work, we propose a novel Head-Eye redirection parametric model based on Neural Radiance Field, which allows dense gaze data generation with view consistency and accurate gaze direction. Moreover, our head-eye redirection parametric model can decouple the face and eyes for separate neural rendering, so that the attributes of the face, identity, illumination, and eye gaze direction can be controlled separately. Thus diverse 3D-aware gaze datasets can be obtained by manipulating the latent codes belonging to different face attributes in an unsupervised manner. Extensive experiments on several benchmarks demonstrate the effectiveness of our method in domain generalization and domain adaptation for gaze estimation tasks.

preprint · 2022 · arXiv

Semi-supervised Ranking for Object Image Blur Assessment

Assessing the blurriness of an object image is fundamentally important for improving the performance of object recognition and retrieval. The main challenge lies in the lack of abundant images with reliable labels and of effective learning strategies. Current datasets are labeled with limited and confused quality levels. To overcome this limitation, we propose to label the rank relationships between pairwise images rather than their quality levels, since this is much easier for humans to label, and establish a large-scale realistic face image blur assessment dataset with reliable labels. Based on this dataset, we propose a method to obtain the blur scores with only the pairwise rank labels as supervision. Moreover, to further improve the performance, we propose a self-supervised method based on quadruplet ranking consistency to leverage the unlabeled data more effectively. The supervised and self-supervised methods constitute a final semi-supervised learning framework, which can be trained end-to-end. Experimental results demonstrate the effectiveness of our method.
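The abstract does not state which loss is used to learn blur scores from pairwise rank labels. As a hedged illustration only, the sketch below assumes a logistic (RankNet-style) loss on the score difference; the function name `pairwise_rank_loss` and its arguments are hypothetical and not taken from the paper.

```python
import math

def pairwise_rank_loss(score_a, score_b, a_is_blurrier):
    """Logistic loss on the score difference for one labeled image pair."""
    # Orient the difference so that a positive margin means "ranked correctly".
    margin = score_a - score_b if a_is_blurrier else score_b - score_a
    # log(1 + exp(-margin)), written to stay stable for large |margin|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# A pair ranked in the labeled order costs less than the same pair reversed.
print(pairwise_rank_loss(2.0, 0.5, True) < pairwise_rank_loss(0.5, 2.0, True))
```

Because the loss depends only on the difference between two scores, annotators never need to assign absolute quality levels, which is the motivation the abstract gives for collecting rank labels.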

preprint · 2022 · arXiv

Underwater Differential Game: Finite-Time Target Hunting Task with Communication Delay

This work considers designing an unmanned target hunting system for a swarm of unmanned underwater vehicles (UUVs) to hunt a target with high maneuverability. Differential game theory is used to analyze combat policies of UUVs and the target within finite time. The challenge is that UUVs must conduct their control policies in consideration of not only the consistency of the hunting team but also the escaping behaviors of the target. To obtain stable feedback control policies satisfying Nash equilibrium, we construct the Hamiltonian function with Leibniz's formula. To further take underwater disturbances and communication delay into consideration, a modified deep reinforcement learning (DRL) method is provided to investigate the underwater target hunting task in an unknown dynamic environment. Simulations show that underwater disturbances have a large impact on the system considering communication delay. Moreover, consistency tests show that UUVs achieve better consistency with a relatively small range of disturbances.

preprint · 2022 · arXiv

Unimodal-Concentrated Loss: Fully Adaptive Label Distribution Learning for Ordinal Regression

Learning from a label distribution has achieved promising results on ordinal regression tasks such as facial age and head pose estimation, wherein the concept of adaptive label distribution learning (ALDL) has recently drawn much attention for its theoretical superiority. However, compared with methods that assume a fixed-form label distribution, ALDL methods have not achieved better performance. We argue that existing ALDL algorithms do not fully exploit the intrinsic properties of ordinal regression. In this paper, we summarize three principles that learning an adaptive label distribution on ordinal regression tasks should follow. First, the probability corresponding to the ground-truth should be the highest in the label distribution. Second, the probabilities of neighboring labels should decrease with increasing distance from the ground-truth, i.e., the distribution is unimodal. Third, the label distribution should vary across samples, and even be distinct for different instances with the same label, due to the different levels of difficulty and ambiguity. Under the premise of these principles, we propose a novel loss function for fully adaptive label distribution learning, namely unimodal-concentrated loss. Specifically, the unimodal loss derived from the learning-to-rank strategy constrains the distribution to be unimodal. Furthermore, the estimation error and the variance of the predicted distribution for a specific sample are integrated into the proposed concentrated loss to make the predicted distribution peak at the ground-truth and vary according to the prediction uncertainty. Extensive experimental results on typical ordinal regression tasks, including age and head pose estimation, show the superiority of our proposed unimodal-concentrated loss compared with existing loss functions.

preprint · 2022 · arXiv

Weakly Supervised Regional and Temporal Learning for Facial Action Unit Recognition

Automatic facial action unit (AU) recognition is a challenging task due to the scarcity of manual annotations. To alleviate this problem, considerable effort has been dedicated to exploiting various weakly supervised methods which leverage abundant unlabeled data. However, many aspects of the unique properties of AUs, such as their regional and relational characteristics, are not sufficiently explored in previous works. Motivated by this, we take the AU properties into consideration and propose two auxiliary AU-related tasks to bridge the gap between limited annotations and model performance in a self-supervised manner via the unlabeled data. Specifically, to enhance the discrimination of regional features with AU relation embedding, we design a task of RoI inpainting to recover the randomly cropped AU patches. Meanwhile, a single-image-based optical flow estimation task is proposed to leverage the dynamic change of facial muscles and encode the motion information into the global feature representation. Based on these two self-supervised auxiliary tasks, local features, mutual relations and motion cues of AUs are better captured in the backbone network. Furthermore, by incorporating semi-supervised learning, we propose an end-to-end trainable framework named weakly supervised regional and temporal learning (WSRTL) for AU recognition. Extensive experiments on BP4D and DISFA demonstrate the superiority of our method, and new state-of-the-art performance is achieved.

preprint · 2021 · arXiv

Multi-Level Adaptive Region of Interest and Graph Learning for Facial Action Unit Recognition

In facial action unit (AU) recognition tasks, regional feature learning and AU relation modeling are two effective aspects which are worth exploring. However, the limited representation capacity of regional features makes it difficult for relation models to embed AU relationship knowledge. In this paper, we propose a novel multi-level adaptive ROI and graph learning (MARGL) framework to tackle this problem. Specifically, an adaptive ROI learning module is designed to automatically adjust the location and size of the predefined AU regions. Meanwhile, besides the relationships between AUs, there is strong relevance between regional features across multiple levels of the backbone network, as level-wise features focus on different aspects of representation. In order to incorporate the intra-level AU relation and inter-level AU regional relevance simultaneously, a multi-level AU relation graph is constructed and graph convolution is performed to further enhance the AU regional features of each level. Experiments on BP4D and DISFA demonstrate that the proposed MARGL significantly outperforms the previous state-of-the-art methods.

preprint · 2021 · arXiv

Self-Domain Adaptation for Face Anti-Spoofing

Although current face anti-spoofing methods achieve promising results under intra-dataset testing, they suffer from poor generalization to unseen attacks. Most existing works adopt domain adaptation (DA) or domain generalization (DG) techniques to address this problem. However, the target domain is often unknown during training, which limits the use of DA methods. DG methods can overcome this by learning domain-invariant features without seeing any target data. However, they fail to utilize the information in the target data. In this paper, we propose a self-domain adaptation framework to leverage the unlabeled test domain data at inference. Specifically, a domain adaptor is designed to adapt the model to the test domain. In order to learn a better adaptor, a meta-learning-based adaptor learning algorithm is proposed that uses the data of multiple source domains at the training step. At test time, the adaptor is updated using only the test domain data according to the proposed unsupervised adaptor loss to further improve the performance. Extensive experiments on four public datasets validate the effectiveness of the proposed method.

preprint · 2020 · arXiv

Access Strategy in Super WiFi Network Powered by Solar Energy Harvesting: A POMDP Method

The recently announced Super Wi-Fi Network proposal in the United States aims to enable Internet access nationwide. As a traditional cable-connected power supply system becomes impractical or costly for a wide-range wireless network, new infrastructure deployment for Super Wi-Fi is required. The fast-developing Energy Harvesting (EH) techniques are receiving global attention for their potential to solve this power supply problem. From the user's perspective, a critical issue is how to make efficient network selection and access decisions. Unlike in traditional wireless networks, the battery charge state and tendency in EH-based networks have to be taken into account when making network selection and access decisions, which has not been well investigated. In this paper, we propose a practical and efficient framework for a multiple-base-station access strategy in an EH-powered Super Wi-Fi network. We consider the access strategy from the perspective of the user, who exploits downlink transmission opportunities from one base station. To formulate the problem, we use a Partially Observable Markov Decision Process (POMDP) to model users' observations of the base stations' battery status and their decisions on base station selection and access. Simulation results show that our methods are effective and significantly outperform the traditional, widely used CSMA method.
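The abstract does not give the state space or the observation model. As a minimal sketch of the generic discrete POMDP belief update (a Bayes filter), the example below tracks a user's belief about one base station's battery level; the two-level battery model and all probabilities are hypothetical values chosen for illustration.

```python
def belief_update(belief, transition, obs_likelihood):
    """One Bayes-filter step: predict with the transition model, then
    reweight by the likelihood of the received observation."""
    n = len(belief)
    predicted = [sum(belief[s] * transition[s][t] for s in range(n))
                 for t in range(n)]
    unnormalized = [predicted[t] * obs_likelihood[t] for t in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]

# Hypothetical two-level battery model: index 0 = low, index 1 = high.
transition = [[0.7, 0.3],    # low  -> low / high (harvesting may recharge)
              [0.2, 0.8]]    # high -> low / high (transmission drains)
obs_likelihood = [0.1, 0.9]  # P(successful downlink | battery level)
posterior = belief_update([0.5, 0.5], transition, obs_likelihood)
print(posterior)
```

A user-side policy would then choose which base station to access based on beliefs of this kind, since the true battery state is never observed directly.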

preprint · 2020 · arXiv

Aggressive Congestion Control Mechanism for Space Systems

How to implement an impeccable space system-of-systems (SoS) internetworking architecture has been a significant issue in system engineering for years. Reliable data transmission is considered one of the most important technologies of space SoS internetworking systems. Due to the high bit error rate (BER), long time delay and asymmetrical channel in the space communication environment, the congestion control mechanism of classic transport control protocols (TCP) performs unsatisfactorily. With the help of existing TCP modifications, this paper contributes an aggressive congestion control mechanism. The proposed mechanism is characterized by a fast start procedure, feedback information used to analyze network traffic, and a link-termination processing mechanism, which can help to reveal the real reason for packet loss and maintain the size of the congestion window at a high level. Simulation results are presented at the end to verify the proposed scheme.

preprint · 2020 · arXiv

Complex Network Theoretical Analysis on Information Dissemination over Vehicular Networks

How to enhance communication efficiency and quality in vehicular networks is a critically important issue. As vehicular networks in dense cities grow ever larger, real-world datasets show that they essentially follow the complex network model. Meanwhile, extensive research on complex networks has shown that complex network theory can both provide an accurate network model and contribute substantially to network design, optimization and management. In this paper, we start by analyzing the characteristics of a taxi GPS dataset and then establish the vehicle-to-infrastructure, vehicle-to-vehicle and hybrid communication models, respectively. Moreover, we propose a clustering algorithm for station selection, a traffic allocation optimization model and an information source selection model based on the communication performance and complex network theory.

preprint · 2020 · arXiv

Efficient Sampling Algorithms for Approximate Temporal Motif Counting (Extended Version)

A great variety of complex systems ranging from user interactions in communication networks to transactions in financial markets can be modeled as temporal graphs, which consist of a set of vertices and a series of timestamped and directed edges. Temporal motifs in temporal graphs are generalized from subgraph patterns in static graphs which take into account edge orderings and durations in addition to structures. Counting the number of occurrences of temporal motifs is a fundamental problem for temporal network analysis. However, existing methods either cannot support temporal motifs or suffer from performance issues. In this paper, we focus on approximate temporal motif counting via random sampling. We first propose a generic edge sampling (ES) algorithm for estimating the number of instances of any temporal motif. Furthermore, we devise an improved EWS algorithm that hybridizes edge sampling with wedge sampling for counting temporal motifs with 3 vertices and 3 edges. We provide comprehensive analyses of the theoretical bounds and complexities of our proposed algorithms. Finally, we conduct extensive experiments on several real-world datasets, and the results show that our ES and EWS algorithms have higher efficiency, better accuracy, and greater scalability than the state-of-the-art sampling method for temporal motif counting.
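The abstract does not describe the ES algorithm in detail. As a simplified sketch of the general idea of edge sampling, the example below estimates the number of instances of a two-edge temporal path (a to b, then b to c within a time window `delta`) by keeping each edge with probability `p` as a candidate first edge and rescaling the count by 1/p. The motif, the function names and the toy edge list are illustrative assumptions, not the authors' algorithm.

```python
import random

def count_from(edges, i, delta):
    """Instances of the path a->b, b->c (c != a) whose first edge is edges[i]
    and whose second edge occurs later, within delta time units."""
    a, b, t1 = edges[i]
    return sum(1 for (u, w, t2) in edges
               if u == b and w != a and t1 < t2 <= t1 + delta)

def exact_count(edges, delta):
    """Exact count: sum the local counts over every possible first edge."""
    return sum(count_from(edges, i, delta) for i in range(len(edges)))

def sampled_estimate(edges, delta, p, rng):
    """Keep each edge with probability p as a candidate first edge, count the
    instances it starts, and rescale by 1/p to get an unbiased estimate."""
    kept = [i for i in range(len(edges)) if rng.random() < p]
    return sum(count_from(edges, i, delta) for i in kept) / p

# Toy temporal graph: (source, target, timestamp).
edges = [("a", "b", 1), ("b", "c", 2), ("b", "d", 4), ("c", "a", 5), ("a", "b", 6)]
print(exact_count(edges, 3), sampled_estimate(edges, 3, 0.5, random.Random(0)))
```

With `p = 1.0` every edge is kept and the estimate coincides with the exact count; smaller values of `p` trade accuracy for less counting work.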

preprint · 2020 · arXiv

Mobile Data Transactions in Device-to-Device Communication Networks: Pricing and Auction

Device-to-Device (D2D) communication offers smartphone users a way to share files with each other without communicating with the cellular network. In this paper, we discuss the behaviors of the two roles in the D2D data transaction model from an economic point of view: the data buyers who wish to buy a certain quantity of data, and the data sellers who wish to sell data through the D2D network. The optimal price and purchasing strategies are analyzed and derived based on game theory.

preprint · 2020 · arXiv

Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

Future wireless networks have substantial potential to support a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims to assist readers in clarifying the motivation and methodology of the various ML algorithms, so that they can invoke them for hitherto unexplored services and scenarios of future wireless networks.

preprint · 2019 · arXiv

On Deflection of Solar Coronal Mass Ejections by the Ambient Coronal Magnetic Field Configuration

Solar Coronal Mass Ejections (CMEs) are sometimes deflected during their propagation. This deflection may be the consequence of interaction between a CME and a coronal hole or the solar wind. We analyze 44 halo-CMEs whose deflection angle exceeds 90 degrees. The coronal magnetic field configuration is computed from daily synoptic maps of magnetic field from SOHO/MDI and SDO/HMI using a Potential Field Source Surface (PFSS) model. By comparing the ambient magnetic field configuration and the measured position angles (MPA) of the CMEs, we conclude that the deflection of 80% of the CMEs (35 of 44) is consistent with the ambient magnetic field configuration, agreeing with previous studies. Of these 35, 71% are deflected toward the heliospheric current sheet (HCS), and 29% toward a pseudo-streamer (PS), the boundary between the same-polarity magnetic field regions. This implies that the ambient coronal magnetic field configuration plays an important and major role in the deflection of CMEs, and that the HCS configuration is more important than PS. If we exclude 13 CMEs having much higher uncertainty from the sample, the agreement between the deflection of CMEs and the ambient field configuration increases substantially, reaching 94% in the new sample of 31 CMEs.
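The figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract (44, 35 and 13) together with the reported 71% and 94%; the per-category counts it derives are inferred by rounding and are not stated in the source.

```python
total, consistent, excluded = 44, 35, 13   # counts stated in the abstract

share_consistent = consistent / total       # about 0.795, reported as 80%
reduced_sample = total - excluded           # 31 CMEs, as stated

# Counts implied by the reported percentages (inferred here, not stated).
toward_hcs = round(0.71 * consistent)       # CMEs deflected toward the HCS
toward_ps = consistent - toward_hcs         # the remainder, toward a PS
consistent_reduced = round(0.94 * reduced_sample)

print(round(100 * share_consistent), reduced_sample,
      toward_hcs, toward_ps, consistent_reduced)
```

Under this rounding, the share deflected toward a pseudo-streamer is the complement of the 71% deflected toward the HCS, so the two categories account for all 35 consistent events.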