Source author record

Zhiwei Yang

Zhiwei Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Artificial Intelligence, Computation and Language, Computer Vision, cond-mat.mtrl-sci, Cryptography and Security, Machine Learning, physics.chem-ph, physics.comp-ph, physics.flu-dyn, physics.ins-det, Software Engineering

Catalog footprint

What is connected

9 works

11 topics

4 close collaborators


Preprint · 2026 · arXiv

BIDO: An Out-Of-Distribution Resistant Image-based Malware Detector

While image-based detectors have shown promise in Android malware detection, they often struggle to maintain their performance and interpretability when encountering out-of-distribution (OOD) samples. Specifically, OOD samples generated by code obfuscation and concept drift exhibit distributions that significantly deviate from the detector's training data. Such shifts not only severely undermine the generalisation of detectors to OOD samples but also compromise the reliability of their associated interpretations. To address these challenges, we propose BIDO, a novel generative classifier that reformulates malware detection as a likelihood estimation task. Unlike conventional discriminative methods, BIDO jointly produces classification results and interpretations by explicitly modeling class-conditional distributions, thereby resolving the long-standing separation between detection and explanation. Empirical results demonstrate that BIDO substantially enhances robustness against extreme obfuscation and concept drift while achieving reliable interpretation without sacrificing performance. The source code is available at https://github.com/whatishope/BIDO/.
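The idea of classifying by likelihood estimation can be sketched with a toy generative classifier. This is not BIDO's model, which the abstract does not specify: the diagonal Gaussian per class, the two-feature toy data, the equal priors, and the names `fit_gaussian`, `log_likelihood`, and `classify` are all assumptions made for illustration.

```python
import math

def fit_gaussian(rows):
    """Fit a per-feature mean and variance for one class (diagonal Gaussian)."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    # A small floor keeps every variance strictly positive.
    variances = [max(sum((r[d] - means[d]) ** 2 for r in rows) / n, 1e-6)
                 for d in range(dims)]
    return means, variances

def log_likelihood(x, params):
    """Log density of x under a diagonal Gaussian."""
    means, variances = params
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, means, variances))

def classify(x, models, log_priors):
    """Pick the class whose class-conditional log-likelihood plus log prior is largest."""
    scores = {label: log_likelihood(x, params) + log_priors[label]
              for label, params in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical two-feature toy data standing in for image features.
benign = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.0]]
malware = [[0.9, 0.8], [0.8, 0.9], [1.0, 0.7], [0.7, 1.0]]
models = {"benign": fit_gaussian(benign), "malware": fit_gaussian(malware)}
log_priors = {"benign": math.log(0.5), "malware": math.log(0.5)}

label, scores = classify([0.85, 0.85], models, log_priors)
```

Because every class has its own density model, the per-class scores remain available alongside the predicted label, in contrast to a discriminative model that only learns a decision boundary.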

Preprint · 2026 · arXiv

DiCLIP: Diffusion Model Enhances CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation

Weakly Supervised Semantic Segmentation (WSSS) with image-level labels typically leverages Class Activation Maps (CAMs) to achieve pixel-level predictions. Recently, Contrastive Language-Image Pre-training (CLIP) has been introduced to generate CAMs in WSSS. However, previous WSSS methods solely adopt CLIP's vision-language paired property for dense localization, neglecting its inherently limited dense knowledge across both visual and text modalities, which renders CAM generation suboptimal. In this work, we propose DiCLIP, a novel WSSS framework that leverages the generative diffusion model to enhance CLIP's dense knowledge across two modalities. Specifically, Visual Correlation Enhancement (VCE) and Text Semantic Augmentation (TSA) modules are proposed for dense prediction enhancement. To improve the spatial awareness of visual features, our VCE module utilizes diffusion's reliable spatial consistency to mitigate the over-smoothing issue in CLIP's attention. It designs the Attention Clustering Refinement (ACR) module to reliably extract diverse correlation maps from the diffusion model. The correlation maps act as a diversity bias for CLIP's self-attention, recursively pushing its visual features towards a more discriminative dense distribution. To augment the semantics of text embeddings, our TSA module argues that a single text modality is insufficient to encompass the variability of visual categories. Thus, we leverage diffusion's generative power to maintain a dynamic key-value cache model, shifting CAM generation from a patch-text matching mechanism to a novel visual knowledge retrieval paradigm. With these enhancements, DiCLIP not only outperforms state-of-the-art methods on PASCAL VOC and MS COCO but also significantly reduces training costs. Code is publicly available at https://github.com/zwyang6/DiCLIP.

Preprint · 2026 · arXiv

Tracking Large-scale Shared Bikes with Inertial Motion Learning in GNSS Blocked Environments

Although Global Navigation Satellite Systems (GNSS) provide a general solution for bike tracking outdoors, there are still complex riding environments, such as urban canyons, where only inertial navigation systems work. Despite decades of research, localization using only low-cost inertial sensors still faces challenges such as cumulative drift and the poor robustness of filtering methods. Furthermore, visual and LiDAR sensors could provide reliable measurements, but they are not suitable for large-scale deployment. In this paper, we propose an inertial tracking framework that integrates bicycle mechanical constraints with a mixture-of-experts model. Specifically, we leverage multiple expert modules to capture shared representations and weight them through the gating mechanism, thus improving multi-task learning performance and enabling uncertainty-aware trajectory estimation. Furthermore, based on the mechanical transmission between the pedal and the rear wheel of a bike, we explore the intrinsic relationship between the rider's periodic pedalling behaviors and acceleration variations, and convert such patterns into the bike's wheel speed for dynamic calibration. Experiments with real-world riding data from shared bikes of the DiDi ride-hailing platform demonstrate that our system improves the accuracy of baselines by at least 12%, with wheel speed errors below 0.5 m/s at the 95th percentile.
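The mechanical link between pedalling and wheel speed can be illustrated with the standard chain-drive relation. This is a sketch under stated assumptions, not the paper's calibration method: it assumes a single fixed gear, a rider who is actively pedalling (no coasting on the freewheel), and hypothetical tooth counts and wheel diameter. The function name `wheel_speed_mps` is also an illustration.

```python
import math

def wheel_speed_mps(cadence_rpm, chainring_teeth, sprocket_teeth, wheel_diameter_m):
    """Ground speed implied by pedalling cadence through a fixed chain drive."""
    # One crank revolution turns the rear wheel gear_ratio times.
    gear_ratio = chainring_teeth / sprocket_teeth
    wheel_rev_per_s = cadence_rpm / 60.0 * gear_ratio
    # Each wheel revolution covers one wheel circumference.
    return wheel_rev_per_s * math.pi * wheel_diameter_m

# Hypothetical values: 60 rpm cadence, 36/18 gearing, 0.66 m wheel diameter.
speed = wheel_speed_mps(60, 36, 18, 0.66)
```

With these assumed values the wheel turns twice per second, so the speed is twice the wheel circumference per second. A cadence estimated from periodic acceleration patterns could be passed in as `cadence_rpm`.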

Preprint · 2022 · arXiv

DecBERT: Enhancing the Language Understanding of BERT with Causal Attention Masks

Since 2017, Transformer-based models have played critical roles in various downstream Natural Language Processing tasks. However, a common limitation of the attention mechanism used in the Transformer Encoder is that it cannot automatically capture word-order information, so explicit position embeddings generally have to be fed into the target model. In contrast, the Transformer Decoder with causal attention masks is naturally sensitive to word order. In this work, we focus on improving the position encoding ability of BERT with causal attention masks. Furthermore, we propose a new pre-trained language model, DecBERT, and evaluate it on the GLUE benchmark. Experimental results show that (1) the causal attention mask is effective for BERT on language understanding tasks; (2) our DecBERT model without position embeddings achieves comparable performance on the GLUE benchmark; and (3) our modification accelerates the pre-training process, and DecBERT w/ PE achieves better overall performance than the baseline systems when pre-trained with the same amount of computational resources.
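Why a causal mask makes attention sensitive to position can be seen in a small example. The sketch below is a generic masked softmax in plain Python, not DecBERT's implementation. The function names `causal_mask` and `masked_softmax` and the all-zero scores are assumptions chosen for illustration.

```python
import math

def causal_mask(n):
    """n x n mask: True where position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax over scores, giving zero weight where mask is False."""
    out = []
    for row, allowed in zip(scores, mask):
        kept = [s for s, ok in zip(row, allowed) if ok]
        peak = max(kept)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if ok else 0.0
                for s, ok in zip(row, allowed)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Identical scores everywhere: without a mask, every row of attention
# weights would be the same regardless of position.
scores = [[0.0] * 4 for _ in range(4)]
weights = masked_softmax(scores, causal_mask(4))
```

With the mask applied, position 0 attends only to itself, position 1 splits its weight over two tokens, and so on, so each row differs by position even though the raw scores are identical.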

Preprint · 2022 · arXiv

Detect Rumors in Microblog Posts for Low-Resource Domains via Adversarial Contrastive Learning

Massive false rumors emerging along with breaking news or trending topics severely hinder the truth. Existing rumor detection approaches achieve promising performance on yesterday's news, since there is enough corpus collected from the same domain for model training. However, they are poor at detecting rumors about unforeseen events, especially those propagated in different languages, due to the lack of training data and prior knowledge (i.e., low-resource regimes). In this paper, we propose an adversarial contrastive learning framework to detect rumors by adapting the features learned from well-resourced rumor data to low-resource data. Our model explicitly overcomes the restriction of domain and/or language usage via language alignment and a novel supervised contrastive training paradigm. Moreover, we develop an adversarial augmentation mechanism to further enhance the robustness of low-resource rumor representation. Extensive experiments conducted on two low-resource datasets collected from real-world microblog platforms demonstrate that our framework achieves much better performance than state-of-the-art methods and exhibits a superior capacity for detecting rumors at early stages.
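Supervised contrastive training pulls embeddings with the same label together and pushes others apart. The sketch below implements the widely used generic supervised contrastive loss, not the paper's specific paradigm, which the abstract does not detail. The function names, the temperature of 0.1, and the toy two-dimensional embeddings are assumptions for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def sup_con_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss, averaged over anchors with positives."""
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        # Log-sum-exp over all other samples, stabilised by subtracting the max.
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Hypothetical embeddings forming two tight clusters.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = sup_con_loss(embeddings, [0, 0, 1, 1])     # labels match the clusters
misaligned = sup_con_loss(embeddings, [0, 1, 0, 1])  # labels cut across clusters
```

The loss is small when same-label samples already sit close together and large when labels cut across the clusters, which is the signal that drives representations from different domains or languages toward a shared label-aligned space.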

Preprint · 2020 · arXiv

Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision

Violence detection has been studied in computer vision for years. However, previous work is either superficial (e.g., classification of short clips, a single scenario) or undersupplied (e.g., a single modality, multimodality based on hand-crafted features). To address this problem, in this work we first release a large-scale and multi-scene dataset named XD-Violence with a total duration of 217 hours, containing 4754 untrimmed videos with audio signals and weak labels. Then we propose a neural network containing three parallel branches to capture different relations among video snippets and integrate features, where the holistic branch captures long-range dependencies using a similarity prior, the localized branch captures local positional relations using a proximity prior, and the score branch dynamically captures the closeness of predicted scores. Besides, our method also includes an approximator to meet the needs of online detection. Our method outperforms other state-of-the-art methods on our released dataset and another existing benchmark. Moreover, extensive experimental results also show the positive effect of multimodal (audio-visual) input and of modeling relationships. The code and dataset will be released at https://roc-ng.github.io/XD-Violence/.

Preprint · 2014 · arXiv

A Carbon Corrosion Model to Evaluate the Effect of Steady State and Transient Operation of a Polymer Electrolyte Membrane Fuel Cell

A carbon corrosion model is developed based on the formation of surface oxides on the carbon and platinum of the polymer electrolyte membrane fuel cell electrode. The model predicts the rate of carbon corrosion under potential hold and potential cycling conditions. The model includes the interaction of carbon surface oxides with transient species like OH radicals to explain observed carbon corrosion trends under normal PEM fuel cell operating conditions. The model prediction agrees qualitatively with the experimental data, supporting the hypothesis that the interplay of surface oxide formation on carbon and platinum is the primary driver of carbon corrosion.

Preprint · 2011 · arXiv

Bed-inventory Overturn Mechanism for Pant-leg Circulating Fluidized Bed Boilers

A numerical model was established to investigate lateral mass transfer and the mechanism of bed-inventory overturn inside a pant-leg circulating fluidized bed (CFB), both of which are important for the safe and efficient operation of the CFB. Results show that the special flow structure, in which the solid particle volume fraction along the central line of the pant-leg CFB is relatively high, enlarges the lateral mass transfer rate and makes bed-inventory overturn more likely. Although the lateral pressure difference generated by lateral mass transfer inhibits further lateral mass transfer, giving the pant-leg CFB some self-balancing ability, the change in primary flow rate caused by the outlet pressure change often disables this ability by continually enhancing the flow rate difference. The more sensitive the flow rate of the primary air fan is to its outlet pressure, the more easily bed-inventory overturn occurs. By contrast, when the solid particles change their flow pattern more readily to follow the surrounding air flow, the self-balancing ability is more active.

Preprint · 2011 · arXiv

LHC Timing Signal Distribution and Beam Phase Monitoring Through ATLAS BPTX Monitoring System

The LHC timing signals are broadcast to various destinations in the experiments. All of them can be delayed to time in the ATLAS sub-detectors by using the Corde board and the RF2TTC module. The ATLAS BPTX detectors and monitoring system can guarantee the required synchronization of the beam in further collisions.
