Researcher profile

Xu Cheng

Xu Cheng contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
19 works
0 followers
9 topics
4 close collaborators

Published work

19 published item(s)

preprint · 2026 · arXiv

SCRWKV: Ultra-Compact Structure-Calibrated Vision-RWKV for Topological Crack Segmentation

Achieving pixel-level accurate segmentation of structural cracks across diverse scenarios remains a formidable challenge. Existing methods face significant bottlenecks in balancing crack topology modeling with computational efficiency, often failing to reconcile high segmentation quality with low resource demands. To address these limitations, we propose the Ultra-Compact Structure-Calibrated Vision RWKV (SCRWKV), a network that achieves high-precision modeling via a novel Structure-Field Encoder (SFE) backbone while maintaining linear complexity. The SFE integrates the Adaptive Multi-scale Cascaded Modulator (AMCM) to enhance texture representation and utilizes the Structure-Calibrated Insight Unit (SCIU) as its core engine. Specifically, the SCIU employs the Geometry-guided Bidirectional Structure Transformation (GBST) to capture topological correlations and integrates the Dynamic Self-Calibrating Decay (DSCD) into Dy-WKV to suppress noise propagation. Furthermore, we introduce a lightweight Cross-Scale Harmonic Fusion (CSHF) decoder to achieve precise feature aggregation. Systematic evaluations on multiple benchmarks characterized by complex textures and severe interference demonstrate that SCRWKV, with only 1.22M parameters, significantly outperforms SOTA methods. Achieving an F1 score of 0.8428 and mIoU of 0.8512 on the TUT dataset, the model confirms its robust potential for efficient real-world deployment. The code is available at https://github.com/zhxhzy/SCRWKV.

preprint · 2023 · arXiv

Advancing 3D finger knuckle recognition via deep feature learning

Contactless 3D finger knuckle patterns have emerged as an effective biometric identifier due to their discriminativeness, visibility from a distance, and convenience. Recent research has developed a deep feature collaboration network which simultaneously incorporates intermediate features from deep neural networks at multiple scales. However, this approach results in a large feature dimension, and the trained classification layer is required for comparing probe samples, which limits the introduction of new classes. This paper advances this approach by investigating the possibility of learning a discriminative feature vector with the least possible dimension for representing 3D finger knuckle images. Experimental results are presented using a publicly available 3D finger knuckle image database, with comparisons to popular deep learning architectures and state-of-the-art 3D finger knuckle recognition methods. The proposed approach outperforms these baselines in classification and identification tasks under the more practical feature comparison scenario, i.e., using the extracted deep feature instead of the trained classification layer for comparing probe samples. More importantly, this approach can offer a 99% reduction in the size of feature templates, which is highly attractive for deploying biometric systems in the real world. Experiments are also performed using two other public biometric databases with similar patterns to ascertain the effectiveness and generalizability of the proposed approach.

preprint · 2022 · arXiv

A Review of Federated Learning in Energy Systems

With increasing concerns for data privacy and ownership, recent years have witnessed a paradigm shift in machine learning (ML). An emerging paradigm, federated learning (FL), has gained great attention and has become a novel design for machine learning implementations. FL enables ML model training at data silos under the coordination of a central server, eliminating communication overhead without sharing raw data. In this paper, we conduct a review of the FL paradigm and, in particular, compare the types, the network structures, and the global model aggregation methods. We then conduct a comprehensive review of FL applications in the energy domain (which refers to the smart grid in this paper). We provide a thematic classification of FL approaches that address a variety of energy-related problems, including demand response, identification, prediction, and federated optimizations. We describe the taxonomy in detail and conclude with a discussion of various aspects, including challenges, opportunities, and limitations in its energy informatics applications, such as energy system modeling and design, privacy, and evolution.

preprint · 2022 · arXiv

Abusing Cache Line Dirty States to Leak Information in Commercial Processors

Caches have been used to construct various types of covert and side channels to leak information. Most existing cache channels exploit the timing difference between cache hits and cache misses. However, we introduce a new and broader classification of cache covert channel attacks: Hit+Miss, Hit+Hit, and Miss+Miss. We highlight that cache misses for cache lines in different states may have more significant time differences, and these can be used as timing channels. Based on this classification, we propose a new stable and stealthy Miss+Miss cache channel. Write-back caches are widely deployed in modern processors. This paper presents in detail a way in which replacement latency differences can be used to construct timing-based channels (called WB channels) to leak information in a write-back cache. Any modification to a cache line by a sender will set it to the dirty state, and the receiver can observe this through measuring the latency of replacing this cache set. We also demonstrate how senders could exploit a different number of dirty cache lines in a cache set to improve transmission bandwidth with symbols encoding multiple bits. The peak transmission bandwidths of the WB channels in commercial systems can vary between 1300 and 4400 kbps per cache set in a hyper-threaded setting without shared memory between the sender and the receiver. In contrast to most existing cache channels, which always target specific memory addresses, the new WB channels focus on the cache set and cache line states, making it difficult for the channel to be disturbed by other processes on the core, and they can still work in a cache using a random replacement policy. We also analyzed the stealthiness of WB channels from the perspective of the number of cache loads and cache miss rates. We discuss and evaluate possible defenses. The paper finishes by discussing various forms of side-channel attack.

preprint · 2022 · arXiv

GRAND+: Scalable Graph Random Neural Networks

Graph neural networks (GNNs) have been widely adopted for semi-supervised learning on graphs. A recent study shows that the graph random neural network (GRAND) model can achieve state-of-the-art performance for this problem. However, it is difficult for GRAND to handle large-scale graphs since its effectiveness relies on computationally expensive data augmentation procedures. In this work, we present GRAND+, a scalable and high-performance GNN framework for semi-supervised graph learning. To address the above issue, we develop a generalized forward push (GFPush) algorithm in GRAND+ to pre-compute a general propagation matrix and employ it to perform graph data augmentation in a mini-batch manner. We show that the low time and space complexities of GFPush enable GRAND+ to efficiently scale to large graphs. Furthermore, we introduce a confidence-aware consistency loss into the model optimization of GRAND+, which improves its generalization. We conduct extensive experiments on seven public datasets of different sizes. The results demonstrate that GRAND+ 1) is able to scale to large graphs and requires less running time than existing scalable GNNs, and 2) can offer consistent accuracy improvements over both full-batch and scalable GNNs across all datasets.

preprint · 2022 · arXiv

Masked Autoencoders for Generic Event Boundary Detection CVPR'2022 Kinetics-GEBD Challenge

Generic Event Boundary Detection (GEBD) tasks aim at detecting generic, taxonomy-free event boundaries that segment a whole video into chunks. In this paper, we apply Masked Autoencoders to improve algorithm performance on the GEBD tasks. Our approach mainly adopts an ensemble of Masked Autoencoders, fine-tuned on the GEBD task as a self-supervised learner, together with other base models. Moreover, we use a semi-supervised pseudo-label method to take full advantage of the abundant unlabeled Kinetics-400 data while training. In addition, we propose a soft-label method to partially balance the positive and negative samples and alleviate the problem of ambiguous labeling in this task. Lastly, a tricky segmentation alignment policy is implemented to refine boundaries predicted by our models to more accurate locations. With our approach, we achieved an F1-score of 85.94% on the Kinetics-GEBD test set, which improved the F1-score by 2.31% compared to the winner of the 2021 Kinetics-GEBD Challenge. Our code is available at https://github.com/ContentAndMaterialPortrait/MAE-GEBD.

preprint · 2022 · arXiv

Quantifying the Knowledge in a DNN to Explain Knowledge Distillation for Classification

Compared to traditional learning from scratch, knowledge distillation sometimes makes the DNN achieve superior performance. This paper provides a new perspective to explain the success of knowledge distillation, i.e., quantifying knowledge points encoded in intermediate layers of a DNN for classification, based on information theory. To this end, we consider the signal processing in a DNN as layer-wise information discarding. A knowledge point refers to an input unit whose information is discarded much less than that of other input units. Thus, we propose three hypotheses for knowledge distillation based on the quantification of knowledge points. 1. The DNN learning from knowledge distillation encodes more knowledge points than the DNN learning from scratch. 2. Knowledge distillation makes the DNN more likely to learn different knowledge points simultaneously. In comparison, the DNN learning from scratch tends to encode various knowledge points sequentially. 3. The DNN learning from knowledge distillation is often optimized more stably than the DNN learning from scratch. In order to verify the above hypotheses, we design three types of metrics with annotations of foreground objects to analyze feature representations of the DNN, i.e., the quantity and the quality of knowledge points, the learning speed of different knowledge points, and the stability of optimization directions. In experiments, we diagnosed various DNNs for different classification tasks, i.e., image classification, 3D point cloud classification, binary sentiment classification, and question answering, which verified the above hypotheses.

preprint · 2022 · arXiv

SCR: Training Graph Neural Networks with Consistency Regularization

We present the SCR framework for enhancing the training of graph neural networks (GNNs) with consistency regularization. Regularization is a set of strategies used in machine learning to reduce overfitting and improve generalization. However, it is unclear how to best design the generalization strategies in GNNs, as they work in a semi-supervised setting for graph data. The major challenge lies in how to efficiently balance the trade-off between the error from the labeled data and that from the unlabeled data. SCR is a simple yet general framework in which we introduce two strategies of consistency regularization to address the challenge above. One is to minimize the disagreements among the perturbed predictions by different versions of a GNN model. The other is to leverage the Mean Teacher paradigm to estimate a consistency loss between teacher and student models instead of the disagreement of the predictions. We conducted experiments on three large-scale node classification datasets in the Open Graph Benchmark (OGB). Experimental results demonstrate that the proposed SCR framework is a general one that can enhance various GNNs to achieve better performance. Finally, SCR has been the top-1 entry on all three OGB leaderboards as of this submission.
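
The first strategy in the abstract above, minimizing disagreement among perturbed predictions, can be sketched as follows. This is a minimal illustration and not the SCR implementation: the function name `consistency_loss`, the squared-distance measure, and the toy probability vectors are all assumptions made for the example.

```python
def consistency_loss(predictions):
    """Mean squared distance of each perturbed prediction from their average.

    `predictions` is a list of probability vectors for one unlabeled node,
    each assumed to come from a differently perturbed forward pass.
    """
    k = len(predictions)
    dim = len(predictions[0])
    # Average the perturbed predictions to obtain a shared target.
    mean = [sum(p[i] for p in predictions) / k for i in range(dim)]
    # Penalize how far each perturbed prediction lies from that target.
    total = sum(
        sum((p[i] - mean[i]) ** 2 for i in range(dim)) for p in predictions
    )
    return total / k


# Hypothetical predictions from two perturbed passes over the same node.
agree = [[0.9, 0.1], [0.9, 0.1]]
disagree = [[0.9, 0.1], [0.1, 0.9]]

print(consistency_loss(agree))
print(consistency_loss(disagree))
```

The loss is zero when the perturbed passes agree and grows as they diverge, so adding it to the supervised loss pushes the model toward stable predictions on unlabeled nodes.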

preprint · 2022 · arXiv

Some rigidity properties for $λ$-self-expanders

$λ$-self-expanders $Σ$ in $\mathbb{R}^{n+1}$ are the solutions of the isoperimetric problem with respect to the same weighted area form as in the study of the self-expanders. In this paper, we mainly extend the results on self-expanders which we obtained in \cite{ancari2020volum} to $λ$-self-expanders. We prove some results that characterize the hyperplanes, spheres and cylinders as $λ$-self-expanders. We also discuss the area growths and the finiteness of the weighted areas under the control of the growth of the mean curvature.

preprint · 2022 · arXiv

Why Adversarial Training of ReLU Networks Is Difficult?

This paper mathematically derives an analytic solution of the adversarial perturbation on a ReLU network, and theoretically explains the difficulty of adversarial training. Specifically, we formulate the dynamics of the adversarial perturbation generated by the multi-step attack, which shows that the adversarial perturbation tends to strengthen eigenvectors corresponding to a few top-ranked eigenvalues of the Hessian matrix of the loss w.r.t. the input. We also prove that adversarial training tends to strengthen the influence of unconfident input samples with large gradient norms in an exponential manner. Besides, we find that adversarial training strengthens the influence of the Hessian matrix of the loss w.r.t. network parameters, which makes adversarial training more likely to oscillate along directions of a few samples and increases its difficulty. Crucially, our proofs provide a unified explanation for previous findings in understanding adversarial training.

preprint · 2021 · arXiv

Building Interpretable Interaction Trees for Deep NLP Models

This paper proposes a method to disentangle and quantify interactions among words that are encoded inside a DNN for natural language processing. We construct a tree to encode salient interactions extracted by the DNN. Six metrics are proposed to analyze properties of interactions between constituents in a sentence. The interaction is defined based on Shapley values of words, which are considered an unbiased estimate of word contributions to the network prediction. Our method is used to quantify word interactions encoded inside BERT, ELMo, LSTM, CNN, and Transformer networks. Experimental results have provided a new perspective to understand these DNNs, and have demonstrated the effectiveness of our method.
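
As background for the Shapley values mentioned in the abstract above, the following sketch computes exact Shapley values for a three-word toy example by averaging marginal contributions over all word orderings. It does not reproduce the paper's interaction metrics or tree construction; the value function `toy_value` and its scores are hypothetical stand-ins for a network output.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        seen = set()
        for p in order:
            before = value(frozenset(seen))
            seen.add(p)
            # Marginal contribution of p given the words that came before it.
            totals[p] += value(frozenset(seen)) - before
    return {p: t / len(orderings) for p, t in totals.items()}


def toy_value(words):
    """Hypothetical model score for a subset of the sentence's words."""
    score = 0.0
    if "good" in words:
        score += 1.0
    # "not" has no effect alone but flips the sentiment of "good".
    if "not" in words and "good" in words:
        score -= 2.0
    return score


phi = shapley_values(["not", "good", "movie"], toy_value)
print(phi)
```

The attributions sum to the score of the full sentence, and the joint effect of "not" and "good" is split evenly between the two words, which is the kind of shared contribution an interaction measure tries to isolate.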

preprint · 2020 · arXiv

DangKiller: Eliminating Dangling Pointers Efficiently via Implicit Identifier

Use-After-Free vulnerabilities, which allow an attacker to access unintended memory via dangling pointers, are a growing threat. However, most detection schemes can only detect dangling pointers and invalidate them; they do not provide a tolerance mechanism to repair the errors at runtime. These techniques also obtain and manage metadata inefficiently, relying on complex structures and excessive scanning (sweeping). The goal of this paper is to use compiler instrumentation to eliminate dangling pointers automatically and efficiently. We observe that most techniques lack an accurate and efficient method for maintaining pointer graph metadata, so they need to scan the log to reduce redundancy and sweep the whole address space to find dangling pointers. They also lack a direct, efficient way to obtain metadata. The key insight of this paper is that a unique identifier can be used as a key to a hash or direct-map algorithm. Thus, this paper maintains the same implicit identifier for each memory object and its corresponding referent. By associating the unique ID with the metadata for memory objects, the pointer graph metadata can be obtained and managed efficiently. Therefore, with the delayed free technique adopted into C/C++, we present DangKiller as a novel and lightweight dangling pointer elimination solution. We first demonstrate the MinFat Pointer, which can quickly calculate a unique implicit ID for each object and pointer and uses a hash algorithm to obtain metadata. Secondly, we propose the Log Cache and Log Compression mechanisms based on the ID to decrease the redundancy of dangling pointer candidates. Coupled with the Address Tagging architecture on an ARM64 system, our experiments show that DangKiller can eliminate use-after-free vulnerabilities at only 11% and 3% runtime overheads for the SPEC CPU2006 and 2017 benchmarks respectively, except for unique cases.

preprint · 2020 · arXiv

Explaining Knowledge Distillation by Quantifying the Knowledge

This paper presents a method to interpret the success of knowledge distillation by quantifying and analyzing task-relevant and task-irrelevant visual concepts that are encoded in intermediate layers of a deep neural network (DNN). More specifically, three hypotheses are proposed as follows. 1. Knowledge distillation makes the DNN learn more visual concepts than learning from raw data. 2. Knowledge distillation ensures that the DNN is prone to learning various visual concepts simultaneously. In contrast, when learning from raw data, the DNN learns visual concepts sequentially. 3. Knowledge distillation yields more stable optimization directions than learning from raw data. Accordingly, we design three types of mathematical metrics to evaluate feature representations of the DNN. In experiments, we diagnosed various DNNs, and the above hypotheses were verified.

preprint · 2020 · arXiv

Rotation-Equivariant Neural Networks for Privacy Protection

In order to prevent leaking input information from intermediate-layer features, this paper proposes a method to revise the traditional neural network into the rotation-equivariant neural network (RENN). Compared to the traditional neural network, the RENN uses d-ary vectors/tensors as features, in which each element is a d-ary number. These d-ary features can be rotated (analogous to the rotation of a d-dimensional vector) by a random angle as the encryption process. Input information is hidden in this target phase of d-ary features for attribute obfuscation. Even if attackers have obtained network parameters and intermediate-layer features, they cannot extract input information without knowing the target phase. Hence, input privacy can be effectively protected by the RENN. Besides, the output accuracy of RENNs degrades only mildly compared to traditional neural networks, and the computational cost is significantly lower than that of homomorphic encryption.
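
The rotation-as-encryption idea in the abstract above can be illustrated in the simplest setting. The sketch below assumes d = 2 and represents each feature as an ordinary complex number, which is an assumption made for illustration; the helper `rotate` and the toy feature values are hypothetical and do not reproduce the RENN architecture.

```python
import cmath
import random


def rotate(features, angle):
    """Rotate every complex feature by the same angle in the complex plane."""
    factor = cmath.exp(1j * angle)
    return [f * factor for f in features]


# The secret phase is known only to the party that owns the input.
secret_angle = random.uniform(0.0, 2.0 * cmath.pi)

features = [complex(1.0, 0.0), complex(0.5, -2.0)]
encrypted = rotate(features, secret_angle)      # what an observer would see
decrypted = rotate(encrypted, -secret_angle)    # requires the secret phase

print(encrypted)
print(decrypted)
```

Rotation preserves each feature's magnitude, so only the phase is hidden; undoing it requires the secret angle, which matches the abstract's point that features alone are not enough to recover the input.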

preprint · 2020 · arXiv

S3Library: Automatically Eliminating C/C++ Buffer Overflow using Compatible Safer Libraries

Annex K of C11, bounds-checking interfaces, recently introduced a set of alternative functions to mitigate buffer overflows, primarily those caused by string/memory functions. However, poor compatibility limits their adoption. Failure-oblivious computing can eliminate the possibility that an attacker can exploit memory errors to corrupt the address space, and can significantly increase the availability of systems. In this paper, we present S3Library (Saturation-Memory-Access Safer String Library), which is compatible with the standard C library in terms of function signature. Our technique automatically replaces unsafe deprecated memory/string functions with safer versions that perform bounds checking and eliminate buffer overflows via boundless memory. S3Library employs MinFat, a very compact pointer representation following the Less is More principle, to encode metadata into unused upper bits within pointers. In addition, S3Library utilizes Saturation Memory Access to redirect illegal memory accesses into a boundless padding area. Even if an out-of-bounds access is made, the faulty program will not be interrupted. We implement our scheme within the LLVM framework on X86-64 and evaluate our approach on correctness, security, runtime performance and availability.

preprint · 2020 · arXiv

Saturation Memory Access: Mitigating Memory Spatial Errors without Terminating Programs

Memory spatial errors, i.e., buffer overflow vulnerabilities, have been a well-known issue in computer security for a long time and remain one of the root causes of exploitable vulnerabilities. Most of the existing mitigation tools adopt a fail-stop strategy to protect programs from intrusions, which means the victim program will be terminated upon detecting a memory safety violation. Unfortunately, the fail-stop strategy harms the availability of software. In this paper, we propose Saturation Memory Access (SMA), a memory spatial error mitigation mechanism that prevents out-of-bounds access without terminating a program. SMA is based on a key observation that developers generally do not rely on out-of-bounds accesses to implement program logic. SMA modifies dynamic memory allocators and adds padding to objects to form an enlarged object boundary. By dynamically correcting all the out-of-bounds accesses to operate on the enlarged protecting boundaries, SMA can tolerate out-of-bounds accesses. For the sake of compatibility, we chose tagged pointers to record the boundary metadata of a memory object in the pointer itself, and correct the address upon detecting out-of-bounds access. We have implemented the prototype of SMA on LLVM 10.0. Our results show that our compiler enables the programs to execute successfully through buffer overflow attacks. Experiments on MiBench show that our prototype incurs an overhead of 78%. Further optimizations would require ISA support.
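
The address correction described in the abstract above can be sketched as a clamp onto the padded object. This is a simulation in Python over a `bytearray`, not the LLVM prototype: the function name `saturate`, the placement of padding only after the object, and the exact clamping rule are assumptions made for illustration.

```python
def saturate(base, size, padding, address):
    """Clamp `address` into the enlarged region [base, base + size + padding - 1]."""
    low = base
    high = base + size + padding - 1
    return max(low, min(address, high))


# Simulated heap: an 8-byte object with 4 bytes of padding, then a neighbor.
heap = bytearray(32)
obj_base, obj_size, pad = 0, 8, 4
neighbor_base = obj_base + obj_size + pad
heap[neighbor_base] = 0x7F

# An overflowing write 20 bytes past the object's base is redirected into
# the padding instead of corrupting the neighbor or aborting the program.
target = saturate(obj_base, obj_size, pad, obj_base + 20)
heap[target] = 0xFF

print(target, heap[neighbor_base])
```

In-bounds addresses pass through unchanged, while out-of-bounds ones land on the last padding byte, so the neighboring object stays intact and execution continues.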

preprint · 2020 · arXiv

Volume growth estimates for Ricci solitons and quasi-Einstein manifolds

In this article, we provide some volume growth estimates for complete noncompact gradient Ricci solitons and quasi-Einstein manifolds similar to the classical results by Bishop, Calabi and Yau for complete Riemannian manifolds with nonnegative Ricci curvature. We prove a sharp volume growth estimate for complete noncompact gradient shrinking Ricci solitons. Moreover, we provide upper bound volume growth estimates for complete noncompact quasi-Einstein manifolds with $λ=0$. In addition, we prove that geodesic balls of complete noncompact quasi-Einstein manifolds with $λ<0$ and $μ\leq 0$ have at most exponential volume growth.