Source author record

Pengfei Xia

Pengfei Xia appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: Unclaimed source record

Information Theory, Machine Learning, math.IT, Artificial Intelligence, Computer Vision, Cryptography and Security, Computation and Language, eess.IV, eess.SP, eess.SY, Multiagent Systems, Systems and Control

Catalog footprint

What is connected

11 works

12 topics

4 close collaborators


Preprint, 2026, arXiv

When KV Cache Reuse Fails in Multi-Agent Systems: Cross-Candidate Interaction is Crucial for LLM Judges

Multi-agent LLM systems routinely generate multiple candidate responses that are aggregated by an LLM judge. To reduce the dominant prefill cost in such pipelines, recent work advocates KV cache reuse across partially shared contexts and reports substantial speedups for generation agents. In this work, we show that these efficiency gains do not transfer uniformly to judge-centric inference. Across GSM8K, MMLU, and HumanEval, we find that reuse strategies that are effective for execution agents can severely perturb judge behavior: end-task accuracy may appear stable, yet the judge's selection becomes highly inconsistent with dense prefill. We quantify this risk using Judge Consistency Rate (JCR) and provide diagnostics showing that reuse systematically weakens cross-candidate attention, especially for later candidate blocks. Our ablation further demonstrates that explicit cross-candidate interaction is crucial for preserving dense-prefill decisions. Overall, our results identify a previously overlooked failure mode of KV cache reuse and highlight judge-centric inference as a distinct regime that demands dedicated, risk-aware system design.
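The abstract names the Judge Consistency Rate (JCR) but does not define it here. As a minimal sketch, assume JCR is the fraction of instances on which the judge picks the same candidate under KV cache reuse as under dense prefill. The function name `judge_consistency_rate` and the sample selections are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def judge_consistency_rate(dense_choices: Sequence[int],
                           reuse_choices: Sequence[int]) -> float:
    """Assumed definition: share of instances where the judge's selected
    candidate index under KV cache reuse equals its selection under
    dense prefill."""
    if len(dense_choices) != len(reuse_choices):
        raise ValueError("choice lists must have the same length")
    if len(dense_choices) == 0:
        raise ValueError("need at least one instance")
    agreements = sum(d == r for d, r in zip(dense_choices, reuse_choices))
    return agreements / len(dense_choices)


# Hypothetical selections (candidate index chosen per problem).
dense = [0, 2, 1, 1, 3]
reuse = [0, 1, 1, 2, 3]
print(judge_consistency_rate(dense, reuse))  # 3 of 5 selections agree -> 0.6
```

Under this assumed definition, end-task accuracy and JCR can diverge: two different selections may both be correct answers, so accuracy stays flat while JCR drops.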

Preprint, 2022, arXiv

A New Perspective on Stabilizing GANs training: Direct Adversarial Training

Generative Adversarial Networks (GANs) are among the most popular image generation models and have achieved remarkable progress on various computer vision tasks. However, training instability remains an open problem for all GAN-based algorithms. Many methods have been proposed to stabilize GAN training, focusing variously on loss functions, regularization and normalization techniques, training algorithms, and model architectures. In contrast to these methods, this paper presents a new perspective on stabilizing GAN training. We find that the images produced by the generator sometimes act like adversarial examples for the discriminator during training, which may be part of the reason for the instability. Based on this finding, we propose the Direct Adversarial Training (DAT) method to stabilize the training process of GANs. Furthermore, we prove that the DAT method is able to minimize the Lipschitz constant of the discriminator adaptively. The performance of DAT is verified on multiple loss functions, network architectures, hyper-parameters, and datasets. Specifically, for unconditional generation based on SSGAN, DAT achieves significant FID improvements of 11.5% on CIFAR-100, 10.5% on STL-10, and 13.2% on LSUN-Bedroom. Code will be available at https://github.com/iceli1007/DAT-GAN

Preprint, 2022, arXiv

Data-Efficient Backdoor Attacks

Recent studies have proven that deep neural networks are vulnerable to backdoor attacks. Specifically, by mixing a small number of poisoned samples into the training set, the behavior of the trained model can be maliciously controlled. Existing attack methods construct such adversaries by randomly selecting some clean data from the benign set and then embedding a trigger into them. However, this selection strategy ignores the fact that each poisoned sample contributes unequally to the backdoor injection, which reduces the efficiency of poisoning. In this paper, we formulate the improvement of poisoned-data efficiency through sample selection as an optimization problem and propose a Filtering-and-Updating Strategy (FUS) to solve it. The experimental results on CIFAR-10 and ImageNet-10 indicate that the proposed method is effective: the same attack success rate can be achieved with only 47% to 75% of the poisoned sample volume compared to the random selection strategy. More importantly, the adversaries selected according to one setting can generalize well to other settings, exhibiting strong transferability. The prototype code of our method is now available at https://github.com/xpf/Data-Efficient-Backdoor-Attacks.

Preprint, 2022, arXiv

Enhancing Backdoor Attacks with Multi-Level MMD Regularization

While Deep Neural Networks (DNNs) excel in many tasks, the huge training resources they require become an obstacle for practitioners to develop their own models. It has become common to collect data from the Internet or hire a third party to train models. Unfortunately, recent studies have shown that these operations provide a viable pathway for maliciously injecting hidden backdoors into DNNs. Several defense methods have been developed to detect malicious samples, with the common assumption that the latent representations of benign and malicious samples extracted by the infected model exhibit different distributions. However, a comprehensive study on the distributional differences is missing. In this paper, we investigate such differences thoroughly by answering three questions: 1) What are the characteristics of the distributional differences? 2) How can they be effectively reduced? 3) What impact does this reduction have on difference-based defense methods? First, the distributional differences of multi-level representations on the regularly trained backdoored models are verified to be significant by introducing Maximum Mean Discrepancy (MMD), Energy Distance (ED), and Sliced Wasserstein Distance (SWD) as the metrics. Then, ML-MMDR, a difference reduction method that adds multi-level MMD regularization into the loss, is proposed, and its effectiveness is tested on three typical difference-based defense methods. Across all the experimental settings, the F1 scores of these methods drop from 90%-100% on the regularly trained backdoored models to 60%-70% on the models trained with ML-MMDR. These results indicate that the proposed MMD regularization can enhance the stealthiness of existing backdoor attack methods. The prototype code of our method is now available at https://github.com/xpf/Multi-Level-MMD-Regularization.
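To make the MMD metric mentioned above concrete, the following is a minimal sketch of the standard biased estimate of squared MMD with a Gaussian kernel, applied to two small sets of feature vectors. It is not the authors' ML-MMDR implementation: the function names, the bandwidth value, and the toy "benign" and "poisoned" vectors are assumptions made for illustration only.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def gaussian_kernel(x: Vector, y: Vector, bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector],
                bandwidth: float) -> float:
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(benign: Sequence[Vector], poisoned: Sequence[Vector],
                bandwidth: float = 1.0) -> float:
    """Biased estimate of squared MMD between two samples:
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    return (mean_kernel(benign, benign, bandwidth)
            + mean_kernel(poisoned, poisoned, bandwidth)
            - 2.0 * mean_kernel(benign, poisoned, bandwidth))


# Toy latent representations (hypothetical values).
benign_feats = [[0.0, 0.0], [0.1, 0.0]]
poisoned_far = [[3.0, 3.0], [3.1, 3.0]]     # clearly separated from benign
poisoned_near = [[0.05, 0.0], [0.1, 0.1]]   # close to benign

print(mmd_squared(benign_feats, poisoned_far))
print(mmd_squared(benign_feats, poisoned_near))
```

A regularizer in the spirit of the abstract would add a term of this kind, computed at several network layers, to the training loss so that the poisoned representations move toward the benign ones; how the levels are chosen and weighted is not specified in the abstract.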

Preprint, 2022, arXiv

Tightening the Approximation Error of Adversarial Risk with Auto Loss Function Search

Despite achieving great success, Deep Neural Networks (DNNs) are vulnerable to adversarial examples. How to accurately evaluate the adversarial robustness of DNNs is critical for their deployment in real-world applications. An ideal indicator of robustness is adversarial risk. Unfortunately, since it involves maximizing the 0-1 loss, calculating the true risk is technically intractable. The most common solution for this is to compute an approximate risk by replacing the 0-1 loss with a surrogate one. Some functions have been used, such as Cross-Entropy (CE) loss and Difference of Logits Ratio (DLR) loss. However, these functions are all manually designed and may not be well suited for adversarial robustness evaluation. In this paper, we leverage AutoML to tighten the error (gap) between the true and approximate risks. Our main contributions are as follows. First, AutoLoss-AR, the first method to search for surrogate losses for adversarial risk, with an elaborate search space, is proposed. The experimental results on 10 adversarially trained models demonstrate the effectiveness of the proposed method: the risks evaluated using the best-discovered losses are 0.2% to 1.6% better than those evaluated using the handcrafted baselines. Second, 5 surrogate losses with clean and readable formulas are distilled out and tested on 7 unseen adversarially trained models. These losses outperform the baselines by 0.8% to 2.4%, indicating that they can be used individually as some kind of new knowledge. Besides, the possible reasons for the better performance of these losses are explored.

Preprint, 2021, arXiv

Understanding the Error in Evaluating Adversarial Robustness

Deep neural networks are easily misled by adversarial examples. Although many defense methods have been proposed, many of them have been shown to lose effectiveness against properly performed adaptive attacks. How to evaluate adversarial robustness effectively is important for the realistic deployment of deep models, yet it remains unclear. To provide a reasonable solution, a primary step is to understand the error (or gap) between the true adversarial robustness and the evaluated one: what it is and why it exists. This paper takes several steps to clarify it. First, we introduce an interesting phenomenon named gradient traps, which lead to incompetent adversaries and are demonstrated to be a manifestation of evaluation error. Then, we analyze the error and identify three components, each caused by a specific compromise. Moreover, based on the above analysis, we present our evaluation suggestions. Experiments on adversarial training and its variations indicate that: (1) the error does exist empirically, and (2) these defenses are still vulnerable. We hope these analyses and results will help the community to develop more powerful defenses.

Preprint, 2020, arXiv

Adaptive Distributed Laser Charging for Efficient Wireless Power Transfer

Distributed laser charging (DLC) is a wireless power transfer technology for mobile electronics. Similar to traditional wireless charging systems, the DLC system can only provide constant power to charge a battery. However, a Li-ion battery needs dynamic input current and voltage, and thus dynamic power, in order to optimize battery charging performance. Therefore, neither power transmission efficiency nor battery charging performance can be optimized by the DLC system. We first propose an adaptive DLC (ADLC) system to optimize wireless power transfer efficiency and battery charging performance. Then, we analyze ADLC's power conversion to depict the adaptation mechanism. Finally, we evaluate ADLC's power conversion performance by simulation, which shows an efficiency improvement that saves at least 60.4% of energy compared with the fixed-power charging system.

Preprint, 2016, arXiv

Codebook Design for Millimeter-Wave Channel Estimation with Hybrid Precoding Structure

In this paper, we study hierarchical codebook design for channel estimation in millimeter-wave (mmWave) communications with a hybrid precoding structure. Due to the limited saturation power of mmWave power amplifier (PA), we take the per-antenna power constraint (PAPC) into consideration. We first propose a metric, i.e., generalized detection probability (GDP), to evaluate the quality of an arbitrary codeword. This metric not only enables an optimization approach for mmWave codebook design, but also can be used to compare the performance of two different codewords/codebooks. To the best of our knowledge, GDP is the first metric particularly for mmWave codebook design for channel estimation. We then propose an approach to design a hierarchical codebook exploiting BeaM Widening with Multi-RF-chain Sub-array technique (BMW-MS). To obtain crucial parameters of BMW-MS, we provide two solutions, namely a low-complexity search (LCS) solution to optimize the GDP metric and a closed-form (CF) solution to pursue a flat beam pattern. Performance comparisons show that BMW-MS/LCS and BMW-MS/CF achieve very close performances, and they outperform the existing alternatives under the PAPC.

Preprint, 2016, arXiv

Enabling UAV Cellular with Millimeter-Wave Communication: Potentials and Approaches

To support high data rate urgent or ad hoc communications, we consider mmWave UAV cellular networks and the associated challenges and solutions. To enable fast beamforming training and tracking, we first investigate a hierarchical structure of beamforming codebooks and design of hierarchical codebooks with different beam widths via the sub-array techniques. We next examine the Doppler effect as a result of UAV movement and find that the Doppler effect may not be catastrophic when high gain directional transmission is used. We further explore the use of millimeter wave spatial division multiple access and demonstrate its clear advantage in improving the cellular network capacity. We also explore different ways of dealing with signal blockage and point out that possible adaptive UAV cruising algorithms would be necessary to counteract signal blockage. Finally, we identify a close relationship between UAV positioning and directional millimeter wave user discovery, where update of the former may directly impact the latter and vice versa.

Preprint, 2016, arXiv

Hierarchical Codebook Design for Beamforming Training in Millimeter-Wave Communication

In millimeter-wave communication, large antenna arrays are required to achieve high power gain by steering towards each other with narrow beams, which poses the problem of efficiently searching for the best beam direction in the angle domain at both the Tx and Rx sides. As exhaustive search is time consuming, hierarchical search has been widely accepted to reduce the complexity, and its performance is highly dependent on the codebook design. In this paper, we propose two basic criteria for hierarchical codebook design, and devise an efficient hierarchical codebook by jointly exploiting sub-array and deactivation (turning-off) antenna processing techniques, where closed-form expressions are provided to generate the codebook. Performance evaluations are conducted under different system and channel models. Results show the superiority of the proposed codebook over the existing alternatives.

Preprint, 2016, arXiv

Low Complexity Hybrid Precoding and Channel Estimation Based on Hierarchical Multi-Beam Search for Millimeter-Wave MIMO Systems

In millimeter-wave (mmWave) MIMO systems, while a hybrid digital/analog precoding structure offers the potential to increase the achievable rate, it also calls for a low-complexity design. Specifically, the hybrid precoding may require matrix operations that scale with the antenna size, which is generally large in mmWave communication. Moreover, the channel estimation is also rather time consuming due to the large number of antennas at both the Tx and Rx sides. In this paper, a low-complexity hybrid precoding and channel estimation approach is proposed. In the channel estimation phase, a hierarchical multi-beam search scheme is proposed to quickly acquire the $N_{\rm{S}}$ (the number of streams) multipath components (MPCs)/clusters with the highest powers. In the hybrid precoding phase, the analog and digital precodings are decoupled. The analog precoding is designed to steer along the $N_{\rm{S}}$ acquired MPCs/clusters at both the Tx and Rx sides, shaping an equivalent $N_{\rm{S}}\times N_{\rm{S}}$ baseband channel, while the digital precoding performs operations in the baseband with the reduced-scale channel. Performance evaluations show that, compared with a state-of-the-art scheme, both the computational complexity of the hybrid precoding and the time complexity of the channel estimation are greatly reduced, while performance is close or even better when the number of radio-frequency (RF) chains or streams is small.
