Researcher profile

Qing Ye

Qing Ye contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2026arXiv

ForgeVLA: Federated Vision-Language-Action Learning without Language Annotations

Vision-Language-Action (VLA) models hold great promise for general-purpose robotic intelligence, yet scaling up such models is severely bottlenecked by the high cost of acquiring annotated training data. Fortunately, vision-equipped robots deployed across various domains already produce abundant vision-action pairs that can be leveraged to scale up VLA training more efficiently. However, these raw data cannot be centrally aggregated due to various constraints and also exhibit severe heterogeneity. To address these challenges, in this paper, we propose ForgeVLA, a federated VLA training framework that learns VLA models from distributed vision-action pairs without centralizing raw data or requiring manual annotations. Specifically, each client in ForgeVLA is equipped with an embodied instruction classifier that maps vision-action pairs to a predefined instruction set, recovering the missing language modality and forming complete vision-language-action triplets. Beyond triplet construction, we also identify vision-language feature collapse as a critical challenge that has been largely overlooked in prior federated VLA research. To mitigate this issue, ForgeVLA combines a client-side contrastive planning loss with a server-side adaptive aggregation strategy to learn task-discriminative representations efficiently. Extensive experiments across multiple benchmarks show that ForgeVLA significantly outperforms other baselines, and ablation studies further validate the contribution of each component.

preprint2022arXiv

Deep Learning for 1-Bit Compressed Sensing-based Superimposed CSI Feedback

In frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems, 1-bit compressed sensing (CS)-based superimposed channel state information (CSI) feedback has shown many advantages, while still faces many challenges, such as low accuracy of the downlink CSI recovery and large processing delays. To overcome these drawbacks, this paper proposes a deep learning (DL) scheme to improve the 1-bit compressed sensing-based superimposed CSI feedback. On the user side, the downlink CSI is compressed with the 1-bit CS technique, superimposed on the uplink user data sequences (UL-US), and then sent back to the base station (BS). At the BS, based on the model-driven approach and assisted by the superimposition-interference cancellation technology, a multi-task detection network is first constructed for detecting both the UL-US and downlink CSI. In particular, this detection network is jointly trained to detect the UL-US and downlink CSI simultaneously, capturing a globally optimized network parameter. Then, with the recovered bits for the downlink CSI, a lightweight reconstruction scheme, which consists of an initial feature extraction of the downlink CSI with the simplified traditional method and a single hidden layer network, is utilized to reconstruct the downlink CSI with low processing delay. Compared with the 1-bit CS-based superimposed CSI feedback scheme, the proposed scheme improves the recovery accuracy of the UL-US and downlink CSI with lower processing delay and possesses robustness against parameter variations.

preprint2022arXiv

DeFTA: A Plug-and-Play Decentralized Replacement for FedAvg

Federated learning (FL) is identified as a crucial enabler for large-scale distributed machine learning (ML) without the need for local raw dataset sharing, substantially reducing privacy concerns and alleviating the isolated data problem. In reality, the prosperity of FL is largely due to a centralized framework called FedAvg, in which workers are in charge of model training and servers are in control of model aggregation. However, FedAvg's centralized worker-server architecture has raised new concerns, be it the low scalability of the cluster, the risk of data leakage, and the failure or even defection of the central server. To overcome these problems, we propose Decentralized Federated Trusted Averaging (DeFTA), a decentralized FL framework that serves as a plug-and-play replacement for FedAvg, instantly bringing better security, scalability, and fault-tolerance to the federated learning process after installation. In principle, it fundamentally resolves the above-mentioned issues from an architectural perspective without compromises or tradeoffs, primarily consisting of a new model aggregating formula with theoretical performance analysis, and a decentralized trust system (DTS) to greatly improve system robustness. Note that since DeFTA is an alternative to FedAvg at the framework level, \textit{prevalent algorithms published for FedAvg can be also utilized in DeFTA with ease}. Extensive experiments on six datasets and six basic models suggest that DeFTA not only has comparable performance with FedAvg in a more realistic setting, but also achieves great resilience even when 66% of workers are malicious. Furthermore, we also present an asynchronous variant of DeFTA to endow it with more powerful usability.

preprint2022arXiv

Fusion Learning for 1-Bit CS-based Superimposed CSI Feedback with Bi-Directional Channel Reciprocity

Due to the discarding of downlink channel state information (CSI) amplitude and the employing of iteration reconstruction algorithms, 1-bit compressed sensing (CS)-based superimposed CSI feedback is challenged by low recovery accuracy and large processing delay. To overcome these drawbacks, this letter proposes a fusion learning scheme by exploiting the bi-directional channel reciprocity. Specifically, a simplified version of the conventional downlink CSI reconstruction is utilized to extract the initial feature of downlink CSI, and a single hidden layer-based amplitude-learning network (AMPL-NET) is designed to learn the auxiliary feature of the downlink CSI amplitude. Then, based on the extracted and learned amplitude features, a simple but effective amplitude-fusion network (AMPF-NET) is developed to perform the amplitude fusion of downlink CSI and thus improves the reconstruction accuracy for 1-bit CS-based superimposed CSI feedback while reducing the processing delay. Simulation results show the effectiveness of the proposed feedback scheme and the robustness against parameter variations.

preprint2022arXiv

Second-Order Conic and Polyhedral Approximations of the Exponential Cone: Application to Mixed-Integer Exponential Conic Programs

Exponents and logarithms are fundamental components in many important applications such as logistic regression, maximum likelihood, relative entropy, and so on. Since the exponential cone can be viewed as the epigraph of perspective of the natural exponential function or the hypograph of perspective of the natural logarithm function, many mixed-integer convex programs involving exponential or logarithm functions can be recast as mixed-integer exponential conic programs (MIECPs). However, unlike mixed-integer linear programs (MILPs) and mixed-integer second-order conic programs (MISOCPs), MIECPs are still under development. To harvest the past efforts on MILPs and MISOCPs, this paper presents second-order conic (SOC) and polyhedral approximation schemes for the exponential cone with application to MIECPs. To do so, we first extend and generalize existing SOC approximation approaches in the extended space, propose new scaling and shifting methods, prove approximation accuracies, and derive lower bounds of approximations. We then study the polyhedral outer approximation of the exponential cones in the original space using gradient inequalities, show its approximation accuracy, and derive a lower bound of the approximation. When implementing SOC approximations, we suggest learning the approximation pattern by testing smaller cases and then applying it to the large-scale ones; and for the polyhedral approximation, we suggest using the branch and cut method for MIECPs. Our numerical study shows that the proposed methods show speed-ups over solver MOSEK for MIECPs, and the scaling, shifting, and polyhedral outer approximation methods work very well.

preprint2020arXiv

PSO-PS: Parameter Synchronization with Particle Swarm Optimization for Distributed Training of Deep Neural Networks

Parameter updating is an important stage in parallelism-based distributed deep learning. Synchronous methods are widely used in distributed training the Deep Neural Networks (DNNs). To reduce the communication and synchronization overhead of synchronous methods, decreasing the synchronization frequency (e.g., every $n$ mini-batches) is a straightforward approach. However, it often suffers from poor convergence. In this paper, we propose a new algorithm of integrating Particle Swarm Optimization (PSO) into the distributed training process of DNNs to automatically compute new parameters. In the proposed algorithm, a computing work is encoded by a particle, the weights of DNNs and the training loss are modeled by the particle attributes. At each synchronization stage, the weights are updated by PSO from the sub weights gathered from all workers, instead of averaging the weights or the gradients. To verify the performance of the proposed algorithm, the experiments are performed on two commonly used image classification benchmarks: MNIST and CIFAR10, and compared with the peer competitors at multiple different synchronization configurations. The experimental results demonstrate the competitiveness of the proposed algorithm.