Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
45 works
0 followers
22 topics
4 close collaborators

Published work

45 published item(s)

preprint 2026 arXiv

Federated Customization of Large Models: Approaches, Experiments, and Insights

In this article, we explore federated customization of large models and highlight the key challenges it poses within the federated learning framework. We review several popular large model customization techniques, including full fine-tuning, efficient fine-tuning, prompt engineering, prefix-tuning, knowledge distillation, and retrieval-augmented generation. Then, we discuss how these techniques can be implemented within the federated learning framework. Moreover, we conduct experiments on federated prefix-tuning, which, to the best of our knowledge, is the first trial to apply prefix-tuning in the federated learning setting. The conducted experiments validate its feasibility with performance close to centralized approaches. Further comparison with three other federated customization methods demonstrated its competitive performance, satisfactory efficiency, and consistent robustness.

preprint 2026 arXiv

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Recent advances in coding agents suggest rapid progress toward autonomous software development, yet existing benchmarks fail to rigorously evaluate the long-horizon capabilities required to build complete software systems. Most prior evaluations focus on localized code generation, scaffolded completion, or short-term repair tasks, leaving open the question of whether agents can sustain coherent reasoning, planning, and execution over the extended horizons demanded by real-world repository construction. To address this gap, we present NL2Repo Bench, a benchmark explicitly designed to evaluate the long-horizon repository generation ability of coding agents. Given only a single natural-language requirements document and an empty workspace, agents must autonomously design the architecture, manage dependencies, implement multi-module logic, and produce a fully installable Python library. Our experiments across state-of-the-art open- and closed-source models reveal that long-horizon repository generation remains largely unsolved: even the strongest agents achieve below 40% average test pass rates and rarely complete an entire repository correctly. Detailed analysis uncovers fundamental long-horizon failure modes, including premature termination, loss of global coherence, fragile cross-file dependencies, and inadequate planning over hundreds of interaction steps. NL2Repo Bench establishes a rigorous, verifiable testbed for measuring sustained agentic competence and highlights long-horizon reasoning as a central bottleneck for the next generation of autonomous coding agents.

preprint 2026 arXiv

SwarmFoam: An OpenFOAM Multi-Agent System Based on Multiple Types of Large Language Models

Numerical simulation is one of the mainstream methods in scientific research, typically performed by professional engineers. With the advancement of multi-agent technology, using collaborating agents to replicate human behavior shows immense potential for intelligent Computational Fluid Dynamics (CFD) simulations. Some multi-agent systems based on Large Language Models have been proposed. However, they exhibit significant limitations when dealing with complex geometries. This paper introduces a new multi-agent simulation framework, SwarmFoam. SwarmFoam integrates functionalities such as multi-modal perception, intelligent error correction, and Retrieval-Augmented Generation, aiming to achieve more complex simulations through dual parsing of images and high-level instructions. Experimental results demonstrate that SwarmFoam has good adaptability to simulation inputs from different modalities. The overall pass rate for 25 test cases was 84%, with natural language and multi-modal input cases achieving pass rates of 80% and 86.7%, respectively. The work presented by SwarmFoam will further promote the development of intelligent agent methods for CFD.

preprint 2026 arXiv

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Visual generative models have achieved remarkable progress in synthesizing photorealistic images and videos, yet aligning their outputs with human preferences across critical dimensions remains a persistent challenge. Though reinforcement learning from human feedback offers promise for preference alignment, existing reward models for visual generation face limitations, including black-box scoring without interpretability and the unexpected biases that may result. We present VisionReward, a general framework for learning human visual preferences in both image and video generation. Specifically, we employ a hierarchical visual assessment framework to capture fine-grained human preferences, and leverage linear weighting to enable interpretable preference learning. Furthermore, we propose a multi-dimensional consistent strategy when using VisionReward as a reward model during preference optimization for visual generation. Experiments show that VisionReward can significantly outperform existing image and video reward models on both machine metrics and human evaluation. Notably, VisionReward surpasses VideoScore by 17.2% in preference prediction accuracy, and text-to-video models with VisionReward achieve a 31.6% higher pairwise win rate compared to the same models using VideoScore. All code and datasets are provided at https://github.com/THUDM/VisionReward.

preprint 2022 arXiv

5G-Enabled Pseudonymity for Cooperative Intelligent Transportation System

Cooperative Intelligent Transportation Systems (C-ITS) enable communications between vehicles, road-side infrastructures, and road-users to improve users' safety and to efficiently manage traffic. Most, if not all, intelligent vehicle-to-everything (V2X) applications rely on continuous collection and sharing of sensitive information, such as detailed location information, which raises privacy concerns. In this light, a common approach to concealing the long-term identity of C-ITS vehicles is using multiple temporary identifiers, called pseudonyms. However, the legacy pseudonym management approach is prone to linking attacks. The introduction of 5G networks to V2X offers enhanced location accuracy, better clock synchronisation, improved modular service-based architecture, and enhanced security and privacy preservation controls. Motivated by the above enhancements, we study 5G-enabled pseudonyms for protecting vehicle identity privacy in C-ITS. We highlight the gaps in the current standards of pseudonym management. We further provide recommendations regarding the pseudonym management life-cycle.

preprint 2022 arXiv

A Roadmap for Big Model

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and the application of BMs in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies and Application. We introduce 16 specific BM-related topics in those four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability, Commonsense Reasoning, Reliability&Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. In each topic, we clearly summarize the current studies and propose some future research directions. At the end of this paper, we conclude the further development of BMs in a more general view.

preprint 2022 arXiv

A Wearable ECG Monitor for Deep Learning Based Real-Time Cardiovascular Disease Detection

Cardiovascular disease has become one of the most significant threats endangering human life and health. Recently, Electrocardiogram (ECG) monitoring has been transformed into remote cardiac monitoring by Holter surveillance. However, the widely used Holter can bring a great deal of discomfort and inconvenience to the individuals who carry it. We developed a new wireless ECG patch in this work and applied a deep learning framework based on the Convolutional Neural Network (CNN) and Long Short-term Memory (LSTM) models. However, we find that the models using the existing techniques are not able to differentiate two main heartbeat types (Supraventricular premature beat and Atrial fibrillation) in our newly obtained dataset, resulting in a low accuracy of 58.0%. We proposed a semi-supervised method to process the badly labelled data samples using confidence-level-based training. The experimental results show that the proposed method can approach an average accuracy of 90.2%, i.e., 5.4% higher than the accuracy of conventional ECG classification methods.

preprint 2022 arXiv

CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers

Large-scale pretrained transformers have created milestones in text (GPT-3) and text-to-image (DALL-E and CogView) generation. Their application to video generation still faces many challenges: the potentially huge computation cost makes training from scratch unaffordable, and the scarcity and weak relevance of text-video datasets hinder the model from understanding complex movement semantics. In this work, we present a 9B-parameter transformer, CogVideo, trained by inheriting a pretrained text-to-image model, CogView2. We also propose a multi-frame-rate hierarchical training strategy to better align text and video clips. As (probably) the first open-source large-scale pretrained text-to-video model, CogVideo outperforms all publicly available models by a large margin in machine and human evaluations.

preprint 2022 arXiv

CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers

The development of transformer-based text-to-image models is impeded by their slow generation and complexity for high-resolution images. In this work, we put forward a solution based on hierarchical transformers and local parallel auto-regressive generation. We pretrain a 6B-parameter transformer with a simple and flexible self-supervised task, Cross-modal general language model (CogLM), and finetune it for fast super-resolution. The new text-to-image system, CogView2, shows very competitive generation compared to the concurrent state-of-the-art DALL-E-2, and naturally supports interactive text-guided editing on images.

preprint 2022 arXiv

FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding

The few-shot natural language understanding (NLU) task has attracted much recent attention. However, prior methods have been evaluated under a disparate set of protocols, which hinders fair comparison and measuring progress of the field. To address this issue, we introduce an evaluation framework that improves previous evaluation procedures in three key aspects, i.e., test performance, dev-test correlation, and stability. Under this new evaluation framework, we re-evaluate several state-of-the-art few-shot methods for NLU tasks. Our framework reveals new insights: (1) both the absolute performance and relative gap of the methods were not accurately estimated in prior literature; (2) no single method dominates most tasks with consistent performance; (3) improvements of some methods diminish with a larger pretrained model; and (4) gains from different methods are often complementary and the best combined model performs close to a strong fully-supervised baseline. We open-source our toolkit, FewNLU, that implements our evaluation framework along with a number of state-of-the-art methods.

preprint 2022 arXiv

Generalized quantum cluster algebras: the Laurent phenomenon and upper bounds

Generalized quantum cluster algebras introduced in [1] are quantum deformations of generalized cluster algebras of geometric types. In this paper, we prove that the Laurent phenomenon holds in these generalized quantum cluster algebras. We also show that upper bounds coincide with the corresponding generalized quantum upper cluster algebras under the "coprimality" condition.

preprint 2022 arXiv

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

There have been various types of pretraining architectures including autoencoding models (e.g., BERT), autoregressive models (e.g., GPT), and encoder-decoder models (e.g., T5). However, none of the pretraining frameworks performs the best for all tasks of three main categories including natural language understanding (NLU), unconditional generation, and conditional generation. We propose a General Language Model (GLM) based on autoregressive blank infilling to address this challenge. GLM improves blank filling pretraining by adding 2D positional encodings and allowing an arbitrary order to predict spans, which results in performance gains over BERT and T5 on NLU tasks. Meanwhile, GLM can be pretrained for different types of tasks by varying the number and lengths of blanks. On a wide range of tasks across NLU, conditional and unconditional generation, GLM outperforms BERT, T5, and GPT given the same model sizes and data, and achieves the best performance from a single pretrained model with 1.25x parameters of BERT Large, demonstrating its generalizability to different downstream tasks.

preprint 2022 arXiv

Hardness Results for Laplacians of Simplicial Complexes via Sparse-Linear Equation Complete Gadgets

We study linear equations in combinatorial Laplacians of $k$-dimensional simplicial complexes ($k$-complexes), a natural generalization of graph Laplacians. Combinatorial Laplacians play a crucial role in homology and are a central tool in topology. Beyond this, they have various applications in data analysis and physical modeling problems. It is known that nearly-linear time solvers exist for graph Laplacians. However, nearly-linear time solvers for combinatorial Laplacians are only known for restricted classes of complexes. This paper shows that linear equations in combinatorial Laplacians of 2-complexes are as hard to solve as general linear equations. More precisely, for any constant $c \geq 1$, if we can solve linear equations in combinatorial Laplacians of 2-complexes up to high accuracy in time $\tilde{O}((\# \text{ of nonzero coefficients})^c)$, then we can solve general linear equations with polynomially bounded integer coefficients and condition numbers up to high accuracy in time $\tilde{O}((\# \text{ of nonzero coefficients})^c)$. We prove this by a nearly-linear time reduction from general linear equations to combinatorial Laplacians of 2-complexes. Our reduction preserves the sparsity of the problem instances up to poly-logarithmic factors.

preprint 2022 arXiv

Intelligent Blockchain-based Edge Computing via Deep Reinforcement Learning: Solutions and Challenges

The convergence of mobile edge computing (MEC) and blockchain is transforming the current computing services in wireless Internet-of-Things networks, by enabling task offloading with security enhancement based on blockchain mining. Yet the existing approaches for these enabling technologies are isolated, providing only tailored solutions for specific services and scenarios. To fill this gap, we propose a novel cooperative task offloading and blockchain mining (TOBM) scheme for a blockchain-based MEC system, where each edge device not only handles computation tasks but also deals with block mining for improving system utility. To address the latency issues caused by the blockchain operation in MEC, we develop a new Proof-of-Reputation consensus mechanism based on a lightweight block verification strategy. To accommodate the highly dynamic environment and high-dimensional system state space, we apply a novel distributed deep reinforcement learning-based approach by using a multi-agent deep deterministic policy gradient algorithm. Experimental results demonstrate the superior performance of the proposed TOBM scheme in terms of enhanced system reward, improved offloading utility with lower blockchain mining latency, and better system utility, compared to the existing cooperative and non-cooperative schemes. The paper concludes with key technical challenges and possible directions for future blockchain-based MEC research.

preprint 2022 arXiv

M6-UFC: Unifying Multi-Modal Controls for Conditional Image Synthesis via Non-Autoregressive Generative Transformers

Conditional image synthesis aims to create an image according to some multi-modal guidance in the forms of textual descriptions, reference images, and image blocks to preserve, as well as their combinations. In this paper, instead of investigating these control signals separately, we propose a new two-stage architecture, M6-UFC, to unify any number of multi-modal controls. In M6-UFC, both the diverse control signals and the synthesized image are uniformly represented as a sequence of discrete tokens to be processed by Transformer. Different from existing two-stage autoregressive approaches such as DALL-E and VQGAN, M6-UFC adopts non-autoregressive generation (NAR) at the second stage to enhance the holistic consistency of the synthesized image, to support preserving specified image blocks, and to improve the synthesis speed. Further, we design a progressive algorithm that iteratively improves the non-autoregressively generated image, with the help of two estimators developed for evaluating the compliance with the controls and evaluating the fidelity of the synthesized image, respectively. Extensive experiments on a newly collected large-scale clothing dataset M2C-Fashion and a facial dataset Multi-Modal CelebA-HQ verify that M6-UFC can synthesize high-fidelity images that comply with flexible multi-modal controls.

preprint 2022 arXiv

Negative-ResNet: Noisy Ambulatory Electrocardiogram Signal Classification Scheme

With recently successful applications of deep learning in computer vision and general signal processing, deep learning has shown many unique advantages in medical signal processing. However, data labelling quality has become one of the most significant issues for AI applications, especially when it requires domain knowledge (e.g. medical image labelling). In addition, there might be noisy labels in practical datasets, which might impair the training process of neural networks. In this work, we propose a semi-supervised algorithm for training data samples with noisy labels by performing selected Positive Learning (PL) and Negative Learning (NL). To verify the effectiveness of the proposed scheme, we designed a portable ECG patch -- iRealCare -- and applied the algorithm on a real-life dataset. Our experimental results show that we can achieve an accuracy of 91.0 %, which is 6.2 % higher than a normal training process with ResNet. There are 65 patients in our dataset and we randomly picked 2 patients to perform validation.

preprint 2022 arXiv

Path Planning for the Dynamic UAV-Aided Wireless Systems using Monte Carlo Tree Search

For UAV-aided wireless systems, online path planning has attracted much attention recently. To better adapt to the real-time dynamic environment, we, for the first time, propose a Monte Carlo Tree Search (MCTS)-based path planning scheme. In detail, we consider a single UAV acting as a mobile server that provides computation task offloading services for a set of mobile users on the ground, where the movement of ground users follows a Random Way Point model. Our model aims at maximizing the average throughput under energy consumption and user fairness constraints, and the proposed timesaving MCTS algorithm can further improve the performance. Simulation results show that the proposed algorithm achieves a larger average throughput and a faster convergence performance compared with the baseline algorithms of Q-learning and Deep Q-Network.

preprint 2022 arXiv

Rethinking the Setting of Semi-supervised Learning on Graphs

We argue that the present setting of semisupervised learning on graphs may result in unfair comparisons, due to its potential risk of over-tuning hyper-parameters for models. In this paper, we highlight the significant influence of tuning hyper-parameters, which leverages the label information in the validation set to improve the performance. To explore the limit of over-tuning hyperparameters, we propose ValidUtil, an approach to fully utilize the label information in the validation set through an extra group of hyper-parameters. With ValidUtil, even GCN can easily get high accuracy of 85.8% on Cora. To avoid over-tuning, we merge the training set and the validation set and construct an i.i.d. graph benchmark (IGB) consisting of 4 datasets. Each dataset contains 100 i.i.d. graphs sampled from a large graph to reduce the evaluation variance. Our experiments suggest that IGB is a more stable benchmark than previous datasets for semisupervised learning on graphs.

preprint 2022 arXiv

Two-Commodity Flow is Equivalent to Linear Programming under Nearly-Linear Time Reductions

We give a nearly-linear time reduction that encodes any linear program as a 2-commodity flow problem with only a small blow-up in size. Under mild assumptions similar to those employed by modern fast solvers for linear programs, our reduction causes only a polylogarithmic multiplicative increase in the size of the program and runs in nearly-linear time. Our reduction applies to high-accuracy approximation algorithms and exact algorithms. Given an approximate solution to the 2-commodity flow problem, we can extract a solution to the linear program in linear time with only a polynomial factor increase in the error. This implies that any algorithm that solves the 2-commodity flow problem can solve linear programs in essentially the same time. Given a directed graph with edge capacities and two source-sink pairs, the goal of the 2-commodity flow problem is to maximize the sum of the flows routed between the two source-sink pairs subject to edge capacities and flow conservation. A 2-commodity flow can be directly written as a linear program, and thus we establish a nearly-tight equivalence between these two classes of problems. Our proof follows the outline of Itai's polynomial-time reduction of a linear program to a 2-commodity flow problem (JACM'78). Itai's reduction shows that exactly solving 2-commodity flow and exactly solving linear programming are polynomial-time equivalent. We improve Itai's reduction to nearly preserve the problem representation size in each step. In addition, we establish an error bound for approximately solving each intermediate problem in the reduction, and show that the accumulated error is polynomially bounded. We remark that our reduction does not run in strongly polynomial time and that it is open whether 2-commodity flow and linear programming are equivalent in strongly polynomial time.

preprint 2021 arXiv

Analyzing the Impact of Molecular Re-Radiation on the MIMO Capacity in High-Frequency Bands

In this paper, we show how the absorption and re-radiation energy from molecules in the air can influence the Multiple Input Multiple Output (MIMO) performance in high-frequency bands, e.g., millimeter wave (mmWave) and terahertz. In more detail, some common atmosphere molecules, such as oxygen and water, can absorb and re-radiate energy in their natural resonance frequencies, such as 60 GHz, 180 GHz and 320 GHz. Hence, when hit by electromagnetic waves, molecules will get excited and absorb energy, which leads to an extra path loss and is known as molecular attenuation. Meanwhile, the absorbed energy will be re-radiated towards a random direction with a random phase. These re-radiated waves also interfere with the signal transmission. Although molecular re-radiation was mostly considered as noise in the literature, recent works show that it is correlated to the main signal and can be viewed as a composition of multiple delayed or scattered signals. Such a phenomenon can provide non-line-of-sight (NLoS) paths in an environment that lacks scatterers, which increases spatial multiplexing and thus greatly enhances the performance of MIMO systems. Therefore, in this paper, we explore the scattering model and noise models of molecular re-radiation to characterize the channel transfer function of the NLoS channels created by atmosphere molecules. Our simulation results show that the re-radiation can increase MIMO capacity up to 3-fold in mmWave and 6-fold in terahertz for a set of realistic transmit power, distance, and antenna numbers. We also show that in the high-SNR regime, the re-radiation makes open-loop precoding viable, which is an alternative to beamforming to avoid beam alignment sensitivity in high mobility applications.

preprint 2021 arXiv

Are we really making much progress? Revisiting, benchmarking, and refining heterogeneous graph neural networks

Heterogeneous graph neural networks (HGNNs) have been blossoming in recent years, but the unique data processing and evaluation setups used by each work obstruct a full understanding of their advancements. In this work, we present a systematical reproduction of 12 recent HGNNs by using their official codes, datasets, settings, and hyperparameters, revealing surprising findings about the progress of HGNNs. We find that the simple homogeneous GNNs, e.g., GCN and GAT, are largely underestimated due to improper settings. GAT with proper inputs can generally match or outperform all existing HGNNs across various scenarios. To facilitate robust and reproducible HGNN research, we construct the Heterogeneous Graph Benchmark (HGB), consisting of 11 diverse datasets with three tasks. HGB standardizes the process of heterogeneous graph data splits, feature processing, and performance evaluation. Finally, we introduce a simple but very strong baseline Simple-HGN--which significantly outperforms all previous models on HGB--to accelerate the advancement of HGNNs in the future.

preprint 2021 arXiv

Covert Model Poisoning Against Federated Learning: Algorithm Design and Optimization

Federated learning (FL), as a type of distributed machine learning framework, is vulnerable to external attacks on FL models during parameter transmission. An attacker in FL may control a number of participant clients, and purposely craft the uploaded model parameters to manipulate system outputs, namely, model poisoning (MP). In this paper, we aim to propose effective MP algorithms to combat state-of-the-art defensive aggregation mechanisms (e.g., Krum and Trimmed mean) implemented at the server without being noticed, i.e., covert MP (CMP). Specifically, we first formulate the MP as an optimization problem by minimizing the Euclidean distance between the manipulated model and the designated one, constrained by a defensive aggregation rule. Then, we develop CMP algorithms against different defensive mechanisms based on the solutions of their corresponding optimization problems. Furthermore, to reduce the optimization complexity, we propose low-complexity CMP algorithms with a slight performance degradation. In the case that the attacker does not know the defensive aggregation mechanism, we design a blind CMP algorithm, in which the manipulated model will be adjusted properly according to the aggregated model generated by the unknown defensive aggregation. Our experimental results demonstrate that the proposed CMP algorithms are effective and substantially outperform existing attack mechanisms.

preprint 2021 arXiv

Federated Learning for COVID-19 Detection with Generative Adversarial Networks in Edge Cloud Computing

COVID-19 has spread rapidly across the globe and become a deadly pandemic. Recently, many artificial intelligence-based approaches have been used for COVID-19 detection, but they often require public data sharing with cloud datacentres and thus raise privacy concerns. This paper proposes a new federated learning scheme, called FedGAN, to generate realistic COVID-19 images for facilitating privacy-enhanced COVID-19 detection with generative adversarial networks (GANs) in edge cloud computing. In particular, we first propose a GAN where a discriminator and a generator based on convolutional neural networks (CNNs) at each edge-based medical institution are trained alternately to mimic the real COVID-19 data distribution. Then, we propose a new federated learning solution which allows local GANs to collaborate and exchange learned parameters with a cloud server, aiming to enrich the global GAN model for generating realistic COVID-19 images without the need for sharing actual data. To enhance the privacy in federated COVID-19 data analytics, we integrate a differential privacy solution at each hospital institution. Moreover, we propose a new blockchain-based FedGAN framework for secure COVID-19 data analytics, by decentralizing the FL process with a new mining solution for low running latency. Simulation results demonstrate the superiority of our approach for COVID-19 detection over the state-of-the-art schemes.

preprint 2021 arXiv

IdentityDP: Differential Private Identification Protection for Face Images

Because of the explosive growth of face photos as well as their widespread dissemination and easy accessibility in social media, the security and privacy of personal identity information has become an unprecedented challenge. Meanwhile, the convenience brought by advanced identity-agnostic computer vision technologies is attractive. Therefore, it is important to use face images while taking careful consideration in protecting people's identities. Given a face image, face de-identification, also known as face anonymization, refers to generating another image with similar appearance and the same background, while the real identity is hidden. Although extensive efforts have been made, existing face de-identification techniques are either insufficient in photo-reality or incapable of balancing privacy and utility well. In this paper, we focus on tackling these challenges to improve face de-identification. We propose IdentityDP, a face anonymization framework that combines a data-driven deep neural network with a differential privacy (DP) mechanism. This framework encompasses three stages: facial representations disentanglement, $ε$-IdentityDP perturbation and image reconstruction. Our model can effectively obfuscate the identity-related information of faces, preserve significant visual similarity, and generate high-quality images that can be used for identity-agnostic computer vision tasks, such as detection, tracking, etc. Unlike previous methods, we can adjust the balance of privacy and utility through the privacy budget according to practical demands and provide a diversity of results without pre-annotations. Extensive experiments demonstrate the effectiveness and generalization ability of our proposed anonymization framework.

preprint 2021 arXiv

User-Level Privacy-Preserving Federated Learning: Analysis and Performance Optimization

Federated learning (FL), as a type of collaborative machine learning framework, is capable of preserving private data from mobile terminals (MTs) while training the data into useful models. Nevertheless, from a viewpoint of information theory, it is still possible for a curious server to infer private information from the shared models uploaded by MTs. To address this problem, we first make use of the concept of local differential privacy (LDP), and propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers. According to our analysis, the UDP framework can realize $(ε_{i}, δ_{i})$-LDP for the $i$-th MT with adjustable privacy protection levels by varying the variances of the artificial noise processes. We then derive a theoretical convergence upper-bound for the UDP algorithm. It reveals that there exists an optimal number of communication rounds to achieve the best learning performance. More importantly, we propose a communication rounds discounting (CRD) method. Compared with the heuristic search method, the proposed CRD method can achieve a much better trade-off between the computational complexity of searching and the convergence performance. Extensive experiments indicate that our UDP algorithm using the proposed CRD method can effectively improve both the training efficiency and model quality for the given privacy protection levels.
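
The idea of adding artificial noise to a shared model before upload can be sketched as follows. This is a minimal illustration and not the UDP algorithm from the paper: the abstract does not give the noise calibration, so the sketch assumes the classic Gaussian mechanism (valid for 0 < ε < 1), L2-norm clipping of the weights, and a sensitivity of twice the clipping norm. The function names and all parameters are hypothetical.

```python
import math
import random


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise standard deviation of the classic Gaussian mechanism.

    Assumption: this textbook calibration stands in for the paper's own
    analysis, which the abstract does not spell out.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("classic Gaussian mechanism needs 0 < epsilon, delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def clip_by_l2_norm(weights, clip_norm):
    """Scale the weight vector so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(w * w for w in weights))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [w * scale for w in weights]


def perturb_before_upload(weights, epsilon, delta, clip_norm, rng):
    """Clip the local model, then add i.i.d. Gaussian noise to every weight."""
    clipped = clip_by_l2_norm(weights, clip_norm)
    # Assumption: two clipped models differ by at most 2 * clip_norm in L2 norm.
    sigma = gaussian_sigma(epsilon, delta, sensitivity=2.0 * clip_norm)
    return [w + rng.gauss(0.0, sigma) for w in clipped]


if __name__ == "__main__":
    rng = random.Random(0)
    local_model = [3.0, 4.0]  # toy two-parameter "model"
    print(perturb_before_upload(local_model, epsilon=0.5, delta=1e-5,
                                clip_norm=1.0, rng=rng))
```

A smaller ε yields a larger noise standard deviation, which is the adjustable privacy level the abstract refers to.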

preprint2020arXiv

Control of Walking Assist Exoskeleton with Time-delay Based on the Prediction of Plantar Force

Many kinds of lower-limb exoskeletons have been developed for walking assistance. However, when controlling these exoskeletons, time-delay due to the computation time and the communication delays is still a general problem. In this research, we propose a novel method to prevent the time-delay when controlling a walking assist exoskeleton by predicting the future plantar force and walking status. By using Long Short-Term Memory and a fully-connected network, the plantar force can be predicted using only data measured by inertial measurement unit sensors, not only during the walking period but also at the start and end of walking. From the predicted plantar force, the walking status and the desired assistance timing can also be determined. By considering the time-delay and sending the control commands beforehand, the exoskeleton can be moved precisely at the desired assistance timing. In experiments, the prediction accuracy of the plantar force and the assistance timing are confirmed. The performance of the proposed method is also evaluated by using the trained model to control the exoskeleton.

preprint2020arXiv

DroneCells: Improving 5G Spectral Efficiency using Drone-mounted Flying Base Stations

We study a cellular networking scenario, called DroneCells, where miniaturized base stations (BSs) are mounted on flying drones to serve mobile users. We propose that the drones never stop, and move continuously within the cell in a way that reduces the distance between the BS and the serving users, thus potentially improving the spectral efficiency of the network. By considering the practical mobility constraints of commercial drones, we design drone mobility algorithms to improve the spectral efficiency of DroneCells. As the optimal problem is NP-hard, we propose a range of practically realizable heuristics with varying complexity and performance. Simulations show that, using the existing consumer drones, the proposed algorithms can readily improve spectral efficiency by 34% and the 5-percentile packet throughput by 50% compared to the scenario where drones hover over fixed locations. More significant gains can be expected with more agile drones in the future. A surprising outcome is that the drones need to fly only at minimal speeds to achieve these gains, avoiding any negative effect on drone battery lifetime. We further demonstrate that the optimal solution provides only modest improvements over the best heuristic algorithm, which employs Game Theory to make mobility decisions for drone BSs.

preprint2020arXiv

Enhancements of the 3GPP LTE-Advanced and the Prized Asset: Dynamic TDD Transmissions

In this paper, we perform a survey on new Third Generation Partnership Project (3GPP) Long Term Evolution-Advanced (LTE-Advanced) enhancements, covering the technologies recently adopted by the 3GPP in LTE Release 11 and those being discussed in LTE Release 12. In more detail, we introduce the latest enhancements on carrier aggregation (CA), multiple-input multiple-output (MIMO) and coordinated multi-point (CoMP) as well as three-dimensional (3D) MIMO. Moreover, considering that network nodes will become very diverse in the future, and thus with heterogeneous network (HetNet) being a key feature of LTE-Advanced networks, we also discuss technologies of interest in HetNet scenarios, e.g., enhanced physical data control channel (ePDCCH), further enhanced inter-cell interference coordination (FeICIC) and small cells, together with energy efficiency concerns. In particular, we pay special attention to one of the most important enhancements in LTE Release 12, i.e., dynamic time division duplex (TDD) transmissions, and present performance results that shed new light on this topic.

preprint2020arXiv

Enhancing Cellular Communications for UAVs via Intelligent Reflective Surface

Intelligent reflective surfaces (IRSs) capable of reconfiguring their electromagnetic absorption and reflection properties in real-time are offering unprecedented opportunities to enhance wireless communication experience in challenging environments. In this paper, we analyze the potential of IRS in enhancing cellular communications for UAVs, which currently suffer from poor signal strength due to the down-tilt of base station antennas optimized to serve ground users. We consider deployment of IRS on building walls, which can be remotely configured by cellular base stations to coherently direct the reflected radio waves towards specific UAVs in order to increase their received signal strengths. Using the recently released 3GPP ground-to-air channel models, we analyze the signal gains at UAVs due to the IRS deployments as a function of UAV height as well as various IRS parameters including size, altitude, and distance from base station. Our analysis suggests that even with a small IRS, we can achieve significant signal gain for UAVs flying above the cellular base station. We also find that the maximum gain can be achieved by optimizing the location of IRS including its altitude and distance to BS.

preprint2020arXiv

GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training

Graph representation learning has emerged as a powerful technique for addressing real-world problems. Various downstream graph learning tasks have benefited from its recent developments, such as node classification, similarity search, and graph classification. However, prior work on graph representation learning focuses on domain-specific problems and trains a dedicated model for each graph dataset, which is usually non-transferable to out-of-domain data. Inspired by the recent advances in pre-training from natural language processing and computer vision, we design Graph Contrastive Coding (GCC) -- a self-supervised graph neural network pre-training framework -- to capture the universal network topological properties across multiple networks. We design GCC's pre-training task as subgraph instance discrimination in and across networks and leverage contrastive learning to empower graph neural networks to learn the intrinsic and transferable structural representations. We conduct extensive experiments on three graph learning tasks and ten graph datasets. The results show that GCC pre-trained on a collection of diverse datasets can achieve performance competitive with or better than that of its task-specific and trained-from-scratch counterparts. This suggests that the pre-training and fine-tuning paradigm presents great potential for graph representation learning.

preprint2020arXiv

Integration of Blockchain and Cloud of Things: Architecture, Applications and Challenges

The blockchain technology is taking the world by storm. Blockchain with its decentralized, transparent and secure nature has emerged as a disruptive technology for the next generation of numerous industrial applications. One of them is Cloud of Things enabled by the combination of cloud computing and Internet of Things. In this context, blockchain provides innovative solutions to address challenges in Cloud of Things in terms of decentralization, data privacy and network security, while Cloud of Things offers elasticity and scalability functionalities to improve the efficiency of blockchain operations. Therefore, a novel paradigm of blockchain and Cloud of Things integration, called BCoT, has been widely regarded as a promising enabler for a wide range of application scenarios. In this paper, we present a state-of-the-art review on the BCoT integration to provide general readers with an overview of the BCoT in various aspects, including background knowledge, motivation, and integrated architecture. Particularly, we also provide an in-depth survey of BCoT applications in different use-case domains such as smart healthcare, smart city, smart transportation and smart industry. Then, we review the recent BCoT developments with the emerging blockchain and cloud platforms, services, and research projects. Finally, some important research challenges and future directions are highlighted to spur further research in this promising area.

preprint2020arXiv

Leakage-Based Robust Beamforming for Multi-Antenna Broadcast System with Per-Antenna Power Constraints and Quantized CDI

In this paper, we investigate the robust beamforming schemes for a multi-user multiple-input-single-output (MU-MISO) system with per-antenna power constraints and quantized channel direction information (CDI) feedback. Our design objective is to maximize the expectation of the weighted sum-rate performance by means of controlling the interference leakage and properly allocating the power among user equipments (UEs). First, we prove the optimality of the non-robust zero-forcing (ZF) beamforming scheme in the sense of generating the minimum amount of average inter-UE interference under quantized CDI. Then we derive closed-form expressions of the cumulative distribution function (CDF) of the interference leakage power for the non-robust ZF beamforming scheme, based on which we adjust the leakage thresholds and propose two robust beamforming schemes under per-antenna power constraints with an iterative process to update the per-UE power allocations using geometric programming (GP). Simulation results show the superiority of the proposed robust beamforming schemes compared with the existing schemes in terms of the average weighted sum-rate performance.

preprint2020arXiv

MMSE Based Greedy Antenna Selection Scheme for AF MIMO Relay Systems

We propose a greedy minimum mean squared error (MMSE)-based antenna selection algorithm for amplify-and-forward (AF) multiple-input multiple-output (MIMO) relay systems. Assuming equal-power allocation across the multi-stream data, we derive a closed-form expression for the mean squared error (MSE) resulting from adding each additional antenna pair. Based on this result, we iteratively select the antenna pairs at the relay nodes to minimize the MSE. Simulation results show that our algorithm greatly outperforms the existing schemes.

preprint2020arXiv

On Dynamic Time Division Duplex Transmissions for Small Cell Networks

Motivated by the promising benefits of dynamic Time Division Duplex (TDD), in this paper, we use a unified framework to investigate both the technical issues of applying dynamic TDD in homogeneous small cell networks (HomSCNs), and the feasibility of introducing dynamic TDD into heterogeneous networks (HetNets). First, HomSCNs are analyzed, and a small cell BS scheduler that dynamically and independently schedules DL and UL subframes is presented, such that load balancing between the DL and the UL traffic can be achieved. Moreover, the effectiveness of various inter-link interference mitigation (ILIM) schemes as well as their combinations, is systematically investigated and compared. Besides, the interesting possibility of partial interference cancellation (IC) is also explored. Second, based on the proposed schemes, the joint operation of dynamic TDD together with cell range expansion (CRE) and almost blank subframe (ABS) in HetNets is studied. In this regard, scheduling policies in small cells and an algorithm to derive the appropriate macrocell traffic off-load and ABS duty cycle under dynamic TDD operation are proposed. Moreover, the full IC and the partial IC schemes are investigated for dynamic TDD in HetNets. The user equipment (UE) packet throughput performance of the proposed/discussed schemes is benchmarked using system-level simulations.

preprint2020arXiv

On Safeguarding Privacy and Security in the Framework of Federated Learning

Motivated by the advancing computational capacity of wireless end-user equipment (UE), as well as the increasing concerns about sharing private data, a new machine learning (ML) paradigm has emerged, namely federated learning (FL). Specifically, FL allows a decoupling of data provision at UEs and ML model aggregation at a central unit. By training models locally, FL is capable of avoiding data leakage from the UEs, thereby preserving privacy and security to some extent. However, even if raw data are not disclosed from UEs, individuals' private information can still be extracted by some recently discovered attacks in the FL architecture. In this work, we analyze the privacy and security issues in FL, and raise several challenges on preserving privacy and security when designing FL systems. In addition, we provide extensive simulation results to illustrate the discussed issues and possible solutions.

preprint2020arXiv

Privacy-Preserved Task Offloading in Mobile Blockchain with Deep Reinforcement Learning

Blockchain technology with its secure, transparent and decentralized nature has been recently employed in many mobile applications. However, the mining process in mobile blockchain requires high computational and storage capability of mobile devices, which would hinder blockchain applications in mobile systems. To meet this challenge, we propose a mobile edge computing (MEC) based blockchain network where multiple mobile users (MUs) act as miners to offload their mining tasks to a nearby MEC server via wireless channels. Specifically, we formulate task offloading and user privacy preservation as a joint optimization problem which is modelled as a Markov decision process, where our objective is to minimize the long-term system offloading costs and maximize the privacy levels for all blockchain users. We first propose a reinforcement learning (RL)-based offloading scheme which enables MUs to make optimal offloading decisions based on blockchain transaction states and wireless channel qualities between MUs and MEC server. To further improve the offloading performance for larger-scale blockchain scenarios, we then develop a deep RL algorithm by using deep Q-network which can efficiently solve large state space without any prior knowledge of the system dynamics. Simulation results show that the proposed RL-based offloading schemes significantly enhance user privacy, and reduce the energy consumption as well as computation latency with minimum offloading costs in comparison with the benchmark offloading schemes.

preprint2020arXiv

Probabilistic Caching for Small-Cell Networks with Terrestrial and Aerial Users

The support for aerial users has become the focus of recent 3GPP standardizations of 5G, due to their high maneuverability and flexibility for on-demand deployment. In this paper, probabilistic caching is studied for ultra-dense small-cell networks with terrestrial and aerial users, where a dynamic on-off architecture is adopted under a sophisticated path loss model incorporating both line-of-sight and non-line-of-sight transmissions. Generally, this paper focuses on the successful download probability (SDP) of user equipments (UEs) from small-cell base stations (SBSs) that cache the requested files under various caching strategies. To be more specific, the SDP is first analyzed using stochastic geometry theory, by considering the distribution of such two-tier UEs and SBSs as Homogeneous Poisson Point Processes. Second, an optimized caching strategy (OCS) is proposed to maximize the average SDP. Third, the performance limits of the average SDP are developed for the popular caching strategy (PCS) and the uniform caching strategy (UCS). Finally, the impacts of the key parameters, such as the SBS density, the cache size, the exponent of Zipf distribution and the height of aerial user, are investigated on the average SDP. The analytical results indicate that the UCS outperforms the PCS if the SBSs are sufficiently dense, while the PCS is better than the UCS if the exponent of Zipf distribution is large enough. Furthermore, the proposed OCS is superior to both the UCS and PCS.
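
The qualitative comparison between the popular and uniform caching strategies can be illustrated with a toy cache-hit calculation. This is only a sketch under strong simplifying assumptions and not the paper's stochastic-geometry analysis: it ignores path loss, interference, and the on-off architecture, and simply asks whether at least one of a fixed number of reachable SBSs holds the requested file. The function names and the numbers in the example are hypothetical.

```python
def zipf_popularity(num_files, exponent):
    """Request probability of each file, most popular first (Zipf law)."""
    weights = [rank ** -exponent for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def hit_prob_pcs(popularity, cache_size):
    """Popular caching: every SBS stores the same cache_size most popular files,
    so extra SBSs add no new content and the hit probability is the
    popularity mass of those files."""
    return sum(popularity[:cache_size])


def hit_prob_ucs(num_files, cache_size, num_sbs):
    """Uniform caching: each SBS independently holds any given file with
    probability cache_size / num_files; a hit needs at least one of the
    num_sbs reachable SBSs to hold the requested file."""
    miss_one = 1.0 - cache_size / num_files
    return 1.0 - miss_one ** num_sbs


if __name__ == "__main__":
    num_files, cache_size = 100, 10
    for exponent in (0.8, 2.0):
        popularity = zipf_popularity(num_files, exponent)
        print("Zipf exponent", exponent,
              "PCS hit probability", round(hit_prob_pcs(popularity, cache_size), 3))
    for num_sbs in (1, 30):
        print("reachable SBSs", num_sbs,
              "UCS hit probability", round(hit_prob_ucs(num_files, cache_size, num_sbs), 3))
```

In this toy model the uniform strategy benefits from content diversity as more SBSs become reachable, while the popular strategy benefits from a larger Zipf exponent, which matches the direction of the trends stated in the abstract.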

preprint2020arXiv

RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network

Generative adversarial networks (GANs) have attracted increasing attention recently owing to their impressive ability to generate realistic samples with high privacy protection. Without directly interacting with training examples, the generative model can be fully used to estimate the underlying distribution of an original dataset while the discriminative model can examine the quality of the generated samples by comparing the label values with the training examples. However, when GANs are applied on sensitive or private training examples, such as medical or financial records, it is still probable to divulge individuals' sensitive and private information. To mitigate this information leakage and construct a private GAN, in this work we propose a Rényi-differentially private-GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training. Moreover, we derive the analytical results of the total privacy loss under the subsampling method and cumulated iterations, which show its effectiveness on the privacy budget allocation. In addition, in order to mitigate the negative impact brought by the injected noise, we enhance the proposed algorithm by adding an adaptive noise tuning step, which will change the volume of added noise according to the testing accuracy. Through extensive experimental results, we verify that the proposed algorithm can achieve a better privacy level while producing high-quality samples compared with a benchmark DP-GAN scheme based on noise perturbation on training gradients.

preprint2020arXiv

Sequential and Incremental Precoder Design for Joint Transmission Network MIMO Systems with Imperfect Backhaul

In this paper, we propose a sequential and incremental precoder design for downlink joint transmission (JT) network MIMO systems with imperfect backhaul links. The objective of our design is to minimize the maximum of the sub-stream mean square errors (MSE), which dominates the average bit error rate (BER) performance of the system. In the proposed scheme, we first optimize the precoder at the serving base station (BS), and then sequentially optimize the precoders of non-serving BSs in the JT set according to the descending order of their probabilities of participating in JT. The BS-wise sequential optimization process can improve the system performance when some BSs have to temporarily quit the JT operations because of poor instant backhaul conditions. Besides, the precoder of an additional BS is derived in an incremental way, i.e., the sequentially optimized precoders of previous BSs are fixed, thus the additional precoder plays an incremental part in the multi-BS JT operations. An iterative algorithm is designed to jointly optimize the sub-stream precoder and sub-stream power allocation for each additional BS in the proposed sequential and incremental optimization scheme. Simulations show that, under the practical backhaul link conditions, our scheme significantly outperforms the autonomous global precoding (AGP) scheme in terms of BER performance.

preprint2020arXiv

Skin-MIMO: Vibration-based MIMO Communication over Human Skin

We explore the feasibility of Multiple-Input-Multiple-Output (MIMO) communication through vibrations over human skin. Using off-the-shelf motors and piezo transducers as vibration transmitters and receivers, respectively, we build a 2x2 MIMO testbed to collect and analyze vibration signals from real subjects. Our analysis reveals that there exist multiple independent vibration channels between a pair of transmitter and receiver, confirming the feasibility of MIMO. Unfortunately, the slow ramping of mechanical motors and rapidly changing skin channels make it impractical for conventional channel sounding based channel state information (CSI) acquisition, which is critical for achieving MIMO capacity gains. To solve this problem, we propose Skin-MIMO, a deep learning based CSI acquisition technique to accurately predict CSI entirely based on inertial sensor (accelerometer and gyroscope) measurements at the transmitter, thus obviating the need for channel sounding. Based on experimental vibration data, we show that Skin-MIMO can improve MIMO capacity by a factor of 2.3 compared to Single-Input-Single-Output (SISO) or open-loop MIMO, which do not have access to CSI. A surprising finding is that the gyroscope, which measures the angular velocity, is superior in predicting skin vibrations to the accelerometer, which measures linear acceleration and is used widely in previous research for vibration communications over solid objects.

preprint2020arXiv

Spectrum Intelligent Radio: Technology, Development, and Future Trends

The advent of Industry 4.0 with massive connectivity places significant strains on the current spectrum resources, and challenges the industry and regulators to respond promptly with new disruptive spectrum management strategies. The current radio development, with certain elements of intelligence, is nowhere near showing an agile response to the complex radio environments. Following the line of intelligence, we propose to classify spectrum intelligent radio into three streams: classical signal processing, machine learning (ML), and contextual adaptation. We focus on the ML approach, and propose a new intelligent radio architecture with three hierarchical forms: perception, understanding, and reasoning. The proposed perception method achieves fully blind multi-level spectrum sensing. The understanding method accurately predicts the primary users' coverage across a large area, and the reasoning method performs a near-optimal idle channel selection. Opportunities, challenges, and future visions are also discussed for the realization of a fully intelligent radio.

preprint2020arXiv

Task Offloading for Large-Scale Asynchronous Mobile Edge Computing: An Index Policy Approach

Mobile-edge computing (MEC) offloads computational tasks from wireless devices to network edge, and enables real-time information transmission and computing. Most existing work concerns a small-scale synchronous MEC system. In this paper, we focus on a large-scale asynchronous MEC system with random task arrivals, distinct workloads, and diverse deadlines. We formulate the offloading policy design as a restless multi-armed bandit (RMAB) to maximize the total discounted reward over the time horizon. However, the formulated RMAB is related to a PSPACE-hard sequential decision-making problem, which is intractable. To address this issue, by exploiting the Whittle index (WI) theory, we rigorously establish the WI indexability and derive a scalable closed-form solution. Consequently, in our WI policy, each user only needs to calculate its WI and report it to the BS, and the users with the highest indices are selected for task offloading. Furthermore, when the task completion ratio becomes the focus, the shorter slack time less remaining workload (STLW) priority rule is introduced into the WI policy for performance improvement. When the knowledge of user offloading energy consumption is not available prior to the offloading, we develop Bayesian learning-enabled WI policies, including maximum likelihood estimation, Bayesian learning with conjugate prior, and prior-swapping techniques. Simulation results show that the proposed policies significantly outperform the other existing policies.
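
The selection step of an index policy, in which each user reports a single index and the BS picks the users with the highest values, can be sketched as below. The closed-form Whittle index itself is not given in the abstract, so the index values here are placeholder numbers, and the function name and the `num_slots` parameter are hypothetical.

```python
import heapq


def select_users_for_offloading(reported_indices, num_slots):
    """Return the IDs of the num_slots users with the largest reported indices.

    reported_indices maps a user ID to the index value that user computed
    locally and reported to the BS.
    """
    ranked = heapq.nlargest(num_slots, reported_indices.items(),
                            key=lambda item: item[1])
    return [user_id for user_id, _ in ranked]


if __name__ == "__main__":
    # Placeholder index values, not derived from any real workload.
    reports = {"u1": 0.2, "u2": 1.5, "u3": 0.9, "u4": -0.3}
    print(select_users_for_offloading(reports, num_slots=2))
```

The BS only needs one scalar per user for this step, which is what makes the policy scale to a large number of users.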

preprint2020arXiv

Understanding Negative Sampling in Graph Representation Learning

Graph representation learning has been extensively studied in recent years. Despite its potential in generating continuous embeddings for various networks, both the effectiveness and efficiency of inferring high-quality representations for a large corpus of nodes are still challenging. Sampling is a critical point to achieve the performance goals. Prior work usually focuses on sampling positive node pairs, while the strategy for negative sampling is left insufficiently explored. To bridge the gap, we systematically analyze the role of negative sampling from the perspectives of both objective and risk, theoretically demonstrating that negative sampling is as important as positive sampling in determining the optimization objective and the resulting variance. To the best of our knowledge, we are the first to derive the theory and quantify that the negative sampling distribution should be positively but sub-linearly correlated to their positive sampling distribution. With the guidance of the theory, we propose MCNS, approximating the positive distribution with self-contrast approximation and accelerating negative sampling by Metropolis-Hastings. We evaluate our method on 5 datasets that cover extensive downstream graph learning tasks, including link prediction, node classification and personalized recommendation, on a total of 19 experimental settings. These relatively comprehensive experimental results demonstrate its robustness and superiority.

preprint2019arXiv

Blockchain as a Service for Multi-Access Edge Computing: A Deep Reinforcement Learning Approach

Recently, blockchain has gained momentum in the academic community thanks to its decentralization, immutability, transparency and security. As an emerging paradigm, Multi-access Edge Computing (MEC) has been widely used to provide computation and storage resources to mobile user equipments (UEs) at the edge of the network for improving the performance of mobile applications. In this paper, we propose a novel blockchain-based MEC architecture where UEs can offload their computation tasks to the MEC servers. In particular, a blockchain network is deployed and hosted on the MEC platform as Blockchain as a Service (BaaS) that supports smart contract-based resource trading and transaction mining services for mobile task offloading. To enhance the performance of the blockchain-empowered MEC system, we propose a joint scheme of computation offloading and blockchain mining. Accordingly, an optimization problem is formulated to maximize edge service revenue and blockchain mining reward while minimizing the service computation latency with respect to constraints of user service demands and hash power resource. We then propose a novel Deep Reinforcement Learning (DRL) approach using a double deep Q-network (DQN) algorithm to solve the proposed problem. Numerical results demonstrate that the proposed scheme outperforms the other baseline methods in terms of better system utility with computational efficiency. Experimental results also verify that the trading contract design is efficient with low operation cost, showing the feasibility of the proposed scheme.