Source author record

Qiang Yang

Qiang Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Catalog footprint

What is connected

62works

26topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

FedHarmony: Harmonizing Heterogeneous Label Correlations in Federated Multi-Label Learning

Federated Multi-Label Learning is a distributed paradigm where multiple clients possess heterogeneous multi-label data and perform collaborative learning under privacy constraints without sharing raw data. However, modeling label correlations under heterogeneous distributions remains challenging. Due to client-specific label spaces and varying co-occurrence patterns, correlations learned by individual clients inevitably deviate from the global structure, a phenomenon we term label correlation drift. To address this, we propose FedHarmony, a framework that harmonizes heterogeneous label correlations across clients. It introduces consensus correlation, capturing agreement among other clients and serving as a global teacher to correct biased local estimates. During aggregation, FedHarmony evaluates each client by both data size and correlation quality, assigning weights accordingly. Moreover, we develop an accelerated optimization algorithm for FedHarmony and theoretically establish faster convergence without sacrificing accuracy. Experiments on real-world federated multi-label datasets show that FedHarmony consistently outperforms state-of-the-art methods.

preprint2023arXiv

A Survey on Evaluation of Large Language Models

Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at the society level for better understanding of their potential risks. Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. Firstly, we provide an overview from the perspective of evaluation tasks, encompassing general natural language processing tasks, reasoning, medical usage, ethics, educations, natural and social sciences, agent applications, and other areas. Secondly, we answer the `where' and `how' questions by diving into the evaluation methods and benchmarks, which serve as crucial components in assessing performance of LLMs. Then, we summarize the success and failure cases of LLMs in different tasks. Finally, we shed light on several future challenges that lie ahead in LLMs evaluation. Our aim is to offer invaluable insights to researchers in the realm of LLMs evaluation, thereby aiding the development of more proficient LLMs. Our key point is that evaluation should be treated as an essential discipline to better assist the development of LLMs. We consistently maintain the related open-source materials at: https://github.com/MLGroupJLU/LLM-eval-survey.

preprint2022arXiv

A Full Dive into Realizing the Edge-enabled Metaverse: Visions, Enabling Technologies,and Challenges

Dubbed "the successor to the mobile Internet", the concept of the Metaverse has grown in popularity. While there exist lite versions of the Metaverse today, they are still far from realizing the full vision of an immersive, embodied, and interoperable Metaverse. Without addressing the issues of implementation from the communication and networking, as well as computation perspectives, the Metaverse is difficult to succeed the Internet, especially in terms of its accessibility to billions of users today. In this survey, we focus on the edge-enabled Metaverse to realize its ultimate vision. We first provide readers with a succinct tutorial of the Metaverse, an introduction to the architecture, as well as current developments. To enable ubiquitous, seamless, and embodied access to the Metaverse, we discuss the communication and networking challenges and survey cutting-edge solutions and concepts that leverage next-generation communication systems for users to immerse as and interact with embodied avatars in the Metaverse. Moreover, given the high computation costs required, e.g., to render 3D virtual worlds and run data-hungry artificial intelligence-driven avatars, we discuss the computation challenges and cloud-edge-end computation framework-driven solutions to realize the Metaverse on resource-constrained edge devices. Next, we explore how blockchain technologies can aid in the interoperable development of the Metaverse, not just in terms of empowering the economic circulation of virtual user-generated content but also to manage physical edge resources in a decentralized, transparent, and immutable manner. Finally, we discuss the future research directions towards realizing the true vision of the edge-enabled Metaverse.

preprint2022arXiv

An Efficient Industrial Federated Learning Framework for AIoT: A Face Recognition Application

Recently, the artificial intelligence of things (AIoT) has been gaining increasing attention, with an intriguing vision of providing highly intelligent services through the network connection of things, leading to an advanced AI-driven ecology. However, recent regulatory restrictions on data privacy preclude uploading sensitive local data to data centers and utilizing them in a centralized approach. Directly applying federated learning algorithms in this scenario could hardly meet the industrial requirements of both efficiency and accuracy. Therefore, we propose an efficient industrial federated learning framework for AIoT in terms of a face recognition application. Specifically, we propose to utilize the concept of transfer learning to speed up federated training on devices and further present a novel design of a private projector that helps protect shared gradients without incurring additional memory consumption or computational cost. Empirical studies on a private Asian face dataset show that our approach can achieve high recognition accuracy in only 20 communication rounds, demonstrating its effectiveness in prediction and its efficiency in training.

preprint2022arXiv

Batch Label Inference and Replacement Attacks in Black-Boxed Vertical Federated Learning

In a vertical federated learning (VFL) scenario where features and model are split into different parties, communications of sample-specific updates are required for correct gradient calculations but can be used to deduce important sample-level label information. An immediate defense strategy is to protect sample-level messages communicated with Homomorphic Encryption (HE), and in this way only the batch-averaged local gradients are exposed to each party (termed black-boxed VFL). In this paper, we first explore the possibility of recovering labels in the vertical federated learning setting with HE-protected communication, and show that private labels can be reconstructed with high accuracy by training a gradient inversion model. Furthermore, we show that label replacement backdoor attacks can be conducted in black-boxed VFL by directly replacing encrypted communicated messages (termed gradient-replacement attack). As it is a common presumption that batch-averaged information is safe to share, batch label inference and replacement attacks are a severe challenge to VFL. To defend against batch label inference attack, we further evaluate several defense strategies, including confusional autoencoder (CoAE), a technique we proposed based on autoencoder and entropy regularization. We demonstrate that label inference and replacement attacks can be successfully blocked by this technique without hurting as much main task accuracy as compared to existing methods.

preprint2022arXiv

Cross-domain Cross-architecture Black-box Attacks on Fine-tuned Models with Transferred Evolutionary Strategies

Fine-tuning can be vulnerable to adversarial attacks. Existing works about black-box attacks on fine-tuned models (BAFT) are limited by strong assumptions. To fill the gap, we propose two novel BAFT settings, cross-domain and cross-domain cross-architecture BAFT, which only assume that (1) the target model for attacking is a fine-tuned model, and (2) the source domain data is known and accessible. To successfully attack fine-tuned models under both settings, we propose to first train an adversarial generator against the source model, which adopts an encoder-decoder architecture and maps a clean input to an adversarial example. Then we search in the low-dimensional latent space produced by the encoder of the adversarial generator. The search is conducted under the guidance of the surrogate gradient obtained from the source model. Experimental results on different domains and different network architectures demonstrate that the proposed attack method can effectively and efficiently attack the fine-tuned models.

preprint2022arXiv

FadMan: Federated Anomaly Detection across Multiple Attributed Networks

Anomaly subgraph detection has been widely used in various applications, ranging from cyber attack in computer networks to malicious activities in social networks. Despite an increasing need for federated anomaly detection across multiple attributed networks, only a limited number of approaches are available for this problem. Federated anomaly detection faces two major challenges. One is that isolated data in most industries are restricted share with others for data privacy and security. The other is most of the centralized approaches training based on data integration. The main idea of federated anomaly detection is aligning private anomalies from local data owners on the public anomalies from the attributed network in the server through public anomalies to federate local anomalies. In each private attributed network, the detected anomaly subgraph is aligned with an anomaly subgraph in the public attributed network. The significant public anomaly subgraphs are selected for federated private anomalies while preventing local private data leakage. The proposed algorithm FadMan is a vertical federated learning framework for public node aligned with many private nodes of different features, and is validated on two tasks correlated anomaly detection on multiple attributed networks and anomaly detection on an attributeless network using five real-world datasets. In the first scenario, FadMan outperforms competitive methods by at least 12% accuracy at 10% noise level. In the second scenario, by analyzing the distribution of abnormal nodes, we find that the nodes of traffic anomalies are associated with the event of postgraduate entrance examination on the same day.

preprint2022arXiv

FedIPR: Ownership Verification for Federated Deep Neural Network Models

Federated learning models are collaboratively developed upon valuable training data owned by multiple parties. During the development and deployment of federated models, they are exposed to risks including illegal copying, re-distribution, misuse and/or free-riding. To address these risks, the ownership verification of federated learning models is a prerequisite that protects federated learning model intellectual property rights (IPR) i.e., FedIPR. We propose a novel federated deep neural network (FedDNN) ownership verification scheme that allows private watermarks to be embedded and verified to claim legitimate IPR of FedDNN models. In the proposed scheme, each client independently verifies the existence of the model watermarks and claims respective ownership of the federated model without disclosing neither private training data nor private watermark information. The effectiveness of embedded watermarks is theoretically justified by the rigorous analysis of conditions under which watermarks can be privately embedded and detected by multiple clients. Moreover, extensive experimental results on computer vision and natural language processing tasks demonstrate that varying bit-length watermarks can be embedded and reliably detected without compromising original model performances. Our watermarking scheme is also resilient to various federated training settings and robust against removal attacks.

preprint2022arXiv

Frustratingly Easy Transferability Estimation

Transferability estimation has been an essential tool in selecting a pre-trained model and the layers in it for transfer learning, to transfer, so as to maximize the performance on a target task and prevent negative transfer. Existing estimation algorithms either require intensive training on target tasks or have difficulties in evaluating the transferability between layers. To this end, we propose a simple, efficient, and effective transferability measure named TransRate. Through a single pass over examples of a target task, TransRate measures the transferability as the mutual information between features of target examples extracted by a pre-trained model and their labels. We overcome the challenge of efficient mutual information estimation by resorting to coding rate that serves as an effective alternative to entropy. From the perspective of feature representation, the resulting TransRate evaluates both completeness (whether features contain sufficient information of a target task) and compactness (whether features of each class are compact enough for good generalization) of pre-trained features. Theoretically, we have analyzed the close connection of TransRate to the performance after transfer learning. Despite its extraordinary simplicity in 10 lines of codes, TransRate performs remarkably well in extensive evaluations on 32 pre-trained models and 16 downstream tasks.

preprint2022arXiv

Multi-core fiber enabled fading noise suppression in ϕ-OFDR based quantitative distributed vibration sensing

Coherent fading has been regarded as a critical issue in phase-sensitive optical frequency domain reflectometry (ϕ-OFDR) based distributed fiber-optic sensing. Here, we report on an approach for fading noise suppression in ϕ-OFDR with multi-core fiber. By exploiting the independent nature of the randomness in the distribution of reflective index in each of the cores, the drastic phase fluctuations due to the fading phenomina can be effectively alleviated by applying weighted vectorial averaging for the Rayleigh backscattering traces from each of the cores with distinct fading distributions. With the consistent linear response with respect to external excitation of interest for each of the cores, demonstration for the propsoed ϕ-OFDR with a commercial seven-core fiber has achieved highly sensitive quantitative distributed vibration sensing with about 2.2 nm length precision and 2 cm sensing resolution along the 500 m fiber, corresponding to a range resolution factor as high as about about 4E-5. Featuring long distance, high sensitivity, high resolution, and fading robustness, this approach has shown promising potentials in various sensing techniques for a wide range of practical scenarios.

preprint2022arXiv

No Free Lunch Theorem for Security and Utility in Federated Learning

In a federated learning scenario where multiple parties jointly learn a model from their respective data, there exist two conflicting goals for the choice of appropriate algorithms. On one hand, private and sensitive training data must be kept secure as much as possible in the presence of \textit{semi-honest} partners, while on the other hand, a certain amount of information has to be exchanged among different parties for the sake of learning utility. Such a challenge calls for the privacy-preserving federated learning solution, which maximizes the utility of the learned model and maintains a provable privacy guarantee of participating parties' private data. This article illustrates a general framework that a) formulates the trade-off between privacy loss and utility loss from a unified information-theoretic point of view, and b) delineates quantitative bounds of privacy-utility trade-off when different protection mechanisms including Randomization, Sparsity, and Homomorphic Encryption are used. It was shown that in general \textit{there is no free lunch for the privacy-utility trade-off} and one has to trade the preserving of privacy with a certain degree of degraded utility. The quantitative analysis illustrated in this article may serve as the guidance for the design of practical federated learning algorithms.

preprint2022arXiv

Practical and Secure Federated Recommendation with Personalized Masks

Federated recommendation addresses the data silo and privacy problems altogether for recommender systems. Current federated recommender systems mainly utilize cryptographic or obfuscation methods to protect the original ratings from leakage. However, the former comes with extra communication and computation costs, and the latter damages model accuracy. Neither of them could simultaneously satisfy the real-time feedback and accurate personalization requirements of recommender systems. In this paper, we proposed federated masked matrix factorization (FedMMF) to protect the data privacy in federated recommender systems without sacrificing efficiency and effectiveness. In more details, we introduce the new idea of personalized mask generated only from local data and apply it in FedMMF. On the one hand, personalized mask offers protection for participants' private data without effectiveness loss. On the other hand, combined with the adaptive secure aggregation protocol, personalized mask could further improve efficiency. Theoretically, we provide security analysis for personalized mask. Empirically, we also show the superiority of the designed model on different real-world data sets.

preprint2022arXiv

Practical Lossless Federated Singular Vector Decomposition over Billion-Scale Data

With the enactment of privacy-preserving regulations, e.g., GDPR, federated SVD is proposed to enable SVD-based applications over different data sources without revealing the original data. However, many SVD-based applications cannot be well supported by existing federated SVD solutions. The crux is that these solutions, adopting either differential privacy (DP) or homomorphic encryption (HE), suffer from accuracy loss caused by unremovable noise or degraded efficiency due to inflated data. In this paper, we propose FedSVD, a practical lossless federated SVD method over billion-scale data, which can simultaneously achieve lossless accuracy and high efficiency. At the heart of FedSVD is a lossless matrix masking scheme delicately designed for SVD: 1) While adopting the masks to protect private data, FedSVD completely removes them from the final results of SVD to achieve lossless accuracy; and 2) As the masks do not inflate the data, FedSVD avoids extra computation and communication overhead during the factorization to maintain high efficiency. Experiments with real-world datasets show that FedSVD is over 10000 times faster than the HE-based method and has 10 orders of magnitude smaller error than the DP-based solution on SVD tasks. We further build and evaluate FedSVD over three real-world applications: principal components analysis (PCA), linear regression (LR), and latent semantic analysis (LSA), to show its superior performance in practice. On federated LR tasks, compared with two state-of-the-art solutions: FATE and SecureML, FedSVD-LR is 100 times faster than SecureML and 10 times faster than FATE.

preprint2022arXiv

Privacy and Robustness in Federated Learning: Attacks and Defenses

As data are increasingly being stored in different silos and societies becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continue to thrive in this new reality. Existing FL protocol design has been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL, and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques as well as fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.

preprint2022arXiv

Realizing the Metaverse with Edge Intelligence: A Match Made in Heaven

Dubbed "the successor to the mobile Internet", the concept of the Metaverse has recently exploded in popularity. While there exists lite versions of the Metaverse today, we are still far from realizing the vision of a seamless, shardless, and interoperable Metaverse given the stringent sensing, communication, and computation requirements. Moreover, the birth of the Metaverse comes amid growing privacy concerns among users. In this article, we begin by providing a preliminary definition of the Metaverse. We discuss the architecture of the Metaverse and mainly focus on motivating the convergence of edge intelligence and the infrastructure layer of the Metaverse. We present major edge-based technological developments and their integration to support the Metaverse engine. Then, we present our research attempts through a case study of virtual city development in the Metaverse. Finally, we discuss the open research issues.

preprint2022arXiv

Towards Efficient Synchronous Federated Training: A Survey on System Optimization Strategies

The increasing demand for privacy-preserving collaborative learning has given rise to a new computing paradigm called federated learning (FL), in which clients collaboratively train a machine learning (ML) model without revealing their private training data. Given an acceptable level of privacy guarantee, the goal of FL is to minimize the time-to-accuracy of model training. Compared with distributed ML in data centers, there are four distinct challenges to achieving short time-to-accuracy in FL training, namely the lack of information for optimization, the tradeoff between statistical and system utility, client heterogeneity, and large configuration space. In this paper, we survey recent works in addressing these challenges and present them following a typical training workflow through three phases: client selection, configuration, and reporting. We also review system works including measurement studies and benchmarking tools that aim to support FL developers.

preprint2022arXiv

Towards Personalized Federated Learning

In parallel with the rapid adoption of Artificial Intelligence (AI) empowered by advances in AI research, there have been growing awareness and concerns of data privacy. Recent significant developments in the data regulation landscape have prompted a seismic shift in interest towards privacy-preserving AI. This has contributed to the popularity of Federated Learning (FL), the leading paradigm for the training of machine learning models on data silos in a privacy-preserving manner. In this survey, we explore the domain of Personalized FL (PFL) to address the fundamental challenges of FL on heterogeneous data, a universal characteristic inherent in all real-world datasets. We analyze the key motivations for PFL and present a unique taxonomy of PFL techniques categorized according to the key challenges and personalization strategies in PFL. We highlight their key ideas, challenges and opportunities and envision promising future trajectories of research towards new PFL architectural design, realistic PFL benchmarking, and trustworthy PFL approaches.

preprint2022arXiv

WrapperFL: A Model Agnostic Plug-in for Industrial Federated Learning

Federated learning, as a privacy-preserving collaborative machine learning paradigm, has been gaining more and more attention in the industry. With the huge rise in demand, there have been many federated learning platforms that allow federated participants to set up and build a federated model from scratch. However, exiting platforms are highly intrusive, complicated, and hard to integrate with built machine learning models. For many real-world businesses that already have mature serving models, existing federated learning platforms have high entry barriers and development costs. This paper presents a simple yet practical federated learning plug-in inspired by ensemble learning, dubbed WrapperFL, allowing participants to build/join a federated system with existing models at minimal costs. The WrapperFL works in a plug-and-play way by simply attaching to the input and output interfaces of an existing model, without the need of re-development, significantly reducing the overhead of manpower and resources. We verify our proposed method on diverse tasks under heterogeneous data distributions and heterogeneous models. The experimental results demonstrate that WrapperFL can be successfully applied to a wide range of applications under practical settings and improves the local model with federated learning at a low cost.

preprint2021arXiv

A Game-theoretic Approach Towards Collaborative Coded Computation Offloading

Coded distributed computing (CDC) has emerged as a promising approach because it enables computation tasks to be carried out in a distributed manner while mitigating straggler effects, which often account for the long overall completion times. Specifically, by using polynomial codes, computed results from only a subset of edge servers can be used to reconstruct the final result. However, incentive issues have not been studied systematically for the edge servers to complete the CDC tasks. In this paper, we propose a tractable two-level game-theoretic approach to incentivize the edge servers to complete the CDC tasks. Specifically, in the lower level, a hedonic coalition formation game is formulated where the edge servers share their resources within their coalitions. By forming coalitions, the edge servers have more Central Processing Unit (CPU) power to complete the computation tasks. In the upper level, given the CPU power of the coalitions of edge servers, an all-pay auction is designed to incentivize the edge servers to participate in the CDC tasks. In the all-pay auction, the bids of the edge servers are represented by the allocation of their CPU power to the CDC tasks. The all-pay auction is designed to maximize the utility of the cloud server by determining the allocation of rewards to the winners. Simulation results show that the edge servers are incentivized to allocate more CPU power when multiple rewards are offered, i.e., there are multiple winners, instead of rewarding only the edge server with the largest CPU power allocation. Besides, the utility of the cloud server is maximized when it offers multiple homogeneous rewards, instead of heterogeneous rewards.

preprint2021arXiv

Advances and Open Problems in Federated Learning

Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.

preprint2021arXiv

Generalised quasilinear approximations of turbulent channel flow: Part 1. Streamwise nonlinear energy transfer

A generalised quasilinear (GQL) approximation (Marston \emph{et al.}, \emph{Phys. Rev. Lett.}, vol. 116, 104502, 2016) is applied to turbulent channel flow at $Re_τ\simeq 1700$ ($Re_τ$ is the friction Reynolds number), with emphasis on the energy transfer in the streamwise wavenumber space. The flow is decomposed into low and high streamwise wavenumber groups, the former of which is solved by considering the full nonlinear equations whereas the latter is obtained from the linearised equations around the former. The performance of the GQL approximation is subsequently compared with that of a QL model (Thomas \emph{et al.}, \emph{Phys. Fluids.}, vol. 26, no. 10, 105112, 2014), in which the low-wavenumber group only contains zero streamwise wavenumber. It is found that the QL model exhibits a considerably reduced multi-scale behaviour at the given moderately high Reynolds number. This is improved significantly by the GQL approximation which incorporates only a few more streamwise Fourier modes into the low-wavenumber group, and it reasonably well recovers the distance-from-the-wall scaling in the turbulence statistics and spectra. Finally, it is proposed that the energy transfer from the low to the high-wavenumber group in the GQL approximation, referred to as the `scattering' mechanism, depends on the neutrally stable leading Lyapunov spectrum of the linearised equations for the high wavenumber group. In particular, it is shown that if the threshold wavenumber distinguishing the two groups is sufficiently high, the scattering mechanism can completely be absent due to the linear nature of the equations for the high-wavenumber group.

preprint2021arXiv

Generalised quasilinear approximations of turbulent channel flow: Part 2. Spanwise scale interactions

Continuing from Part 1 (Hernández \emph{et al.}, \emph{arXiv:2108.12395}, 2021), a generalized quasilinear (GQL) approximation is studied in turbulent channel flow using a flow decomposition defined with spanwise Fourier modes: the flow is decomposed into a set of low-wavenumber spanwise Fourier modes and the rest high-wavenumber modes. This decomposition leads to the nonlinear low-wavenumber group that supports the self-sustaining process within the given integral length scales, whereas the linearised high-wavenumber group is not able to do so, unlike the GQL models in Part 1 which place a minimal mathematical description for the self-sustaining process across all integral scales. Despite the important physical difference, it is shown that the GQL models in this study share some similarities with those in Part 1: i.e. the reduced multi-scale behaviour and anisotropic turbulent fluctuations. Furthermore, despite not being able to support the self-sustaining process in the high-wavenumber group, the GQL models in the present study are found to reproduce some key statistical features in the high-wavenumber group solely through the `scattering' mechanism proposed by previous studies. Finally, using the nature of the GQL approximation, a set of numerical experiments suppressing certain triadic nonlinear interactions are further carried out. This unveils some key roles played by the certain types of triadic interactions including energy cascade and inverse energy transfer in the near-wall region. In particular, the inhibition of inverse energy transfer in the spanwise direction leads to suppression of the near-wall positive turbulent transport at large scales.

preprint2021arXiv

Protecting Intellectual Property of Generative Adversarial Networks from Ambiguity Attack

Ever since Machine Learning as a Service (MLaaS) emerges as a viable business that utilizes deep learning models to generate lucrative revenue, Intellectual Property Right (IPR) has become a major concern because these deep learning models can easily be replicated, shared, and re-distributed by any unauthorized third parties. To the best of our knowledge, one of the prominent deep learning models - Generative Adversarial Networks (GANs) which has been widely used to create photorealistic image are totally unprotected despite the existence of pioneering IPR protection methodology for Convolutional Neural Networks (CNNs). This paper therefore presents a complete protection framework in both black-box and white-box settings to enforce IPR protection on GANs. Empirically, we show that the proposed method does not compromise the original GANs performance (i.e. image generation, image super-resolution, style transfer), and at the same time, it is able to withstand both removal and ambiguity attacks against embedded watermarks.

preprint2021arXiv

Real-World Image Datasets for Federated Learning

Federated learning is a new machine learning paradigm which allows data parties to build machine learning models collaboratively while keeping their data secure and private. While research efforts on federated learning have been growing tremendously in the past two years, most existing works still depend on pre-existing public datasets and artificial partitions to simulate data federations due to the lack of high-quality labeled data generated from real-world edge applications. Consequently, advances on benchmark and model evaluations for federated learning have been lagging behind. In this paper, we introduce a real-world image dataset. The dataset contains more than 900 images generated from 26 street cameras and 7 object categories annotated with detailed bounding box. The data distribution is non-IID and unbalanced, reflecting the characteristic real-world federated learning scenarios. Based on this dataset, we implemented two mainstream object detection algorithms (YOLO and Faster R-CNN) and provided an extensive benchmark on model performance, efficiency, and communication in a federated learning setting. Both the dataset and algorithms are made publicly available.

preprint2021arXiv

TrNews: Heterogeneous User-Interest Transfer Learning for News Recommendation

We investigate how to solve the cross-corpus news recommendation for unseen users in the future. This is a problem where traditional content-based recommendation techniques often fail. Luckily, in real-world recommendation services, some publisher (e.g., Daily news) may have accumulated a large corpus with lots of consumers which can be used for a newly deployed publisher (e.g., Political news). To take advantage of the existing corpus, we propose a transfer learning model (dubbed as TrNews) for news recommendation to transfer the knowledge from a source corpus to a target corpus. To tackle the heterogeneity of different user interests and of different word distributions across corpora, we design a translator-based transfer-learning strategy to learn a representation mapping between source and target corpora. The learned translator can be used to generate representations for unseen users in the future. We show through experiments on real-world datasets that TrNews is better than various baselines in terms of four metrics. We also show that our translator is effective among existing transfer strategies.

Qiang Yang

What is connected

Connect this record

See the researcher in context

Building this map preview

62 published item(s)

FedHarmony: Harmonizing Heterogeneous Label Correlations in Federated Multi-Label Learning

A Survey on Evaluation of Large Language Models

A Full Dive into Realizing the Edge-enabled Metaverse: Visions, Enabling Technologies,and Challenges

An Efficient Industrial Federated Learning Framework for AIoT: A Face Recognition Application

Batch Label Inference and Replacement Attacks in Black-Boxed Vertical Federated Learning

Cross-domain Cross-architecture Black-box Attacks on Fine-tuned Models with Transferred Evolutionary Strategies

FadMan: Federated Anomaly Detection across Multiple Attributed Networks

FedIPR: Ownership Verification for Federated Deep Neural Network Models

Frustratingly Easy Transferability Estimation

Multi-core fiber enabled fading noise suppression in ϕ-OFDR based quantitative distributed vibration sensing

No Free Lunch Theorem for Security and Utility in Federated Learning

Practical and Secure Federated Recommendation with Personalized Masks

Practical Lossless Federated Singular Vector Decomposition over Billion-Scale Data

Privacy and Robustness in Federated Learning: Attacks and Defenses

Realizing the Metaverse with Edge Intelligence: A Match Made in Heaven

Towards Efficient Synchronous Federated Training: A Survey on System Optimization Strategies

Towards Personalized Federated Learning

WrapperFL: A Model Agnostic Plug-in for Industrial Federated Learning

A Game-theoretic Approach Towards Collaborative Coded Computation Offloading

Advances and Open Problems in Federated Learning

Generalised quasilinear approximations of turbulent channel flow: Part 1. Streamwise nonlinear energy transfer

Generalised quasilinear approximations of turbulent channel flow: Part 2. Spanwise scale interactions

Protecting Intellectual Property of Generative Adversarial Networks from Ambiguity Attack

Real-World Image Datasets for Federated Learning

TrNews: Heterogeneous User-Interest Transfer Learning for News Recommendation

$WWγ$ production at hadron colliders with NLO QCD+EW corrections and parton shower effects

A Communication Efficient Collaborative Learning Framework for Distributed Features

Federated Deep Reinforcement Learning

Federated Learning in Mobile Edge Networks: A Comprehensive Survey

FedVision: An Online Visual Object Detection Platform Powered by Federated Learning

Fisher Deep Domain Adaptation

Giant photonic spin Hall effect near the Dirac points

HHHFL: Hierarchical Heterogeneous Horizontal Federated Learning for Electroencephalography

Incentive Mechanism Design for Resource Sharing in Collaborative Edge Learning

Mechanism Design for Multi-Party Machine Learning

Network On Network for Tabular Data Classification in Real-world Applications

Privacy Threats Against Federated Matrix Factorization

PrivNet: Safeguarding Private Attributes in Transfer Learning for Recommendation

Probing the $L_μ-L_τ$ gauge boson at electron colliders

Rethinking Privacy Preserving Deep Learning: How to Evaluate and Thwart Privacy Attacks

RPN: A Residual Pooling Network for Efficient Federated Learning

Secure Federated Transfer Learning

SenWave: Monitoring the Global Sentiments under the COVID-19 Pandemic

Threats to Federated Learning: A Survey

Towards Utilizing Unlabeled Data in Federated Learning: A Survey and Prospective

Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning

$Z_H \rightarrow H^0 γ$ decay within the littlest Higgs model at $\mathcal{O}(α_{\rm ew}^{3}α_s)$ accuracy

Secure Federated Matrix Factorization

Smart City Development with Urban Transfer Learning

Partially Observable Markov Decision Process for Recommender Systems

The Lifecycle and Cascade of WeChat Social Messaging Groups

Transitive Hashing Network for Heterogeneous Multimedia Retrieval

BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking

BigDataBench: a Big Data Benchmark Suite from Internet Services

Characterizing and Subsetting Big Data Workloads

Collaborative Receptive Field Learning

Friendship Prediction in Composite Social Networks

Learning Mid-Level Features and Modeling Neuron Selectivity for Image Classification

MGSim - Simulation tools for multi-core processor architectures

Inferring Taxi Status Using GPS Trajectories

Selective Transfer Learning for Cross Domain Recommendation

BOOST: A fast approach to detecting gene-gene interactions in genome-wide case-control studies