Researcher profile

Huaiyu Dai

Huaiyu Dai contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) | Verification L1 | Unclaimed author
17 works
0 followers
11 topics
4 close collaborators

Published work

17 published item(s)

preprint | 2026 | arXiv

Conceal, Reconstruct, Jailbreak: Exploiting the Reconstruction-Concealment Tradeoff in MLLMs

Intent-obfuscation-based jailbreak attacks on multimodal large language models (MLLMs) transform a harmful query into a concealed multimodal input to bypass safety mechanisms. We show that such attacks are governed by a reconstruction-concealment tradeoff: the transformed input must hide harmful intent from safety filters while remaining recoverable enough for the victim model to reconstruct the original request. Through a reconstruction analysis of three representative black-box methods, we find that existing transformations struggle to balance this tradeoff, limiting their effectiveness. In contrast, we show that character-removed variants achieve a better balance. Building on this, we propose concealment-aware variant construction, which greedily selects character-removed variants that are low in harmful-keyword alignment and mutually diverse, and instantiates them through five modality-aware prompting strategies. We further introduce keyword-related distractor images that depict the harmful keyword in diverse contexts, providing more effective auxiliary visual context than generic distractor images. Experiments across closed-source and open-source MLLMs show the proposed strategies outperform strong baselines, revealing an underexplored vulnerability: a model's own reconstruction ability can be exploited to recover hidden harmful intent and produce unsafe responses.

preprint | 2022 | arXiv

Communication-Efficient Federated Learning via Predictive Coding

Federated learning can enable remote workers to collaboratively train a shared machine learning model while allowing training data to be kept locally. In the use case of wireless mobile devices, the communication overhead is a critical bottleneck due to limited power and bandwidth. Prior work has utilized various data compression tools such as quantization and sparsification to reduce the overhead. In this paper, we propose a predictive-coding-based compression scheme for federated learning. The scheme has shared prediction functions among all devices and allows each worker to transmit a compressed residual vector derived from the reference. In each communication round, we select the predictor and quantizer based on the rate-distortion cost, and further reduce the redundancy with entropy coding. Extensive simulations reveal that the communication cost can be reduced by up to 99% with even better learning performance when compared with other baseline methods.
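
Illustrative note: the residual-coding idea in this abstract can be sketched in a few lines. This is only a minimal sketch under stated assumptions, not the paper's scheme: the predictor is assumed to be "reuse the previously reconstructed update", the quantizer is a plain uniform quantizer, and the rate-distortion-based selection and entropy coding mentioned in the abstract are omitted. All function names are hypothetical.

```python
def encode(update, prediction, step):
    """Worker side: quantize the residual between the update and the shared prediction."""
    residual = [u - p for u, p in zip(update, prediction)]
    # Uniform quantizer (assumption): each residual entry becomes a small integer symbol.
    return [round(r / step) for r in residual]


def decode(symbols, prediction, step):
    """Server side: add the dequantized residual back onto the same prediction."""
    return [p + s * step for p, s in zip(prediction, symbols)]


# Hypothetical predictor: the update reconstructed in the previous round.
previous_update = [0.10, -0.20, 0.30]
current_update = [0.12, -0.21, 0.33]
step = 0.05

symbols = encode(current_update, previous_update, step)
reconstructed = decode(symbols, previous_update, step)
max_error = max(abs(a - b) for a, b in zip(current_update, reconstructed))
```

When the prediction is close to the true update, most symbols are zero or small, which is what makes a subsequent entropy coder effective. The per-entry reconstruction error of this uniform quantizer is bounded by half the step size.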

preprint | 2022 | arXiv

FedDUAP: Federated Learning with Dynamic Update and Adaptive Pruning Using Shared Data on the Server

Despite achieving remarkable performance, Federated Learning (FL) suffers from two critical challenges, i.e., limited computational resources and low training efficiency. In this paper, we propose a novel FL framework, i.e., FedDUAP, with two original contributions, to exploit the insensitive data on the server and the decentralized data in edge devices to further improve the training efficiency. First, a dynamic server update algorithm is designed to exploit the insensitive data on the server, in order to dynamically determine the optimal steps of the server update for improving the convergence and accuracy of the global model. Second, a layer-adaptive model pruning method is developed to perform unique pruning operations adapted to the different dimensions and importance of multiple layers, to achieve a good balance between efficiency and effectiveness. By integrating the two original techniques together, our proposed FL model, FedDUAP, significantly outperforms baseline approaches in terms of accuracy (up to 4.8% higher), efficiency (up to 2.8 times faster), and computational cost (up to 61.9% smaller).

preprint | 2022 | arXiv

Mobility, Communication and Computation Aware Federated Learning for Internet of Vehicles

While privacy concerns entice connected and automated vehicles to incorporate on-board federated learning (FL) solutions, an integrated vehicle-to-everything communication with heterogeneous computation power aware learning platform is urgently necessary to make it a reality. Motivated by this, we propose a novel mobility, communication and computation aware online FL platform that uses on-road vehicles as learning agents. Thanks to the advanced features of modern vehicles, the on-board sensors can collect data as vehicles travel along their trajectories, while the on-board processors can train machine learning models using the collected data. To take the high mobility of vehicles into account, we consider the delay as a learning parameter and restrict it to be less than a tolerable threshold. To satisfy this threshold, the central server accepts partially trained models, the distributed roadside units (a) perform downlink multicast beamforming to minimize global model distribution delay and (b) allocate optimal uplink radio resources to minimize local model offloading delay, and the vehicle agents conduct heterogeneous local model training. Using real-world vehicle trace datasets, we validate our FL solutions. Simulation shows that the proposed integrated FL platform is robust and outperforms baseline models. With reasonable local training episodes, it can effectively satisfy all constraints and deliver near ground truth multi-horizon velocity and vehicle-specific power predictions.

preprint | 2022 | arXiv

Multi-Stage Hybrid Federated Learning over Large-Scale D2D-Enabled Fog Networks

Federated learning has generated significant interest, with nearly all works focused on a "star" topology where nodes/devices are each connected to a central server. We migrate away from this architecture and extend it through the network dimension to the case where there are multiple layers of nodes between the end devices and the server. Specifically, we develop multi-stage hybrid federated learning (MH-FL), a hybrid of intra- and inter-layer model learning that considers the network as a multi-layer cluster-based structure. MH-FL considers the topology structures among the nodes in the clusters, including local networks formed via device-to-device (D2D) communications, and presumes a semi-decentralized architecture for federated learning. It orchestrates the devices at different network layers in a collaborative/cooperative manner (i.e., using D2D interactions) to form local consensus on the model parameters and combines it with multi-stage parameter relaying between layers of the tree-shaped hierarchy. We derive the upper bound of convergence for MH-FL with respect to parameters of the network topology (e.g., the spectral radius) and the learning algorithm (e.g., the number of D2D rounds in different clusters). We obtain a set of policies for the D2D rounds at different clusters to guarantee either a finite optimality gap or convergence to the global optimum. We then develop a distributed control algorithm for MH-FL to tune the D2D rounds in each cluster over time to meet specific convergence criteria. Our experiments on real-world datasets verify our analytical results and demonstrate the advantages of MH-FL in terms of resource utilization metrics.

preprint | 2022 | arXiv

Neural Tangent Kernel Empowered Federated Learning

Federated learning (FL) is a privacy-preserving paradigm where multiple participants jointly solve a machine learning problem without sharing raw data. Unlike traditional distributed learning, a unique characteristic of FL is statistical heterogeneity, namely, data distributions across participants are different from each other. Meanwhile, recent advances in the interpretation of neural networks have seen a wide use of neural tangent kernels (NTKs) for convergence analyses. In this paper, we propose a novel FL paradigm empowered by the NTK framework. The paradigm addresses the challenge of statistical heterogeneity by transmitting update data that are more expressive than those of the conventional FL paradigms. Specifically, sample-wise Jacobian matrices, rather than model weights/gradients, are uploaded by participants. The server then constructs an empirical kernel matrix to update a global model without explicitly performing gradient descent. We further develop a variant with improved communication efficiency and enhanced privacy. Numerical results show that the proposed paradigm can achieve the same accuracy while reducing the number of communication rounds by an order of magnitude compared to federated averaging.

preprint | 2022 | arXiv

RFID: Towards Low Latency and Reliable DAG Task Scheduling over Dynamic Vehicular Clouds

Vehicular cloud (VC) platforms integrate heterogeneous and distributed resources of moving vehicles to offer timely and cost-effective computing services. However, the dynamic nature of VCs (i.e., limited contact duration among vehicles), caused by vehicles' mobility, poses unique challenges to the execution of computation-intensive applications/tasks with directed acyclic graph (DAG) structure, where each task consists of multiple interdependent components (subtasks). In this paper, we study scheduling of DAG tasks over dynamic VCs, where multiple subtasks of a DAG task are dispersed across vehicles and then processed by cooperatively utilizing vehicles' resources. We formulate DAG task scheduling as a 0-1 integer programming problem, which turns out to be NP-hard, aiming to minimize the overall task completion time while ensuring a high execution success rate. To tackle the problem, we develop a ranking and foresight-integrated dynamic scheduling scheme (RFID). RFID consists of (i) a dynamic downward ranking mechanism that sorts the scheduling priority of different subtasks, while explicitly taking into account the sequential execution nature of the DAG; (ii) a resource scarcity-based priority changing mechanism that overcomes possible performance degradations caused by the volatility of VC resources; and (iii) a degree-based weighted earliest finish time mechanism that assigns the subtask with the highest scheduling priority to the vehicle which offers rapid task execution along with reliable transmission links. Our simulation results reveal the effectiveness of our proposed scheme in comparison to benchmark methods.

preprint | 2021 | arXiv

On the Privacy Guarantees of Gossip Protocols in General Networks

Recently, the privacy guarantees of information dissemination protocols have attracted increasing research interest, and gossip protocols in particular are of vital importance in various information exchange applications. In this work, we study the privacy guarantees of gossip protocols in general networks in terms of differential privacy and prediction uncertainty. First, lower bounds of the differential privacy guarantees are derived for gossip protocols in general networks in both synchronous and asynchronous settings. The prediction uncertainty of the source node given a uniform prior is also determined. For the private gossip algorithm, the differential privacy and prediction uncertainty guarantees are derived in closed form. Moreover, considering that these two metrics may be restrictive in some scenarios, relaxed variants are proposed. It is found that source anonymity is closely related to some key network structure parameters in the general network setting. Then, we investigate information spreading in wireless networks with unreliable communications, and quantify the tradeoff between differential privacy guarantees and information spreading efficiency. Finally, considering that the attacker may not be present at the beginning of the information dissemination process, the scenario of delayed monitoring is studied and the corresponding differential privacy guarantees are evaluated.

preprint | 2021 | arXiv

Spectral Graph Theory Based Resource Allocation for IRS-Assisted Multi-Hop Edge Computing

The performance of mobile edge computing (MEC) depends critically on the quality of the wireless channels. From this viewpoint, the recently advocated intelligent reflecting surface (IRS) technique that can proactively reconfigure wireless channels is anticipated to bring unprecedented performance gain to MEC. In this paper, the problem of network throughput optimization of an IRS-assisted multi-hop MEC network is investigated, in which the phase-shifts of the IRS and the resource allocation of the relays need to be jointly optimized. However, due to the coupling among the transmission links of different hops caused by the utilization of the IRS and the complicated multi-hop network topology, it is difficult to solve the considered problem by directly applying existing optimization techniques. Fortunately, by exploiting the underlying structure of the network topology and spectral graph theory, it is shown that the network throughput can be well approximated by the second smallest eigenvalue of the network Laplacian matrix. This key finding allows us to develop an effective iterative algorithm for solving the considered problem. Numerical simulations are performed to corroborate the effectiveness of the proposed scheme.
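
Illustrative note: the quantity this abstract relies on, the second smallest eigenvalue of the graph Laplacian (often called the algebraic connectivity), can be computed as in the sketch below. This is a minimal sketch under stated assumptions rather than the paper's algorithm: the graph is assumed unweighted and undirected, whereas the paper's network model is not specified here, and the eigenvalue is approximated with a simple shifted power iteration. The function names are hypothetical.

```python
import random


def laplacian(n, edges):
    """Build the Laplacian L = D - A of an undirected, unweighted graph."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L


def algebraic_connectivity(n, edges, iters=2000, seed=0):
    """Approximate the second smallest Laplacian eigenvalue (needs n >= 2)."""
    L = laplacian(n, edges)
    # All Laplacian eigenvalues are at most 2 * max degree, so c*I - L is positive definite.
    c = 2.0 * max(L[i][i] for i in range(n)) + 1.0
    rng = random.Random(seed)
    v = [rng.random() for _ in range(n)]

    def deflate_and_normalize(vec):
        # Remove the all-ones component (eigenvalue 0 of L), then scale to unit length.
        mean = sum(vec) / n
        vec = [x - mean for x in vec]
        norm = sum(x * x for x in vec) ** 0.5
        return [x / norm for x in vec]

    for _ in range(iters):
        v = deflate_and_normalize(v)
        Lv = [sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [c * v[i] - Lv[i] for i in range(n)]  # power step on c*I - L

    v = deflate_and_normalize(v)
    Lv = [sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * Lv[i] for i in range(n))  # Rayleigh quotient of a unit vector


path = algebraic_connectivity(3, [(0, 1), (1, 2)])
triangle = algebraic_connectivity(3, [(0, 1), (1, 2), (0, 2)])
split = algebraic_connectivity(4, [(0, 1), (2, 3)])
```

A larger value indicates a better-connected graph, and the value drops to zero when the graph is disconnected, which is the intuition behind using it as a proxy for end-to-end network performance.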

preprint | 2020 | arXiv

A Truthful Auction for Graph Job Allocation in Vehicular Cloud-assisted Networks

Vehicular cloud computing has emerged as a promising solution to fulfill users' demands on processing computation-intensive applications in modern driving environments. Such applications are commonly represented by graphs consisting of components and edges. However, encouraging vehicles to share resources poses significant challenges owing to users' selfishness. In this paper, an auction-based graph job allocation problem is studied in vehicular cloud-assisted networks considering resource reutilization. Our goal is to map each buyer (component) to a feasible seller (virtual machine) while maximizing the buyers' utility-of-service, which concerns the execution time and commission cost. First, we formulate the auction-based graph job allocation as an integer programming (IP) problem. Then, a Vickrey-Clarke-Groves based payment rule is proposed which satisfies the desired economic properties of truthfulness and individual rationality. We face two challenges: 1) the above-mentioned IP problem is NP-hard; 2) one constraint associated with the IP problem requires addressing the subgraph isomorphism problem. Thus, obtaining the optimal solution is practically infeasible in large-scale networks. Motivated by this, we develop a structure-preserved matching algorithm that maximizes the utility-of-service-gain, together with a corresponding payment rule that offers the desired economic properties and low computational complexity. Extensive simulations demonstrate that the proposed algorithm outperforms the benchmark methods considering various problem sizes.

preprint | 2020 | arXiv

Energy-aware Allocation of Graph Jobs in Vehicular Cloud Computing-enabled Software-defined IoV

Software-defined internet of vehicles (SDIoV) has emerged as a promising paradigm to realize flexible and comprehensive resource management for next-generation automobile transportation systems. In this paper, a vehicular cloud computing-based SDIoV framework is studied wherein the joint allocation of transmission power and graph jobs is formulated as a nonlinear integer programming problem. To effectively address the problem, a structure-preservation-based two-stage allocation scheme is proposed that decouples template searching from power allocation. Specifically, a hierarchical tree-based random subgraph isomorphism mechanism is applied in the first stage by identifying potential mappings (templates) between the components of graph jobs and service providers. A structure-preserving simulated annealing-based power allocation algorithm is adopted in the second stage to achieve the trade-off between the job completion time and energy consumption. Extensive simulations are conducted to verify the performance of the proposed algorithms.

preprint | 2020 | arXiv

GeoDA: a geometric framework for black-box adversarial attacks

Adversarial examples are known as carefully perturbed images fooling image classifiers. We propose a geometric framework to generate adversarial examples in one of the most challenging black-box settings where the adversary can only generate a small number of queries, each of them returning the top-$1$ label of the classifier. Our framework is based on the observation that the decision boundary of deep networks usually has a small mean curvature in the vicinity of data samples. We propose an effective iterative algorithm to generate query-efficient black-box perturbations with small $\ell_p$ norms for $p \ge 1$, which is confirmed via experimental evaluations on state-of-the-art natural image classifiers. Moreover, for $p=2$, we theoretically show that our algorithm actually converges to the minimal $\ell_2$-perturbation when the curvature of the decision boundary is bounded. We also obtain the optimal distribution of the queries over the iterations of the algorithm. Finally, experimental results confirm that our principled black-box attack algorithm performs better than state-of-the-art algorithms as it generates smaller perturbations with a reduced number of queries.

preprint | 2020 | arXiv

Interference Analysis and Mitigation for Massive Access Aerial IoT Considering 3D Antenna Patterns

Due to dense deployments of Internet of things (IoT) networks, interference management becomes a critical challenge. With the proliferation of aerial IoT devices, such as unmanned aerial vehicles (UAVs), interference characteristics in 3D environments will differ from those in the existing terrestrial IoT networks. In this paper, we consider 3D topology IoT networks with a mixture of aerial and terrestrial links, with low-cost cross-dipole antennas at ground nodes and omni-directional antennas at aerial nodes. Considering a massive-access communication scenario, we first derive the statistics of the channel gain at IoT receivers in closed form while taking into account the radiation patterns of both ground and aerial nodes. These are then used to calculate the ergodic achievable rate as a function of the height of the aerial receiver. We propose an interference mitigation scheme that utilizes 3D antenna radiation patterns with different dipole antenna settings. Our results show that using the proposed scheme, the ergodic achievable rate improves as the height of aerial receivers increases. In addition, the ratio between the ground and aerial receivers that maximizes the peak rate also increases with the aerial IoT receiver height.

preprint | 2020 | arXiv

Interference Avoidance Position Planning in Dual-hop and Multi-hop UAV Relay Networks

We consider unmanned aerial vehicle (UAV)-assisted wireless communication employing UAVs as relay nodes to increase the throughput between a pair of transmitter and receiver. We focus on developing effective methods to position the UAV(s) in the sky in the presence of interference in the environment, the existence of which makes the problem non-trivial and our methodology different from the current art. We study the optimal position planning, which aims to maximize the (average) signal-to-interference-ratio (SIR) of the system, in the presence of: i) one major source of interference, ii) stochastic interference. For each scenario, we first consider utilizing a single UAV in the dual-hop relay mode and determine its optimal position. Afterward, multiple UAVs in the multi-hop relay mode are considered, for which we investigate two novel problems concerned with determining the optimal number of required UAVs and developing an optimal fully distributed position alignment method. Subsequently, we propose a cost-effective method that simultaneously minimizes the number of UAVs and determines their optimal positions to guarantee a certain (average) SIR of the system. Alternatively, for a given number of UAVs, we develop a fully distributed placement algorithm along with its performance guarantee. Numerical simulations are provided to evaluate the performance of our proposed methods.
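
Illustrative note: the single-UAV, single-interferer case described in this abstract can be illustrated with a toy grid search. This is only a sketch under strong assumptions that do not come from the paper: a one-dimensional ground line, a fixed UAV altitude, distance-based path loss with exponent `alpha`, and an end-to-end SIR taken as the minimum of the two hop SIRs. All names and parameter values are hypothetical.

```python
import math


def sir(p_signal, d_signal, p_interf, d_interf, alpha=2.0):
    """Signal-to-interference ratio under a simple d**(-alpha) path-loss model."""
    return (p_signal * d_signal ** -alpha) / (p_interf * d_interf ** -alpha)


def best_relay_position(tx_x, rx_x, jam_x, height,
                        p_tx=1.0, p_uav=1.0, p_jam=1.0, alpha=2.0, steps=200):
    """Grid-search the UAV's horizontal position between transmitter and receiver.

    Assumes height > 0 and jam_x != rx_x so that no distance is zero.
    Returns (best_x, best_end_to_end_sir).
    """
    best = None
    d_jam_rx = abs(rx_x - jam_x)  # interferer-to-receiver distance (both on the ground)
    for k in range(steps + 1):
        x = tx_x + (rx_x - tx_x) * k / steps
        d_tx_uav = math.hypot(x - tx_x, height)
        d_jam_uav = math.hypot(x - jam_x, height)
        d_uav_rx = math.hypot(rx_x - x, height)
        hop1 = sir(p_tx, d_tx_uav, p_jam, d_jam_uav, alpha)  # transmitter -> UAV
        hop2 = sir(p_uav, d_uav_rx, p_jam, d_jam_rx, alpha)  # UAV -> receiver
        end_to_end = min(hop1, hop2)  # weakest hop limits the relay link (assumption)
        if best is None or end_to_end > best[1]:
            best = (x, end_to_end)
    return best


near_x, near_sir = best_relay_position(0.0, 100.0, -50.0, 20.0)
far_x, far_sir = best_relay_position(0.0, 100.0, -1000.0, 20.0)
```

Moving the interferer farther away raises both hop SIRs at every candidate position in this toy model, so the best achievable end-to-end SIR increases as well. A grid search is used here only for clarity and says nothing about how the paper derives its optimal positions.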

preprint | 2020 | arXiv

Lifetime Maximization for UAV-assisted Data Gathering Networks in the Presence of Jamming

Deployment of unmanned aerial vehicles (UAVs) has recently attracted significant attention due to a variety of practical use cases, such as surveillance, data gathering, and commodity delivery. Since UAVs are powered by batteries, energy-efficient communication is of paramount importance. In this paper, we investigate the problem of lifetime maximization of a UAV-assisted network in the presence of multiple sources of interference, where the UAVs are deployed to collect data from a set of wireless sensors. We demonstrate that the placement of the UAVs plays a key role in prolonging the lifetime of the network since the required transmission powers of the UAVs are closely related to their locations in space. In the proposed scenario, the UAVs transmit the gathered data to a primary UAV called the leader, which is in charge of forwarding the data to the base station (BS) via a backhaul UAV network. We deploy tools from spectral graph theory to tackle the problem due to its high non-convexity. Simulation results demonstrate that our proposed method can significantly improve the lifetime of the UAV network.

preprint | 2020 | arXiv

Optimal Jammer Placement in UAV-assisted Relay Networks

We consider the relaying application of unmanned aerial vehicles (UAVs), in which UAVs are placed between two transceivers (TRs) to increase the throughput of the system. Instead of studying the placement of UAVs as pursued in existing literature, we focus on investigating the placement of a jammer or a major source of interference on the ground to effectively degrade the performance of the system, which is measured by the maximum achievable data rate of transmission between the TRs. We demonstrate that the optimal placement of the jammer is in general a non-convex optimization problem, for which obtaining the solution directly is intractable. Afterward, using the inherent characteristics of the signal-to-interference ratio (SIR) expressions, we propose a tractable approach to find the optimal position of the jammer. Based on the proposed approach, we investigate the optimal positioning of the jammer in both dual-hop and multi-hop UAV relaying settings. Numerical simulations are provided to evaluate the performance of our proposed method.

preprint | 2020 | arXiv

Precoder Design for mmWave UAV Communications with Physical Layer Security

The integration of unmanned aerial vehicles (UAVs) into the terrestrial cellular networks is envisioned as one key technology for next-generation wireless communications. In this work, we consider the physical layer security of the communications links in the millimeter-wave (mmWave) spectrum which are maintained by UAVs functioning as base stations (BS). In particular, we propose a new precoding strategy which incorporates the channel state information (CSI) of the eavesdropper (Eve) compromising link security. We show that our proposed precoder strategy eliminates any need for artificial noise (AN) transmission in underloaded scenarios (fewer users than number of antennas). In addition, we demonstrate that our nonlinear precoding scheme provides promising secrecy-rate performance even for overloaded scenarios at the expense of transmitting low-power AN.