Source author record

Fan Lai

Fan Lai appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Information Theory; math.IT; Machine Learning; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; Networking and Internet Architecture; eess.SP; Performance

Catalog footprint

What is connected

8 works

8 topics

4 close collaborators

Preprint, 2026, arXiv

EMA: Efficient Model Adaptation for Learning-based Systems

Machine learning (ML) is increasingly applied to optimize system performance in tasks such as resource management and network simulation. Unlike traditional ML tasks (e.g., image classification), networked systems often operate in heterogeneous, long-running, and dynamic environment states, where input conditions (e.g., network loads) and operational objectives can shift over time and across settings. Existing learning-based systems offer little support for adaptation, resulting in costly model training, extensive data collection, degraded system performance, and slow responsiveness. This paper presents EMA, the first model adaptation system supporting learning-based systems to adapt to evolving environments with minimal operational overhead. EMA takes a system-driven, data-centric approach that accommodates diverse system and model designs while addressing two key deployment challenges. First, it reduces expensive model training by introducing state transformers that align the input state of a new environment with previously similar states, allowing models to warm-start adaptation. Second, it addresses the often-overlooked yet costly process of data labeling--collecting ground truth for exploring and training on various system decisions--by prioritizing labeling high-utility data while balancing the tradeoff between training and labeling cost. Evaluations on eight representative learning-based systems show that EMA reduces adaptation costs (e.g., GPU training time) by 14.9-42.4% while improving system performance (e.g., network throughput) by 6.9-31.3%.

Preprint, 2022, arXiv

FedScale: Benchmarking Model and System Performance of Federated Learning at Scale

We present FedScale, a federated learning (FL) benchmarking suite with realistic datasets and a scalable runtime to enable reproducible FL research. FedScale datasets encompass a wide range of critical FL tasks, ranging from image classification and object detection to language modeling and speech recognition. Each dataset comes with a unified evaluation protocol using real-world data splits and evaluation metrics. To reproduce realistic FL behavior, FedScale contains a scalable and extensible runtime. It provides high-level APIs to implement FL algorithms, deploy them at scale across diverse hardware and software backends, and evaluate them at scale, all with minimal developer effort. We combine the two to perform systematic benchmarking experiments and highlight potential opportunities for heterogeneity-aware co-optimizations in FL. FedScale is open-source and actively maintained by contributors from different institutions at http://fedscale.ai. We welcome feedback and contributions from the community.

Preprint, 2022, arXiv

Swan: A Neural Engine for Efficient DNN Training on Smartphone SoCs

The need to train DNN models on end-user devices (e.g., smartphones) is increasing with the need to improve data privacy and reduce communication overheads. Unlike datacenter servers with powerful CPUs and GPUs, modern smartphones consist of a diverse collection of specialized cores following a system-on-a-chip (SoC) architecture that together perform a variety of tasks. We observe that training DNNs on a smartphone SoC without carefully considering its resource constraints can not only lead to suboptimal training performance but also significantly affect user experience. In this paper, we present Swan, a neural engine to optimize DNN training on smartphone SoCs without hurting user experience. Extensive large-scale evaluations show that Swan can improve performance by 1.2-23.3x over the state-of-the-art.

Preprint, 2020, arXiv

A Novel Massive MIMO Beam Domain Channel Model

A novel beam domain channel model (BDCM) for massive multiple-input multiple-output (MIMO) communication systems is proposed in this paper. The proposed model assumes the near-field effect and a spherical wavefront, unlike the conventional BDCM for MIMO, which is based on the far-field effect and plane wavefront assumption. The proposed BDCM is obtained by transforming an existing geometry-based stochastic model (GBSM) from the antenna domain into the beam domain. The space-time non-stationarity is also modeled in the novel BDCM. Moreover, the computational complexity of both models is compared. Based on the numerical analysis, a comparison of cluster-level statistical properties between the proposed BDCM and the existing GBSM shows that there is little difference in the space, time, and frequency correlation properties of the two models. Also, based on the simulation, the coherence bandwidths of the two models in different scenarios are almost the same. The computational complexity of the novel BDCM is much lower than that of the existing GBSM. It can be observed that the proposed BDCM has similar statistical properties to the existing GBSM at the cluster level. The proposed BDCM has lower complexity and is therefore more convenient for information theory and signal processing research than conventional GBSMs.

Preprint, 2016, arXiv

A Linear Network Code Construction for General Integer Connections Based on the Constraint Satisfaction Problem

The problem of finding network codes for general connections is inherently difficult in capacity constrained networks. Resource minimization for general connections with network coding is further complicated. Existing methods for identifying solutions mainly rely on highly restricted classes of network codes, and are almost all centralized. In this paper, we introduce linear network mixing coefficients for code constructions of general connections that generalize random linear network coding (RLNC) for multicast connections. For such code constructions, we pose the problem of cost minimization for the subgraph involved in the coding solution and relate this minimization to a path-based Constraint Satisfaction Problem (CSP) and an edge-based CSP. While CSPs are NP-complete in general, we present a path-based probabilistic distributed algorithm and an edge-based probabilistic distributed algorithm with almost sure convergence in finite time by applying Communication Free Learning (CFL). Our approach allows fairly general coding across flows, guarantees no greater cost than routing, and shows a possible distributed implementation. Numerical results illustrate the performance improvement of our approach over existing methods.
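For readers unfamiliar with the baseline this abstract generalizes, the following is a minimal sketch of plain random linear network coding (RLNC) for a single multicast source. It is not the paper's construction: the field GF(2), the integer packet representation, and the names `encode` and `decode` are assumptions chosen for illustration. Each coded packet is a random XOR combination of the source packets, and a receiver recovers the sources by Gaussian elimination once it holds enough linearly independent combinations.

```python
import random


def encode(packets, rng):
    """Return (coefficients, coded) for one random GF(2) combination of the packets."""
    coeffs = [rng.randint(0, 1) for _ in packets]
    coded = 0
    for c, p in zip(coeffs, packets):
        if c:
            coded ^= p  # addition in GF(2) is XOR
    return coeffs, coded


def decode(coded_packets, k):
    """Gaussian elimination over GF(2).

    Returns the k source packets, or None if the received
    coefficient vectors do not yet have rank k.
    """
    rows = [(list(c), v) for c, v in coded_packets]  # copy, do not mutate input
    for col in range(k):
        # Find a row at or below position `col` with a 1 in this column.
        pivot = next((r for r in range(col, len(rows)) if rows[r][0][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pc, pv = rows[col]
        # Clear this column in every other row.
        for r in range(len(rows)):
            if r != col and rows[r][0][col]:
                rc, rv = rows[r]
                rows[r] = ([a ^ b for a, b in zip(rc, pc)], rv ^ pv)
    return [rows[i][1] for i in range(k)]


# Hypothetical usage: keep receiving coded packets until decoding succeeds.
rng = random.Random(0)
source = [0x41, 0x42, 0x43, 0x44]
received = []
decoded = None
while decoded is None:
    received.append(encode(source, rng))
    decoded = decode(received, len(source))
print(decoded == source)
```

Practical RLNC deployments typically use a larger field such as GF(2^8) so that random combinations are independent with higher probability; GF(2) is used here only to keep the arithmetic to a single XOR.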

Preprint, 2016, arXiv

Enhanced VIP Algorithms for Forwarding, Caching, and Congestion Control in Named Data Networks

Emerging Information-Centric Networking (ICN) architectures seek to optimally utilize both bandwidth and storage for efficient content distribution over the network. The Virtual Interest Packet (VIP) framework has been proposed to enable joint design of forwarding, caching, and congestion control strategies within the Named Data Networking (NDN) architecture. While the existing VIP algorithms exhibit good performance, they are primarily focused on maximizing network throughput and utility, and do not explicitly consider user delay. In this paper, we develop a new class of enhanced algorithms for joint dynamic forwarding, caching and congestion control within the VIP framework. These enhanced VIP algorithms adaptively stabilize the network and maximize network utility, while improving the delay performance by intelligently making use of VIP information beyond one hop. Generalizing Lyapunov drift techniques, we prove the throughput optimality and characterize the utility-delay tradeoff of the enhanced VIP algorithms. Numerical experiments demonstrate the superior performance of the resulting enhanced algorithms for handling Interest Packets and Data Packets within the actual plane, in terms of low network delay and high network utility.

Preprint, 2016, arXiv

Optimal Caching and User Association in Cache-enabled Heterogeneous Wireless Networks

Heterogeneous wireless networks (Hetnets) provide a powerful approach to meet the massive growth in traffic demands, but also impose a significant challenge on backhaul. Caching at small base stations (BSs) and wireless small cell backhaul have been proposed as attractive solutions to address this new challenge. In this paper, we consider the optimal caching and user association to minimize the total time to satisfy the average demands in cache-enabled Hetnets with wireless backhaul. We formulate this problem as a mixed discrete-continuous optimization for given bandwidth and cache resources. First, we characterize the structure of the optimal solution. Specifically, we show that the optimal caching is to store the most popular files at each pico BS, and the optimal user association has a threshold form. We also obtain the closed-form optimal solution in the homogeneous scenario of pico cells. Then, we analyze the impact of bandwidth and cache resources on the minimum total time to satisfy the average demands. Finally, using numerical simulations, we verify the analytical results.
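The "store the most popular files" rule stated in the abstract can be illustrated with a short sketch. The Zipf popularity model, the catalog size, the cache size, and all function names below are assumptions for illustration and do not come from the paper; the sketch only shows the caching rule and the resulting cache hit probability, not the user association threshold or the total-time objective.

```python
def zipf_popularity(num_files, exponent):
    """Request probabilities for files ranked 1..num_files under an assumed Zipf law."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def most_popular_cache(popularity, cache_size):
    """Indices of the cache_size files with the highest request probability."""
    ranked = sorted(range(len(popularity)), key=lambda i: popularity[i], reverse=True)
    return set(ranked[:cache_size])


def hit_probability(popularity, cached):
    """Probability that a random request is for a file held in the cache."""
    return sum(popularity[i] for i in cached)


# Hypothetical parameters: 100 files, Zipf exponent 0.8, room for 10 files.
popularity = zipf_popularity(num_files=100, exponent=0.8)
cached = most_popular_cache(popularity, cache_size=10)
print(sorted(cached))
print(round(hit_probability(popularity, cached), 3))
```

Under any popularity distribution, no other set of the same size yields a higher hit probability than the top-ranked files, which is the intuition behind the caching structure the abstract reports.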

Preprint, 2016, arXiv

Scaled VIP Algorithms for Joint Dynamic Forwarding and Caching in Named Data Networks

Emerging Information-Centric Networking (ICN) architectures seek to optimally utilize both bandwidth and storage for efficient content distribution over the network. The Virtual Interest Packet (VIP) framework has been proposed to enable joint design of forwarding and caching within the Named Data Networking (NDN) architecture. The virtual plane of the VIP framework captures the measured demand for content objects, but does not reflect interest collapse and suppression in the NDN network. We aim to further improve the performance of the existing VIP algorithms by using a modified virtual plane where VIP counts are appropriately scaled to reflect interest suppression effects. We characterize the stability region of the modified virtual plane with VIP scaling, develop a new distributed forwarding and caching algorithm operating on the scaled VIPs, and demonstrate the throughput optimality of the scaled VIP algorithm in the virtual plane. Numerical experiments demonstrate significantly enhanced performance relative to the existing VIP algorithm, as well as a number of other baseline algorithms.
