Researcher profile

Guoming Tang

Guoming Tang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
3 topics
4 close collaborators

Published work

4 published items

preprint | 2026 | arXiv

BandPilot: Towards Performance- and Contention-Aware GPU Dispatching in AI Clusters

Modern multi-tenant AI clusters are increasingly communication-bound, driven by high-volume and multi-round GPU-to-GPU collective communication. Consequently, the GPU dispatcher's choice of a physical GPU subset for each tenant largely determines the job's effective collective bandwidth and thus its performance ceiling. Existing dispatchers predominantly rely on static, topology-aware heuristics that prioritize GPU resource compactness, assuming that minimizing physical distance maximizes communication bandwidth. However, we reveal that this assumption often fails due to complex system-level bottlenecks, such as non-linear NIC saturation and inter-node link heterogeneity. This paper presents BandPilot, a performance- and contention-aware GPU dispatching primitive that optimizes effective collective bandwidth for multi-tenant AI clusters. Specifically, BandPilot learns a data-efficient bandwidth model from sparse NCCL measurements via a hierarchical design. Guided by the model, a fast hybrid search combines an equilibrium-driven constructor with a pruned elimination search to navigate the combinatorial allocation space in real time. To account for multi-tenant interference, BandPilot virtually merges a candidate allocation with co-located cross-host jobs to conservatively estimate shared bottleneck capacity and predict contention-degraded bandwidth. Across a 32-GPU H100 cluster and heterogeneous simulations, BandPilot achieves 92-97% bandwidth efficiency relative to the best-found reference, improving average efficiency by 20-40% over topology-compactness heuristics.

preprint | 2020 | arXiv

Edge Federation: Towards an Integrated Service Provisioning Model

Edge computing is a promising computing paradigm for pushing cloud services to the network edge. To this end, edge infrastructure providers (EIPs) need to bring computation and storage resources to the network edge and allow edge service providers (ESPs) to provision latency-critical services to users. Currently, EIPs prefer to establish a series of private edge-computing environments to serve the specific requirements of their users. This kind of resource provisioning mechanism severely limits the development and spread of edge computing for serving diverse user requirements. To address this, we propose an integrated resource provisioning model, named edge federation, to seamlessly realize resource cooperation and service provisioning across standalone edge computing providers and clouds. To efficiently schedule and utilize the resources across multiple EIPs, we systematically characterize the provisioning process as a large-scale linear programming (LP) problem and transform it into an easily solved form. Accordingly, we design a dynamic algorithm to tackle the varying service demands from users. We conduct extensive experiments over the base station networks in the city of Toronto. Compared with the existing fixed contract model and multihoming model, edge federation can reduce the overall cost of EIPs by 23.3% to 24.5%, and 15.5% to 16.3%, respectively.
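
As a rough illustration of what cost-minimizing provisioning across providers can look like, the sketch below solves a deliberately tiny special case: one service, one time period, and a single demand split across resources with per-unit costs and capacities. This is not the formulation from the paper, which is described only as a large-scale LP. The function name `min_cost_provision`, the resource names, and all numbers are hypothetical. For this particular LP (minimize total cost subject to one demand-equality constraint and per-resource capacity bounds), filling the cheapest resource first is optimal, so no solver is needed.

```python
def min_cost_provision(demand, resources):
    """Split `demand` across resources at minimum total cost.

    `resources` is a list of (name, unit_cost, capacity) tuples.
    Toy model only: one service, one period, no latency constraints.
    """
    remaining = demand
    allocation = {}
    total_cost = 0.0
    # Cheapest-first filling is optimal for this single-constraint LP.
    for name, unit_cost, capacity in sorted(resources, key=lambda r: r[1]):
        take = min(remaining, capacity)
        if take > 0:
            allocation[name] = take
            total_cost += take * unit_cost
            remaining -= take
    if remaining > 0:
        raise ValueError("demand exceeds total capacity")
    return allocation, total_cost


# Hypothetical numbers: an EIP's own edge nodes, a federated partner, and the cloud.
allocation, total_cost = min_cost_provision(
    100,
    [("cloud", 3.0, 1000), ("own_edge", 1.0, 40), ("partner_edge", 2.0, 30)],
)
```

A realistic model would add latency requirements, multiple services, and time-varying demand, at which point a general LP solver becomes the natural tool.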

preprint | 2020 | arXiv

PLVER: Joint Stable Allocation and Content Replication for Edge-assisted Live Video Delivery

Live streaming services have gained extreme popularity in recent years. Due to the spiky traffic patterns of live videos, utilizing distributed edge servers to improve viewers' quality of experience (QoE) has become common practice. Nevertheless, the current client-driven content caching mechanism does not support caching beforehand from the cloud to the edge, resulting in considerable cache misses in live video delivery. State-of-the-art research generally sacrifices the liveness of delivered videos in order to deal with this problem. In this paper, by jointly considering the features of live videos and edge servers, we propose PLVER, a proactive live video push scheme to resolve the cache miss problem in live video delivery. Specifically, PLVER first conducts a one-to-multiple stable allocation between edge clusters and user groups, to balance the load of live traffic over the edge servers. Then it adopts proactive video replication algorithms to speed up the video replication among the edge servers. We conduct extensive trace-driven evaluations, covering 0.3 million Twitch viewers and more than 300 Twitch channels. The results demonstrate that with PLVER, edge servers can carry 28% and 82% more traffic than the auction-based replication method and the caching on requested time method, respectively.
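
The abstract does not say how the one-to-multiple stable allocation is computed. A common way to obtain a stable one-to-many matching is capacity-constrained deferred acceptance (the Gale-Shapley variant used for hospitals and residents), sketched below as an assumption rather than as PLVER's actual algorithm. The function name `stable_allocation`, the preference lists, and the capacities are all hypothetical, and every cluster is assumed to rank every group that might propose to it.

```python
from collections import deque


def stable_allocation(group_prefs, cluster_prefs, capacity):
    """Deferred acceptance: user groups propose, edge clusters accept up to capacity.

    group_prefs[g]   -> clusters in order of preference for group g
    cluster_prefs[c] -> groups in order of preference for cluster c
    capacity[c]      -> maximum number of groups cluster c can serve
    """
    # rank[c][g]: position of g in c's list (lower means more preferred).
    rank = {c: {g: i for i, g in enumerate(prefs)}
            for c, prefs in cluster_prefs.items()}
    next_choice = {g: 0 for g in group_prefs}
    assigned = {c: [] for c in cluster_prefs}
    free = deque(group_prefs)

    while free:
        g = free.popleft()
        if next_choice[g] >= len(group_prefs[g]):
            continue  # g has been rejected everywhere and stays unassigned
        c = group_prefs[g][next_choice[g]]
        next_choice[g] += 1
        assigned[c].append(g)
        if len(assigned[c]) > capacity[c]:
            # Over capacity: the cluster rejects its least-preferred group.
            worst = max(assigned[c], key=lambda x: rank[c][x])
            assigned[c].remove(worst)
            free.append(worst)
    return assigned


# Hypothetical example: three user groups, two clusters with capacities 1 and 2.
result = stable_allocation(
    group_prefs={"g1": ["A", "B"], "g2": ["A", "B"], "g3": ["A", "B"]},
    cluster_prefs={"A": ["g3", "g1", "g2"], "B": ["g1", "g2", "g3"]},
    capacity={"A": 1, "B": 2},
)
```

In this example all groups prefer cluster A, but A can serve only one, so it keeps its top-ranked group and the others fall back to B. The resulting matching is stable in the sense that no group and cluster would both prefer each other over their current assignment.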

preprint | 2020 | arXiv

Quantifying Low-Battery Anxiety of Mobile Users and Its Impacts on Video Watching Behavior

People are increasingly dependent on mobile phones for daily communication, study, and business. This dependence gives rise to low-battery anxiety (LBA). Although LBA was identified some time ago, it has not yet been thoroughly investigated. Without a better understanding of LBA, it is difficult to validate precisely how well energy saving and management techniques alleviate LBA and enhance the Quality of Experience (QoE) of mobile users. To fill this gap, we conduct an investigation of 2000+ mobile users, look into their feelings and reactions towards LBA, and quantify their degree of anxiety as battery power drains. As a case study, we also investigate the impact of LBA on users' video-watching behavior, and with the large set of collected answers we are able to quantify the likelihood that users abandon attractive videos as a function of the phone's battery status. The empirical findings and quantitative models obtained in this work not only disclose the characteristics of LBA among modern mobile users, but also provide valuable references for the design, evaluation, and improvement of QoE-aware mobile applications and services.