Source author record

Baochun Li

Baochun Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Machine Learning; Networking and Internet Architecture; Distributed, Parallel, and Cluster Computing; Computer Vision; Cryptography and Security; Performance; Information Theory; math.IT

Catalog footprint

What is connected

15 works

8 topics

4 close collaborators


Preprint, 2022, arXiv

Optimal Streaming Erasure Codes over the Three-Node Relay Network

This paper investigates low-latency streaming codes for a three-node relay network. The source transmits a sequence of messages (streaming messages) to the destination through the relay between them, where the first-hop channel from the source to the relay and the second-hop channel from the relay to the destination are subject to packet erasures. Every source message must be recovered perfectly at the destination subject to a fixed decoding delay of $T$ time slots. In any sliding window of $T+1$ time slots, we assume no more than $N_1$ and $N_2$ erasures are introduced by the first-hop channel and second-hop channel respectively. Under this channel loss assumption, we fully characterize the maximum achievable rate in terms of $T$, $N_1$ and $N_2$. The achievability is proved by using a symbol-wise decode-forward strategy where the source symbols within the same message are decoded by the relay with different delays. The converse is proved by analyzing the maximum achievable rate for each channel when the erasures in the other channel are consecutive (bursty). In addition, we show that traditional message-wise decode-forward strategies, which require the source symbols within the same message to be decoded by the relay with the same delay, are sub-optimal in general.
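The sliding-window loss assumption in this abstract can be illustrated with a short check. This is a minimal sketch, not code from the paper: the function names and the boolean erasure trace are hypothetical, and the check covers a single hop (it would be applied once with $N_1$ and once with $N_2$).

```python
from typing import Sequence


def max_erasures_in_window(erased: Sequence[bool], window: int) -> int:
    """Largest number of erasures in any `window` consecutive time slots."""
    if window <= 0:
        raise ValueError("window must be positive")
    # Count erasures in the first window, then slide one slot at a time.
    count = sum(erased[:window])
    worst = count
    for i in range(window, len(erased)):
        count += erased[i] - erased[i - window]
        worst = max(worst, count)
    return worst


def satisfies_loss_model(erased: Sequence[bool], T: int, N: int) -> bool:
    """True if every window of T + 1 slots contains at most N erasures."""
    return max_erasures_in_window(erased, T + 1) <= N


# Hypothetical erasure trace for one hop (True = packet erased).
trace = [True, False, False, True, False, False, False, True, True]
print(satisfies_loss_model(trace, T=3, N=2))
print(satisfies_loss_model(trace, T=3, N=1))
```

A trace that passes this check for both hops is an admissible erasure pattern in the sense described above; the rate characterization itself is not reproduced here.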

Preprint, 2022, arXiv

OrphicX: A Causality-Inspired Latent Variable Model for Interpreting Graph Neural Networks

This paper proposes a new eXplanation framework, called OrphicX, for generating causal explanations for any graph neural network (GNN) based on learned latent causal factors. Specifically, we construct a distinct generative model and design an objective function that encourages the generative model to produce causal, compact, and faithful explanations. This is achieved by isolating the causal factors in the latent space of graphs by maximizing the information flow measurements. We theoretically analyze the cause-effect relationships in the proposed causal graph, identify node attributes as confounders between graphs and GNN predictions, and circumvent this confounding effect by leveraging the backdoor adjustment formula. Our framework is compatible with any GNN, and it does not require access to the process by which the target GNN produces its predictions. In addition, it neither relies on the linear-independence assumption of the explained features nor requires prior knowledge of the graph learning tasks. We show a proof-of-concept of OrphicX on canonical classification problems on graph data. In particular, we analyze the explanatory subgraphs obtained from explanations for molecular graphs (i.e., Mutag) and quantitatively evaluate the explanation performance with frequently occurring subgraph patterns. Empirically, we show that OrphicX can effectively identify the causal semantics for generating causal explanations, significantly outperforming its alternatives.

Preprint, 2022, arXiv

Pisces: Efficient Federated Learning via Guided Asynchronous Training

Federated learning (FL) is typically performed in a synchronous parallel manner, where the involvement of a slow client delays a training iteration. Current FL systems employ a participant selection strategy to select fast clients with quality data in each iteration. However, this is not always possible in practice, and the selection strategy often has to navigate an unpleasant trade-off between the speed and the data quality of clients. In this paper, we present Pisces, an asynchronous FL system with intelligent participant selection and model aggregation for accelerated training. To avoid incurring excessive resource cost and stale training computation, Pisces uses a novel scoring mechanism to identify suitable clients to participate in a training iteration. It also adapts the pace of model aggregation to dynamically bound the progress gap between the selected clients and the server, with a provable convergence guarantee in a smooth non-convex setting. We have implemented Pisces in an open-source FL platform called Plato, and evaluated its performance in large-scale experiments with popular vision and language models. Pisces outperforms the state-of-the-art synchronous and asynchronous schemes, accelerating the time-to-accuracy by up to 2.0x and 1.9x, respectively.

Preprint, 2022, arXiv

Towards Private Learning on Decentralized Graphs with Local Differential Privacy

Many real-world networks are inherently decentralized. For example, in social networks, each user maintains a local view of a social graph, such as a list of friends and her profile. It is typical to collect these local views of social graphs and conduct graph learning tasks. However, learning over graphs can raise privacy concerns as these local views often contain sensitive information. In this paper, we seek to ensure private graph learning on a decentralized network graph. Towards this objective, we propose Solitude, a new privacy-preserving learning framework based on graph neural networks (GNNs), with formal privacy guarantees based on edge local differential privacy. The crux of Solitude is a set of new, delicate mechanisms that can calibrate the noise introduced into the decentralized graph collected from the users. The principle behind the calibration is the intrinsic properties shared by many real-world graphs, such as sparsity. Unlike existing work on locally private GNNs, our new framework can simultaneously protect node feature privacy and edge privacy, and can be seamlessly integrated with any GNN with privacy-utility guarantees. Extensive experiments on benchmarking datasets show that Solitude can retain the generalization capability of the learned GNN while preserving the users' data privacy under given privacy budgets.
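To make the notion of edge local differential privacy concrete, the sketch below applies standard randomized response to one user's adjacency row. This is a textbook mechanism used here as an assumption, not the mechanism of the framework described above; the function names are hypothetical. The second helper shows why sparsity matters: on a sparse graph, the perturbed row is expected to contain far more edges than the true one, which is the kind of noise a calibration step would have to correct.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting an adjacency bit truthfully under randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response_row(row, epsilon, rng):
    """Perturb a 0/1 adjacency row locally; each bit is flipped independently."""
    p = keep_probability(epsilon)
    return [bit if rng.random() < p else 1 - bit for bit in row]


def expected_noisy_degree(num_nodes: int, degree: int, epsilon: float) -> float:
    """Expected number of reported edges for a node with `degree` true neighbours."""
    p = keep_probability(epsilon)
    non_edges = num_nodes - 1 - degree
    return degree * p + non_edges * (1.0 - p)


rng = random.Random(0)
row = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]  # hypothetical local view of one user
print(randomized_response_row(row, epsilon=1.0, rng=rng))
print(expected_noisy_degree(num_nodes=1001, degree=10, epsilon=1.0))
```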

Preprint, 2020, arXiv

Low-Latency Network-Adaptive Error Control for Interactive Streaming

We introduce a novel network-adaptive algorithm that is suitable for alleviating network packet losses for low-latency interactive communications between a source and a destination. Our network-adaptive algorithm estimates in real time the best parameters of a recently proposed streaming code that uses forward error correction (FEC) to correct both arbitrary and burst losses, which cause a crackling noise and undesirable jitters, respectively, in audio. In particular, the destination estimates appropriate coding parameters based on its observed packet loss pattern and sends them back to the source for updating the underlying code. In addition, we provide a new explicit construction of practical low-latency streaming codes that achieve the optimal tradeoff between the capability of correcting arbitrary losses and the capability of correcting burst losses. Simulation evaluations based on statistical losses and real-world packet loss traces reveal the following: (i) Our proposed network-adaptive algorithm combined with our optimal streaming codes can achieve significantly higher performance compared to uncoded and non-adaptive FEC schemes over UDP (User Datagram Protocol); (ii) Our explicit streaming codes can significantly outperform traditional MDS (maximum-distance separable) streaming schemes when they are used along with our network-adaptive algorithm.

Preprint, 2020, arXiv

Shoestring: Graph-Based Semi-Supervised Learning with Severely Limited Labeled Data

Graph-based semi-supervised learning has been shown to be one of the most effective approaches for classification tasks from a wide range of domains, such as image classification and text classification, as it can exploit the connectivity patterns between labeled and unlabeled samples to improve learning performance. In this work, we advance this effective learning paradigm towards a scenario where labeled data are severely limited. More specifically, we address the problem of graph-based semi-supervised learning in the presence of severely limited labeled samples, and propose a new framework, called Shoestring, that improves the learning performance through semantic transfer from these very few labeled samples to large numbers of unlabeled samples. In particular, our framework learns a metric space in which classification can be performed by computing the similarity to the centroid embedding of each class. Shoestring is trained in an end-to-end fashion to learn to leverage the semantic knowledge of limited labeled samples as well as their connectivity patterns with large numbers of unlabeled samples simultaneously. By combining Shoestring with graph convolutional networks, label propagation and their recent label-efficient variations (IGCN and GLP), we are able to achieve state-of-the-art node classification performance in the presence of very few labeled samples. In addition, we demonstrate the effectiveness of our framework on image classification tasks in the few-shot learning regime, with significant gains on miniImageNet (2.57% to 3.59%) and tieredImageNet (1.05% to 2.70%).
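Classification by similarity to a class centroid can be sketched in a few lines. This is only an illustration of the general idea, under assumptions not stated in the abstract: the embeddings are given rather than learned, cosine similarity is chosen as the similarity measure, and all names and data are hypothetical.

```python
import math
from collections import defaultdict


def class_centroids(embeddings, labels):
    """Average the embeddings of the labeled samples of each class."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def cosine(a, b):
    """Cosine similarity; defined as 0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(query, centroids):
    """Assign the class whose centroid is most similar to the query embedding."""
    return max(centroids, key=lambda label: cosine(query, centroids[label]))


# Hypothetical 2-D embeddings of four labeled samples.
labeled = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centroids = class_centroids(labeled, ["a", "a", "b", "b"])
print(classify([0.8, 0.2], centroids))
print(classify([0.2, 0.7], centroids))
```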

Preprint, 2020, arXiv

Towards Assessment of Randomized Smoothing Mechanisms for Certifying Adversarial Robustness

As a certified defensive technique, randomized smoothing has received considerable attention due to its scalability to large datasets and neural networks. However, several important questions remain unanswered, such as (i) whether the Gaussian mechanism is an appropriate option for certifying $\ell_2$-norm robustness, and (ii) whether there is an appropriate randomized (smoothing) mechanism to certify $\ell_\infty$-norm robustness. To shed light on these questions, we argue that the main difficulty is how to assess the appropriateness of each randomized mechanism. In this paper, we propose a generic framework that connects the existing frameworks of Lecuyer et al. (2018) and Li et al. (2019) to assess randomized mechanisms. Under our framework, for a randomized mechanism that can certify a certain extent of robustness, we define the magnitude of its required additive noise as the metric for assessing its appropriateness. We also prove lower bounds on this metric for the $\ell_2$-norm and $\ell_\infty$-norm cases as the criteria for assessment. Based on our framework, we assess the Gaussian and Exponential mechanisms by comparing the magnitude of additive noise required by these mechanisms and the lower bounds (criteria). We first conclude that the Gaussian mechanism is indeed an appropriate option to certify $\ell_2$-norm robustness. Surprisingly, we show that the Gaussian mechanism is also an appropriate option for certifying $\ell_\infty$-norm robustness, instead of the Exponential mechanism. Finally, we generalize our framework to $\ell_p$-norm for any $p\geq2$. Our theoretical findings are verified by evaluations on CIFAR10 and ImageNet.
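For readers unfamiliar with randomized smoothing, the following minimal sketch shows the prediction step with the Gaussian mechanism: the smoothed classifier returns the majority vote of a base classifier over noisy copies of the input. The base classifier, parameter values, and function names are hypothetical, and the sketch does not compute a certified radius or the noise-magnitude metric discussed in the abstract.

```python
import random
from collections import Counter


def smoothed_predict(classifier, x, sigma, num_samples, rng):
    """Majority vote of `classifier` over Gaussian-perturbed copies of `x`."""
    votes = Counter()
    for _ in range(num_samples):
        # Additive isotropic Gaussian noise with standard deviation sigma.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[classifier(noisy)] += 1
    return votes.most_common(1)[0][0]


def toy_classifier(x):
    """Hypothetical base classifier: a fixed linear decision rule."""
    return 1 if sum(x) > 0 else 0


rng = random.Random(0)
print(smoothed_predict(toy_classifier, [2.0, 2.0], sigma=0.25, num_samples=200, rng=rng))
print(smoothed_predict(toy_classifier, [-2.0, -2.0], sigma=0.25, num_samples=200, rng=rng))
```

A larger `sigma` generally allows a larger region to be certified but degrades the base classifier's accuracy on noisy inputs, which is why the magnitude of the required noise is a natural quantity to compare across mechanisms.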

Preprint, 2020, arXiv

Towards Understanding the Adversarial Vulnerability of Skeleton-based Action Recognition

Skeleton-based action recognition has attracted increasing attention due to its strong adaptability to dynamic circumstances and potential for broad applications such as autonomous and anonymous surveillance. With the help of deep learning techniques, it has also witnessed substantial progress and currently achieves around 90% accuracy in benign environments. On the other hand, research on the vulnerability of skeleton-based action recognition under different adversarial settings remains scant, which may raise security concerns about deploying such techniques into real-world systems. However, filling this research gap is challenging due to the unique physical constraints of skeletons and human actions. In this paper, we attempt to conduct a thorough study towards understanding the adversarial vulnerability of skeleton-based action recognition. We first formulate the generation of adversarial skeleton actions as a constrained optimization problem by representing or approximating the physiological and physical constraints with mathematical formulations. Since the primal optimization problem with equality constraints is intractable, we propose to solve it by optimizing its unconstrained dual problem using ADMM. We then specify an efficient plug-in defense, inspired by recent theories and empirical observations, against the adversarial skeleton actions. Extensive evaluations demonstrate the effectiveness of the attack and defense methods under different settings.

Preprint, 2016, arXiv

An Alternating Direction Method Approach to Cloud Traffic Management

In this paper, we introduce a unified framework for studying various cloud traffic management problems, ranging from geographical load balancing to backbone traffic engineering. We first abstract these real-world problems as a multi-facility resource allocation problem, and then present two distributed optimization algorithms by exploiting the special structure of the problem. Our algorithms are inspired by the Alternating Direction Method of Multipliers (ADMM) and enjoy a number of unique features. Compared to dual decomposition, they converge with non-strictly convex objective functions; compared to other ADMM-type algorithms, they not only achieve faster convergence under weaker assumptions, but also have lower computational complexity and lower message-passing overhead. The simulation results not only confirm these desirable features of our algorithms, but also highlight several additional advantages, such as scalability and fault-tolerance.

Preprint, 2015, arXiv

Dispersing Instant Social Video Service Across Multiple Clouds

Instant social video sharing, which combines online social networks with user-generated short-video streaming services, has become popular on today's Internet. Cloud-based hosting of such instant social video content has become the norm for serving a growing number of users with user-generated content. A fundamental problem of cloud-based social video sharing is that users are located around the globe and cannot all be served with good service quality by a single cloud provider. In this paper, we investigate the feasibility of dispersing instant social video content to multiple cloud providers. The challenge is that inter-cloud social propagation is indispensable with such multi-cloud social video hosting, yet such inter-cloud traffic incurs substantial operational cost. We analyze and formulate the multi-cloud hosting of an instant social video system as an optimization problem. We conduct large-scale measurement studies to show the characteristics of instant social video deployment, and demonstrate the trade-off between satisfying users with their ideal cloud providers and reducing the inter-cloud data propagation. Our measurement insights into social propagation allow us to propose a heuristic algorithm with acceptable complexity to solve the optimization problem, by partitioning a propagation-weighted social graph in two phases: a preference-aware initial cloud provider selection and a propagation-aware re-hosting. Our simulation experiments driven by real-world social network traces show the superiority of our design.

Preprint, 2013, arXiv

Dominant Resource Fairness in Cloud Computing Systems with Heterogeneous Servers

We study the multi-resource allocation problem in cloud computing systems where the resource pool is constructed from a large number of heterogeneous servers, representing different points in the configuration space of resources such as processing, memory, and storage. We design a multi-resource allocation mechanism, called DRFH, that generalizes the notion of Dominant Resource Fairness (DRF) from a single server to multiple heterogeneous servers. DRFH provides a number of highly desirable properties. With DRFH, no user prefers the allocation of another user; no one can improve its allocation without decreasing that of the others; and more importantly, no user has an incentive to lie about its resource demand. As a direct application, we design a simple heuristic that implements DRFH in real-world systems. Large-scale simulations driven by Google cluster traces show that DRFH significantly outperforms the traditional slot-based scheduler, leading to much higher resource utilization with substantially shorter job completion times.
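As background for the abstract above, the sketch below implements the classic single-server form of Dominant Resource Fairness by progressive filling: repeatedly grant one task to the user with the smallest dominant share. It is not the DRFH mechanism for heterogeneous servers; the function names and the example capacities and demands are assumptions chosen for illustration.

```python
def dominant_share(allocation, capacity):
    """Largest fraction of any single resource held by a user."""
    return max(a / c for a, c in zip(allocation, capacity))


def drf_single_server(capacity, demands):
    """Progressive filling on one resource pool.

    capacity: total amount of each resource.
    demands: per-user resource vector required by one task.
    Returns the number of tasks granted to each user.
    """
    for user, demand in demands.items():
        if not any(d > 0 for d in demand):
            raise ValueError(f"user {user!r} must demand a positive amount of some resource")
    used = [0.0] * len(capacity)
    alloc = {user: [0.0] * len(capacity) for user in demands}
    tasks = {user: 0 for user in demands}
    while True:
        # Users whose next task still fits in the remaining capacity.
        fitting = [
            user for user, demand in demands.items()
            if all(u + d <= c for u, d, c in zip(used, demand, capacity))
        ]
        if not fitting:
            return tasks
        user = min(fitting, key=lambda name: dominant_share(alloc[name], capacity))
        for i, d in enumerate(demands[user]):
            used[i] += d
            alloc[user][i] += d
        tasks[user] += 1


# Hypothetical pool: 9 CPUs and 18 GB of memory; per-task demands are (CPUs, GB).
print(drf_single_server([9, 18], {"A": [1, 4], "B": [3, 1]}))
```

In this example both users end with the same dominant share of two thirds (memory for A, CPU for B), which is the equalization that Dominant Resource Fairness aims for.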

Preprint, 2013, arXiv

Reducing Electricity Demand Charge for Data Centers with Partial Execution

Data centers consume a large amount of energy and incur substantial electricity cost. In this paper, we study the familiar problem of reducing data center energy cost with two new perspectives. First, we find, through an empirical study of contracts from electric utilities powering Google data centers, that demand charge per kW for the maximum power used is a major component of the total cost. Second, many services such as Web search tolerate partial execution of the requests because the response quality is a concave function of processing time. Data from the Microsoft Bing search engine confirm this observation. We propose a simple idea of using partial execution to reduce the peak power demand and energy cost of data centers. We systematically study the problem of scheduling partial execution with stringent SLAs on response quality. For a single data center, we derive an optimal algorithm to solve the workload scheduling problem. In the case of multiple geo-distributed data centers, the demand of each data center is controlled by the request routing algorithm, which makes the problem much more involved. We decouple the two aspects, and develop a distributed optimization algorithm to solve the large-scale request routing problem. Trace-driven simulations show that partial execution reduces cost by 3% to 10.5% for one data center, and by 15.5% for geo-distributed data centers together with request routing.

Preprint, 2013, arXiv

RepFlow: Minimizing Flow Completion Times with Replicated Flows in Data Centers

Short TCP flows that are critical for many interactive applications in data centers are plagued by large flows and head-of-line blocking in switches. Hash-based load balancing schemes such as ECMP aggravate the matter and result in long-tailed flow completion times (FCT). Previous work on reducing FCT usually requires custom switch hardware and/or protocol changes. We propose RepFlow, a simple yet practically effective approach that replicates each short flow to reduce the completion times, without any change to switches or host kernels. With ECMP, the original and replicated flows traverse distinct paths with different congestion levels, thereby reducing the probability of having long queueing delay. We develop a simple analytical model to demonstrate the potential improvement of RepFlow. Extensive NS-3 simulations and Mininet implementation show that RepFlow provides a 50% to 70% speedup in both mean and 99th-percentile FCT for all loads, and offers near-optimal FCT when used with DCTCP.
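The intuition behind replication can be checked with a toy Monte Carlo experiment. This is a sketch under strong assumptions that are not taken from the paper: each path is congested independently with probability `p_congested`, a flow takes `fast` time units on an uncongested path and `slow` on a congested one, and a replicated flow finishes when its faster copy finishes. Under these assumptions a replicated flow is slow only when both paths are congested, which happens with probability `p_congested ** 2`.

```python
import random


def simulate_mean_fct(num_flows, p_congested, fast, slow, replicate, rng):
    """Mean completion time of short flows, with or without one replica per flow."""
    total = 0.0
    copies = 2 if replicate else 1
    for _ in range(num_flows):
        # Each copy independently lands on a congested or uncongested path.
        times = [slow if rng.random() < p_congested else fast for _ in range(copies)]
        total += min(times)  # the flow completes when its fastest copy completes
    return total / num_flows


rng = random.Random(42)
single = simulate_mean_fct(10_000, 0.3, fast=1.0, slow=10.0, replicate=False, rng=rng)
replicated = simulate_mean_fct(10_000, 0.3, fast=1.0, slow=10.0, replicate=True, rng=rng)
print(single, replicated)
```

The toy model ignores the extra load that replicas place on the network, so it only illustrates the path-diversity effect, not the reported speedups.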

Preprint, 2013, arXiv

Spot Transit: Cheaper Internet Transit for Elastic Traffic

We advocate creating a spot Internet transit market, where transit is sold using the under-utilized backbone capacity at a lower price. The providers can improve profit by capitalizing on the perishable capacity, and customers can buy transit on demand without a minimum commitment level for elastic traffic, and as a result improve their surplus (i.e., utility gains). We conduct a systematic study of the economic benefits of spot transit both theoretically and empirically. We propose a simple analytical framework with a general demand function, and solve the pricing problem of maximizing the expected profit, taking into account the revenue loss of regular transit when spot transit traffic hikes. We rigorously prove the price advantage of spot transit, as well as profit and surplus improvements for tier-1 ISPs and customers, respectively. Using real-world price data and traffic statistics of 6 IXPs with more than 1000 ISPs, we quantitatively evaluate spot transit and show that significant financial benefits can be achieved in both absolute and relative terms, robust to parameter values.

Preprint, 2013, arXiv

To Reserve or Not to Reserve: Optimal Online Multi-Instance Acquisition in IaaS Clouds

Infrastructure-as-a-Service (IaaS) clouds offer diverse instance purchasing options. A user can either run instances on demand and pay only for what it uses, or it can prepay to reserve instances for a long period, during which it is entitled to a usage discount. An important problem facing a user is how these two instance options can be dynamically combined to serve time-varying demands at minimum cost. Existing strategies in the literature, however, require either exact knowledge or the distribution of demands in the long-term future, which significantly limits their use in practice. Unlike existing works, we propose two practical online algorithms, one deterministic and another randomized, that dynamically combine the two instance options online without any knowledge of the future. We show that the proposed deterministic (resp., randomized) algorithm incurs no more than $2-\alpha$ (resp., $e/(e-1+\alpha)$) times the minimum cost obtained by an optimal offline algorithm that knows the exact future a priori, where $\alpha$ is the usage discount granted after reservation. Our online algorithms achieve the best possible competitive ratios in both the deterministic and randomized cases, and can be easily extended to cases when short-term predictions are reliable. Simulations driven by a large volume of real-world traces show that significant cost savings can be achieved with prevalent IaaS prices.
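The two competitive ratios stated in this abstract are easy to tabulate. The sketch below only evaluates the formulas as written; the function names are hypothetical, and the restriction of the discount to the interval from 0 to 1 is an assumption made here for the example.

```python
import math


def deterministic_ratio(alpha: float) -> float:
    """Competitive ratio 2 - alpha stated for the deterministic algorithm."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return 2.0 - alpha


def randomized_ratio(alpha: float) -> float:
    """Competitive ratio e / (e - 1 + alpha) stated for the randomized algorithm."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return math.e / (math.e - 1.0 + alpha)


for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"alpha={alpha:.2f}  deterministic={deterministic_ratio(alpha):.3f}  "
          f"randomized={randomized_ratio(alpha):.3f}")
```

At `alpha = 0` the expressions evaluate to 2 and e/(e-1), the familiar ratios of the classic ski-rental problem, and at `alpha = 1` both evaluate to 1.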
