Researcher profile

John C. S. Lui

John C. S. Lui contributes to research discovery and scholarly infrastructure.


Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
9 works
0 followers
6 topics
4 close collaborators

Published work

9 published items

Preprint, 2022, arXiv

Decentralized Stochastic Proximal Gradient Descent with Variance Reduction over Time-varying Networks

In decentralized learning, a network of nodes cooperates to minimize an overall objective function that is usually the finite sum of their local objectives and that incorporates a non-smooth regularization term for better generalization. The decentralized stochastic proximal gradient (DSPG) method is commonly used to train this type of learning model, but its convergence rate is slowed by the variance of the stochastic gradients. In this paper, we propose a novel algorithm, namely DPSVRG, to accelerate decentralized training by leveraging the variance reduction technique. The basic idea is to introduce an estimator in each node, which tracks the local full gradient periodically, to correct the stochastic gradient at each iteration. By transforming our decentralized algorithm into a centralized inexact proximal gradient algorithm with variance reduction, and controlling the bounds of error sequences, we prove that DPSVRG converges at the rate of $O(1/T)$ for general convex objectives plus a non-smooth term, with $T$ as the number of iterations, while DSPG converges at the rate of $O(1/\sqrt{T})$. Our experiments on different applications, network topologies and learning models demonstrate that DPSVRG converges much faster than DSPG, and that the loss function of DPSVRG decreases smoothly over the training epochs.
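The variance-reduction idea in this abstract (a periodically refreshed full-gradient estimator that corrects each stochastic gradient, followed by a proximal step) can be illustrated with a minimal single-node sketch. This is plain proximal SVRG, not DPSVRG itself: the decentralized communication over time-varying networks is omitted, and the least-squares loss, the L1 regularizer, the step size, and all function names are assumptions chosen for illustration.

```python
import random

def soft_threshold(v, tau):
    """Proximal operator of tau * |.| for a scalar (assumed L1 regularizer)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0

def dot(a, x):
    return sum(aj * xj for aj, xj in zip(a, x))

def grad_i(a, b_i, x):
    """Gradient of one sample's loss 0.5 * (a.x - b_i)^2 (assumed loss)."""
    residual = dot(a, x) - b_i
    return [residual * aj for aj in a]

def objective(A, b, x, lam):
    """Average squared loss plus the L1 penalty."""
    n = len(A)
    loss = sum((dot(a, x) - bi) ** 2 for a, bi in zip(A, b)) / (2 * n)
    return loss + lam * sum(abs(xj) for xj in x)

def prox_svrg(A, b, lam=0.01, step=0.02, epochs=30, seed=0):
    """Single-node proximal SVRG; a sketch, not the paper's DPSVRG."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    for _ in range(epochs):
        snapshot = list(x)
        # Full gradient at the snapshot, refreshed once per epoch.
        full = [0.0] * d
        for a, bi in zip(A, b):
            g = grad_i(a, bi, snapshot)
            full = [f + gj / n for f, gj in zip(full, g)]
        for _ in range(n):
            i = rng.randrange(n)
            g_x = grad_i(A[i], b[i], x)
            g_s = grad_i(A[i], b[i], snapshot)
            # Stochastic gradient corrected by the snapshot estimator.
            v = [gx - gs + f for gx, gs, f in zip(g_x, g_s, full)]
            # Proximal step handles the non-smooth term.
            x = [soft_threshold(xj - step * vj, step * lam)
                 for xj, vj in zip(x, v)]
    return x
```

Calling `prox_svrg(A, b)` on a small synthetic regression problem should drive `objective` well below its value at the zero vector; the step size would need retuning for data with larger feature norms.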

Preprint, 2022, arXiv

Federated Online Clustering of Bandits

Contextual multi-armed bandit (MAB) is an important sequential decision-making problem in recommendation systems. A line of work, called the clustering of bandits (CLUB), utilizes the collaborative effect over users and dramatically improves the recommendation quality. Owing to the increasing application scale and public concerns about privacy, there is a growing demand to keep user data decentralized and push bandit learning to the local server side. Existing CLUB algorithms, however, are designed under the centralized setting where data are available at a central server. We focus on studying the federated online clustering of bandits (FCLUB) problem, which aims to minimize the total regret while satisfying privacy and communication considerations. We design a new phase-based scheme for cluster detection and a novel asynchronous communication protocol for cooperative bandit learning for this problem. Because previous differential privacy (DP) definitions are not well suited to protecting users' privacy in this setting, we propose a new DP notion that acts on the user cluster level. We provide rigorous proofs to show that our algorithm simultaneously achieves (clustered) DP, sublinear communication complexity and sublinear regret. Finally, experimental evaluations show that our algorithm outperforms benchmark algorithms.

Preprint, 2022, arXiv

LPC-AD: Fast and Accurate Multivariate Time Series Anomaly Detection via Latent Predictive Coding

This paper proposes LPC-AD, a fast and accurate multivariate time series (MTS) anomaly detection method. LPC-AD is motivated by the ever-increasing need for fast and accurate MTS anomaly detection methods to support fast troubleshooting in cloud computing, micro-service systems, etc. LPC-AD is fast in the sense that it reduces the training time by as much as 38.2% compared to the state-of-the-art (SOTA) deep learning methods that focus on training speed. LPC-AD is accurate in the sense that it improves the detection accuracy by as much as 18.9% compared to sophisticated SOTA deep learning methods that focus on enhancing detection accuracy. Methodologically, LPC-AD contributes a generic architecture, LPC-Reconstruct, for attaining different trade-offs between training speed and detection accuracy. More specifically, LPC-Reconstruct is built on ideas from autoencoders for reducing redundancy in time series, latent predictive coding for capturing temporal dependence in MTS, and randomized perturbation for avoiding overfitting of anomalous dependence in the training data. We present simple instantiations of LPC-Reconstruct to attain fast training speed, where we propose a simple randomized perturbation method. The superior performance of LPC-AD over SOTA methods is validated by extensive experiments on four large real-world datasets. Experimental results also show the necessity and benefit of each component of the LPC-Reconstruct architecture and that LPC-AD is robust to hyperparameters.

Preprint, 2022, arXiv

Multi-Player Multi-Armed Bandits with Finite Shareable Resources Arms: Learning Algorithms & Applications

Multi-player multi-armed bandits (MMAB) study how decentralized players cooperatively play the same multi-armed bandit so as to maximize their total cumulative rewards. Existing MMAB models mostly assume that when more than one player pulls the same arm, they either have a collision and obtain zero rewards, or have no collision and gain independent rewards, both of which are usually too restrictive in practical scenarios. In this paper, we propose an MMAB with shareable resources as an extension to the collision and non-collision settings. Each shareable arm has finite shareable resources and a "per-load" reward random variable, both of which are unknown to players. The reward from a shareable arm is equal to the "per-load" reward multiplied by the minimum of the number of players pulling the arm and the arm's maximal shareable resources. We consider two types of feedback: sharing demand information (SDI) and sharing demand awareness (SDA), each of which provides different signals of resource sharing. We design the DPE-SDI and SIC-SDA algorithms to address the shareable arm problem under these two cases of feedback respectively, and prove that both algorithms have logarithmic regret that is tight in the number of rounds. We conduct simulations to validate both algorithms' performance and show their utility in wireless networking and edge computing.
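The reward rule stated in this abstract (per-load reward times the smaller of the number of players and the arm's shareable resources) can be sketched as follows. Only the environment side is shown; the DPE-SDI and SIC-SDA learning algorithms are not reproduced. The function names, the Gaussian noise model, and its standard deviation are assumptions made for illustration.

```python
import random

def shareable_arm_reward(per_load_reward, num_players, capacity):
    """Reward of one arm: per-load reward times min(players, capacity)."""
    return per_load_reward * min(num_players, capacity)

def play_round(assignment, capacities, mean_rewards, rng, noise_std=0.1):
    """Total reward of one round (hypothetical helper).

    assignment   -- list with the arm index chosen by each player
    capacities   -- shareable resources of each arm
    mean_rewards -- mean per-load reward of each arm
    The Gaussian per-load noise is an assumed model for this sketch.
    """
    loads = {}
    for arm in assignment:
        loads[arm] = loads.get(arm, 0) + 1
    total = 0.0
    for arm, load in loads.items():
        per_load = rng.gauss(mean_rewards[arm], noise_std)
        total += shareable_arm_reward(per_load, load, capacities[arm])
    return total
```

For instance, with a per-load reward of 0.5 and a capacity of 2, three players on the same arm collect 1.0 in total, the same as two players would, which is why piling extra players onto a saturated arm is wasteful.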

Preprint, 2022, arXiv

Multiple-Play Stochastic Bandits with Shareable Finite-Capacity Arms

We generalize the multiple-play multi-armed bandits (MP-MAB) problem with a shareable arm setting, in which several plays can share the same arm. Furthermore, each shareable arm has a finite reward capacity and a "per-load" reward distribution, both of which are unknown to the learner. The reward from a shareable arm is load-dependent: it is the "per-load" reward multiplied by either the number of plays pulling the arm, or the arm's reward capacity when the number of plays exceeds the capacity limit. When the "per-load" reward follows a Gaussian distribution, we prove a sample complexity lower bound for learning the capacity from load-dependent rewards and also a regret lower bound for this new MP-MAB problem. We devise a capacity estimator whose sample complexity upper bound matches the lower bound in terms of reward means and capacities. We also propose an online learning algorithm to address the problem and prove its regret upper bound. The first term of this regret upper bound is the same as that of the regret lower bound, and its second and third terms also evidently correspond to those of the lower bound. Extensive experiments validate our algorithm's performance and also its gain in 5G & 4G base station selection.

Preprint, 2022, arXiv

Online Competitive Influence Maximization

Online influence maximization has attracted much attention as a way to maximize influence spread through a social network while learning the values of unknown network parameters. Most previous works focus on single-item diffusion. In this paper, we introduce a new Online Competitive Influence Maximization (OCIM) problem, where two competing items (e.g., products, news stories) propagate in the same network and influence probabilities on edges are unknown. We adopt a combinatorial multi-armed bandit (CMAB) framework for OCIM, but unlike the non-competitive setting, the important monotonicity property (influence spread increases when influence probabilities on edges increase) no longer holds due to the competitive nature of propagation, which brings a significant new challenge to the problem. We provide a nontrivial proof showing that the Triggering Probability Modulated (TPM) condition for CMAB still holds in OCIM, which is instrumental for our proposed algorithms OCIM-TS and OCIM-OFU to achieve sublinear Bayesian and frequentist regret, respectively. We also design an OCIM-ETC algorithm that requires less feedback and easier offline computation, at the expense of a worse frequentist regret bound. Experimental evaluations demonstrate the effectiveness of our algorithms.

Preprint, 2020, arXiv

Conversational Contextual Bandit: Algorithm and Application

Contextual bandit algorithms provide principled online learning solutions to balance the exploitation-exploration trade-off in various applications such as recommender systems. However, the learning speed of traditional contextual bandit algorithms is often slow due to the need for extensive exploration. This poses a critical issue in applications like recommender systems, since users may need to provide feedback on many items that do not interest them. To accelerate the learning speed, we generalize contextual bandit to conversational contextual bandit. Conversational contextual bandit leverages not only behavioral feedback on arms (e.g., articles in news recommendation), but also occasional conversational feedback on key-terms from the user. Here, a key-term can relate to a subset of arms, for example, a category of articles in news recommendation. We then design the Conversational UCB algorithm (ConUCB) to address two challenges in conversational contextual bandit: (1) which key-terms to select to conduct conversation, and (2) how to leverage conversational feedback to accelerate the speed of bandit learning. We theoretically prove that ConUCB can achieve a smaller regret upper bound than the traditional contextual bandit algorithm LinUCB, which implies a faster learning speed. Experiments on synthetic data, as well as real datasets from Yelp and Toutiao, demonstrate the efficacy of the ConUCB algorithm.
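As background for the comparison in this abstract, the LinUCB baseline can be sketched in a few lines. This is only the baseline, in a shared-parameter variant with one ridge-regression model for all arms; ConUCB's key-term selection and conversational feedback are not shown. The class name, the `alpha` and `ridge` defaults, and the use of a Sherman-Morrison update for the inverse are choices made for this sketch.

```python
import math

class LinUCB:
    """Shared-parameter LinUCB sketch (baseline only, not ConUCB)."""

    def __init__(self, dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha  # exploration weight (assumed default)
        # Inverse of the ridge-regularized design matrix, starts at I / ridge.
        self.A_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * feature vector

    @staticmethod
    def _matvec(M, v):
        return [sum(m * x for m, x in zip(row, v)) for row in M]

    def theta(self):
        """Ridge-regression estimate of the reward parameter."""
        return self._matvec(self.A_inv, self.b)

    def ucb(self, x):
        """Estimated reward plus a confidence width for feature vector x."""
        mean = sum(t * xi for t, xi in zip(self.theta(), x))
        Ax = self._matvec(self.A_inv, x)
        width = math.sqrt(sum(xi * ai for xi, ai in zip(x, Ax)))
        return mean + self.alpha * width

    def select(self, arm_features):
        """Index of the arm with the highest upper confidence bound."""
        return max(range(len(arm_features)),
                   key=lambda i: self.ucb(arm_features[i]))

    def update(self, x, reward):
        """Rank-one (Sherman-Morrison) update after observing a reward."""
        Ax = self._matvec(self.A_inv, x)
        denom = 1.0 + sum(xi * ai for xi, ai in zip(x, Ax))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]
```

In each round, a caller would pass the current arm feature vectors to `select`, show the chosen arm, and feed the observed reward back through `update`. The confidence width shrinks only for directions that have been explored, which is the slow-exploration behavior the abstract aims to speed up with conversational feedback.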

Preprint, 2020, arXiv

Online VNF Chaining and Predictive Scheduling: Optimality and Trade-offs

For NFV systems, the key design space includes the function chaining for network requests and resource scheduling for servers. The problem is challenging since NFV systems usually must meet multiple (often conflicting) design objectives and require computationally efficient real-time decision making with limited information. Furthermore, the benefits of predictive scheduling to NFV systems still remain unexplored. In this paper, we propose POSCARS, an efficient predictive and online service chaining and resource scheduling scheme that achieves tunable trade-offs among various system metrics with a queue stability guarantee. Through a careful choice of granularity in system modeling, we acquire a better understanding of the trade-offs in our design space. By a non-trivial transformation, we decouple the complex optimization problem into a series of online sub-problems to achieve optimality with only limited information. By employing randomized load balancing techniques, we propose three variants of POSCARS to reduce the overheads of decision making. Theoretical analysis and simulations show that POSCARS and its variants require only a mild amount of future information to achieve near-optimal system cost with an ultra-low request response time.

Preprint, 2020, arXiv

Quantifying Deployability & Evolvability of Future Internet Architectures via Economic Models

Emerging new applications demand that the current Internet provide new functionalities. Although many future Internet architectures and protocols have been proposed to fulfill such needs, ISPs have been reluctant to deploy many of these architectures. We believe technical issues are not the main reasons, as many of these new proposals are technically sound. In this paper, we take an economic perspective and seek to answer: Why have most new Internet architectures failed to be deployed? How can the deployability of a new architecture be enhanced? We develop a game-theoretic model to characterize the outcome of an architecture's deployment through the equilibrium of ISPs' decisions. This model enables us to: (1) analyze several key factors of the deployability of a new architecture such as the number of critical ISPs and the change of routing path; (2) explain the deployment outcomes of some previously proposed architectures/protocols such as IPv6, DiffServ, CDN, etc., and shed light on the "Internet flattening phenomenon"; (3) predict the deployability of a new architecture such as NDN, and compare its deployability with competing architectures. Our study suggests that the difficulty of deploying a new Internet architecture comes from the "coordination" of distributed ISPs. Finally, we design a coordination mechanism to enhance the deployability of new architectures.