Source author record

Bai Liu

Bai Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computation and Language, Machine Learning, math.OC, Networking and Internet Architecture, Performance

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

Preprint · 2022 · arXiv

Easy and Efficient Transformer : Scalable Inference Solution For large NLP model

Recently, large-scale transformer-based models have proven effective on various tasks across many domains. Nevertheless, applying them in industrial production requires tedious and heavy work to reduce inference costs. To fill this gap, we introduce a scalable inference solution: Easy and Efficient Transformer (EET), which includes a series of transformer inference optimizations at the algorithm and implementation levels. First, we design highly optimized kernels for long inputs and large hidden sizes. Second, we propose a flexible CUDA memory manager to reduce the memory footprint when deploying a large model. Compared with the state-of-the-art transformer inference library (Faster Transformer v4.0), EET can achieve an average of 1.40-4.20x speedup on the transformer decoder layer with an A100 GPU.

Preprint · 2022 · arXiv

RL-QN: A Reinforcement Learning Framework for Optimal Control of Queueing Systems

With the rapid advance of information technology, network systems have become increasingly complex and hence the underlying system dynamics are often unknown or difficult to characterize. Finding a good network control policy is of significant importance to achieve desirable network performance (e.g., high throughput or low delay). In this work, we consider using model-based reinforcement learning (RL) to learn the optimal control policy for queueing networks so that the average job delay (or equivalently the average queue backlog) is minimized. Traditional approaches in RL, however, cannot handle the unbounded state spaces of the network control problem. To overcome this difficulty, we propose a new algorithm, called Reinforcement Learning for Queueing Networks (RL-QN), which applies model-based RL methods over a finite subset of the state space, while applying a known stabilizing policy for the rest of the states. We establish that the average queue backlog under RL-QN with an appropriately constructed subset can be arbitrarily close to the optimal result. We evaluate RL-QN in dynamic server allocation, routing and switching problems. Simulation results show that RL-QN minimizes the average queue backlog effectively.
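The split described in this abstract, a learned policy on a finite subset of states and a known stabilizing policy everywhere else, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the box-shaped subset defined by `threshold`, the lookup table `learned`, and the use of longest-queue-first as a stand-in for whatever stabilizing policy is known for the system at hand.

```python
from typing import Callable, Dict, Tuple

State = Tuple[int, ...]  # one queue length per queue


def in_truncated_set(state: State, threshold: int) -> bool:
    """Hypothetical finite subset: every queue length is at most `threshold`."""
    return all(q <= threshold for q in state)


def longest_queue_first(state: State) -> int:
    """Stand-in fallback policy: serve the longest queue (ties go to the lowest index)."""
    return max(range(len(state)), key=lambda i: (state[i], -i))


def make_hybrid_policy(
    learned: Dict[State, int],
    threshold: int,
    fallback: Callable[[State], int] = longest_queue_first,
) -> Callable[[State], int]:
    """Use the learned action inside the finite subset, the fallback policy elsewhere."""

    def policy(state: State) -> int:
        if in_truncated_set(state, threshold) and state in learned:
            return learned[state]
        # Outside the subset (or where nothing was learned), defer to the fallback.
        return fallback(state)

    return policy


# Illustrative table: in state (1, 2) the learned action serves queue 0.
policy = make_hybrid_policy({(1, 2): 0}, threshold=2)
```

With this table, `policy((1, 2))` returns the learned action, while a state such as `(5, 9)` lies outside the subset and is handled by the fallback. How the subset is constructed and how the model-based learner fills the table are the substance of the paper and are not reproduced here.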

Preprint · 2016 · arXiv

Global optimization framework for real-time route guidance via variable message sign

A variable message sign (VMS) is an effective traffic management tool for congestion mitigation. The VMS is primarily used as a means of providing factual travel information or genuine route guidance to travelers. However, this may be rendered sub-optimal on a network level by potential network paradoxes and a lack of consideration for its cascading effect on the rest of the network. This paper focuses on the design of an optimal VMS display strategy in response to real-time traffic information and its coordination with other intelligent transportation systems such as signal control, in order to explore the full potential of real-time route guidance in combating congestion. We invoke the linear decision rule framework to design the optimal on-line VMS strategy, and test its effectiveness in conjunction with on-line signal control. A simulation case study is conducted on a real-world test network in China, which shows the advantage of the proposed adaptive VMS display strategy over genuine route guidance, as well as its synergies with on-line signal control for congestion mitigation.
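In general terms, a linear decision rule makes the control an affine function of the observed data, so that the on-line step reduces to evaluating u = b + K·x once the coefficients have been fixed off-line. The sketch below shows only that evaluation step. The names `offset`, `gain` and `observation`, and the interpretation of the numbers, are assumptions for illustration and do not come from the paper.

```python
from typing import List, Sequence


def linear_decision_rule(
    offset: Sequence[float],
    gain: Sequence[Sequence[float]],
    observation: Sequence[float],
) -> List[float]:
    """Evaluate u = offset + gain @ observation for a generic affine decision rule."""
    if len(gain) != len(offset):
        raise ValueError("gain needs one row per control component")
    if any(len(row) != len(observation) for row in gain):
        raise ValueError("each gain row needs one entry per observation component")
    return [
        b + sum(k * x for k, x in zip(row, observation))
        for b, row in zip(offset, gain)
    ]


# Hypothetical numbers: one control value driven by two observed traffic measurements.
control = linear_decision_rule(offset=[0.5], gain=[[0.25, -0.25]], observation=[2.0, 1.0])
```

Choosing the offset and gain so that the resulting controls perform well across traffic scenarios is the optimization problem the paper addresses. A practical rule would also have to keep the outputs within whatever range the sign can display, which this sketch does not enforce.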