Source author record

Shi Dong

Shi Dong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; Distributed, Parallel, and Cluster Computing; Information Theory; math.IT; math.OC; Networking and Internet Architecture

Catalog footprint

What is connected

4 works

6 topics

4 close collaborators

Preprint · 2020 · arXiv

An Information-Theoretic Analysis for Thompson Sampling with Many Actions

Information-theoretic Bayesian regret bounds of Russo and Van Roy capture the dependence of regret on prior uncertainty. However, this dependence is through entropy, which can become arbitrarily large as the number of actions increases. We establish new bounds that depend instead on a notion of rate-distortion. Among other things, this allows us to recover through information-theoretic arguments a near-optimal bound for the linear bandit. We also offer a bound for the logistic bandit that dramatically improves on the best previously available, though this bound depends on an information-theoretic statistic that we have only been able to quantify via computation.

Preprint · 2020 · arXiv

Provably Efficient Reinforcement Learning with Aggregated States

We establish that an optimistic variant of Q-learning applied to a fixed-horizon episodic Markov decision process with an aggregated state representation incurs regret $\tilde{\mathcal{O}}(\sqrt{H^5 M K} + εHK)$, where $H$ is the horizon, $M$ is the number of aggregate states, $K$ is the number of episodes, and $ε$ is the largest difference between any pair of optimal state-action values associated with a common aggregate state. Notably, this regret bound does not depend on the number of states or actions and indicates that asymptotic per-period regret is no greater than $ε$, independent of horizon. To our knowledge, this is the first such result that applies to reinforcement learning with nontrivial value function approximation without any restrictions on transition probabilities.
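The per-period claim follows from dividing the regret bound by the total number of periods, $HK$. The step below is a sketch that assumes the factors suppressed by $\tilde{\mathcal{O}}$ grow at most polylogarithmically in $K$, and it holds up to the constants hidden in the bound:

$$
\frac{\sqrt{H^5 M K} + εHK}{HK} = \sqrt{\frac{H^3 M}{K}} + ε \;\longrightarrow\; ε \quad \text{as } K \to \infty .
$$

The first term vanishes as the number of episodes grows, so only the aggregation error $ε$ remains, with no dependence on $H$.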

Preprint · 2020 · arXiv

Summarizing CPU and GPU Design Trends with Product Data

Moore's Law and Dennard Scaling have guided the semiconductor industry for the past few decades. Recently, both laws have faced validity challenges as transistor sizes approach the practical limits of physics. We test the validity of these laws and reflect on the factors responsible. In this work, we collect data on more than 4000 publicly available CPU and GPU products. We find that transistor scaling remains critical in keeping the laws valid. However, architectural solutions have become increasingly important and will play a larger role in the future. We observe that GPUs consistently deliver higher performance than CPUs. GPU performance continues to rise because of increases in GPU frequency, improvements in the thermal design power (TDP), and growth in die size. But we also see the ratio of GPU to CPU performance moving closer to parity, thanks to new SIMD extensions on CPUs and increased CPU core counts.

Preprint · 2013 · arXiv

Network traffic prediction based on ARFIMA model

ARFIMA is a time series forecasting model that improves on the ARMA model. The ARFIMA model proposed in this article is demonstrated and derived in detail. Applied to network traffic from the CERNET backbone, the results show that, compared with the ARMA model, prediction efficiency and accuracy increase significantly, and the predictions are not susceptible to sampling.
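For context, ARFIMA differs from ARMA by applying a fractional differencing operator $(1 - B)^d$ with non-integer $d$ before the ARMA stage. The Python sketch below shows only that generic step, using the standard binomial expansion of the operator. It is not the procedure from the paper: the function names, the truncation length `n_weights`, and the sample values are illustrative assumptions.

```python
def frac_diff_weights(d: float, n_weights: int) -> list[float]:
    """Return the first n_weights coefficients of (1 - B)^d.

    Uses the recursion w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k,
    which follows from the binomial series of (1 - B)^d.
    """
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series: list[float], d: float, n_weights: int) -> list[float]:
    """Apply a truncated fractional difference to a series.

    Only positions with a full window of n_weights past values are
    returned, so the output is shorter than the input.
    """
    w = frac_diff_weights(d, n_weights)
    return [
        sum(w[k] * series[t - k] for k in range(n_weights))
        for t in range(n_weights - 1, len(series))
    ]


if __name__ == "__main__":
    # Hypothetical traffic counts, for illustration only.
    traffic = [120.0, 135.0, 128.0, 150.0, 149.0, 160.0]
    print(frac_diff_weights(0.4, 4))
    print(frac_diff(traffic, 0.4, 4))
```

With `d = 1` the weights reduce to `[1, -1, 0, ...]`, which is ordinary first differencing as used in ARIMA; values of `d` strictly between 0 and 0.5 give the slowly decaying weights associated with long-memory series.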
