Source author record

Kuang Xu

Kuang Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: math.PR; Machine Learning; math.OC; Networking and Internet Architecture; Performance; Computer Science and Game Theory; Cryptography and Security; Distributed, Parallel, and Cluster Computing; econ.EM; Information Theory; math.IT; Methodology; stat.OT

Catalog footprint

What is connected

12 works

13 topics

4 close collaborators

preprint, 2022, arXiv

Temporal Concatenation for Markov Decision Processes

We propose and analyze a temporal concatenation heuristic for solving large-scale finite-horizon Markov decision processes (MDP), which divides the MDP into smaller sub-problems along the time horizon and generates an overall solution by simply concatenating the optimal solutions from these sub-problems. As a "black box" architecture, temporal concatenation works with a wide range of existing MDP algorithms. Our main results characterize the regret of temporal concatenation compared to the optimal solution. We provide upper bounds for general MDP instances, as well as a family of MDP instances in which the upper bounds are shown to be tight. Together, our results demonstrate temporal concatenation's potential of substantial speed-up at the expense of some performance degradation.
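The splitting idea in this abstract can be illustrated with a minimal sketch. It assumes a tabular MDP with stationary transitions, a zero terminal value for every sub-problem, and plain backward induction as the "black box" solver. The function names and the three-state toy instance are hypothetical and are not taken from the paper.

```python
def backward_induction(P, R, horizon):
    """Solve a finite-horizon MDP with zero terminal value.

    P[a][s][s2] is a transition probability, R[s][a] an immediate reward.
    Returns (policy, value), where policy[t][s] is the action at step t.
    """
    n_states, n_actions = len(R), len(R[0])
    value = [0.0] * n_states
    policy = []
    for _ in range(horizon):  # walk backwards from the last step
        step, new_value = [], []
        for s in range(n_states):
            q = [R[s][a] + sum(P[a][s][s2] * value[s2] for s2 in range(n_states))
                 for a in range(n_actions)]
            best = max(range(n_actions), key=lambda a: q[a])
            step.append(best)
            new_value.append(q[best])
        value = new_value
        policy.append(step)
    policy.reverse()  # policy[0] is now the first time step
    return policy, value


def evaluate(P, R, policy):
    """Expected total reward of a time-dependent policy, per start state."""
    n_states = len(R)
    value = [0.0] * n_states
    for step in reversed(policy):
        value = [R[s][step[s]]
                 + sum(P[step[s]][s][s2] * value[s2] for s2 in range(n_states))
                 for s in range(n_states)]
    return value


def temporal_concatenation(P, R, horizon, split):
    """Solve two shorter sub-problems independently and join their policies."""
    first, _ = backward_induction(P, R, split)
    second, _ = backward_induction(P, R, horizon - split)
    return first + second


# Hypothetical toy instance: staying in state 0 pays 1 per step, while
# moving pays 0 for two steps and then 10 on leaving state 2.
P = [
    [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]],  # action 0: stay
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]],  # action 1: move
]
R = [[1.0, 0.0], [0.0, 0.0], [10.0, 10.0]]

optimal_policy, optimal_value = backward_induction(P, R, 4)
concat_policy = temporal_concatenation(P, R, 4, 2)
concat_value = evaluate(P, R, concat_policy)
regret = [o - c for o, c in zip(optimal_value, concat_value)]
print(regret)
```

In this toy instance each two-step sub-problem is too short to see the delayed reward, so the concatenated policy stays in state 0 and incurs positive regret there. That is the kind of performance degradation the abstract trades against speed-up.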

preprint, 2021, arXiv

A Bit Better? Quantifying Information for Bandit Learning

The information ratio offers an approach to assessing the efficacy with which an agent balances between exploration and exploitation. Originally, this was defined to be the ratio between squared expected regret and the mutual information between the environment and action-observation pair, which represents a measure of information gain. Recent work has inspired consideration of alternative information measures, particularly for use in analysis of bandit learning algorithms to arrive at tighter regret bounds. We investigate whether quantification of information via such alternatives can improve the realized performance of information-directed sampling, which aims to minimize the information ratio.

preprint, 2021, arXiv

Hierarchical Causal Bandit

Causal bandit is a nascent learning model where an agent sequentially experiments in a causal network of variables, in order to identify the reward-maximizing intervention. Despite the model's wide applicability, existing analytical results are largely restricted to a parallel bandit version where all variables are mutually independent. We introduce in this work the hierarchical causal bandit model as a viable path towards understanding general causal bandits with dependent variables. The core idea is to incorporate a contextual variable that captures the interaction among all variables with direct effects. Using this hierarchical framework, we derive sharp insights on algorithmic design in causal bandits with dependent arms and obtain nearly matching regret bounds in the case of a binary context.

preprint, 2020, arXiv

Anonymous Stochastic Routing

We propose and analyze a recipient-anonymous stochastic routing model to study a fundamental trade-off between anonymity and routing delay. An agent wants to quickly reach a goal vertex in a network through a sequence of routing actions, while an overseeing adversary observes the agent's entire trajectory and tries to identify her goal among those vertices traversed. We are interested in understanding the probability that the adversary can correctly identify the agent's goal (anonymity), as a function of the time it takes the agent to reach it (delay). A key feature of our model is the presence of intrinsic uncertainty in the environment, so that each of the agent's intended steps is subject to random perturbation and thus may not materialize as planned. Using large-network asymptotics, our main results provide near-optimal characterization of the anonymity-delay trade-off under a number of network topologies. Our main technical contributions are centered around a new class of "noise-harnessing" routing strategies that adaptively combine intrinsic uncertainty from the environment with additional artificial randomization to achieve provably efficient obfuscation.

preprint, 2020, arXiv

Experimenting in Equilibrium

Classical approaches to experimental design assume that intervening on one unit does not affect other units. There are many important settings, however, where this non-interference assumption does not hold, as when running experiments on supply-side incentives on a ride-sharing platform or subsidies in an energy marketplace. In this paper, we introduce a new approach to experimental design in large-scale stochastic systems with considerable cross-unit interference, under an assumption that the interference is structured enough that it can be captured via mean-field modeling. Our approach enables us to accurately estimate the effect of small changes to system parameters by combining unobtrusive randomization with lightweight modeling, all while remaining in equilibrium. We can then use these estimates to optimize the system by gradient descent. Concretely, we focus on the problem of a platform that seeks to optimize supply-side payments p in a centralized marketplace where different suppliers interact via their effects on the overall supply-demand equilibrium, and show that our approach enables the platform to optimize p in large systems using vanishingly small perturbations.

preprint, 2020, arXiv

Optimal query complexity for private sequential learning against eavesdropping

We study the query complexity of a learner-private sequential learning problem, motivated by the privacy and security concerns due to eavesdropping that arise in practical applications such as pricing and Federated Learning. A learner tries to estimate an unknown scalar value, by sequentially querying an external database and receiving binary responses; meanwhile, a third-party adversary observes the learner's queries but not the responses. The learner's goal is to design a querying strategy with the minimum number of queries (optimal query complexity) so that she can accurately estimate the true value, while the eavesdropping adversary even with the complete knowledge of her querying strategy cannot. We develop new querying strategies and analytical techniques and use them to prove tight upper and lower bounds on the optimal query complexity. The bounds almost match across the entire parameter range, substantially improving upon existing results. We thus obtain a complete picture of the optimal query complexity as a function of the estimation accuracy and the desired levels of privacy. We also extend the results to sequential learning models in higher dimensions, and where the binary responses are noisy. Our analysis leverages a crucial insight into the nature of the private learning problem, which suggests that the query trajectory of an optimal learner can be divided into distinct phases that focus on pure learning versus learning and obfuscation, respectively.

preprint, 2020, arXiv

Private Sequential Learning

We formulate a private learning model to study an intrinsic tradeoff between privacy and query complexity in sequential learning. Our model involves a learner who aims to determine a scalar value, $v^*$, by sequentially querying an external database and receiving binary responses. In the meantime, an adversary observes the learner's queries, though not the responses, and tries to infer from them the value of $v^*$. The objective of the learner is to obtain an accurate estimate of $v^*$ using only a small number of queries, while simultaneously protecting her privacy by making $v^*$ provably difficult to learn for the adversary. Our main results provide tight upper and lower bounds on the learner's query complexity as a function of desired levels of privacy and estimation accuracy. We also construct explicit query strategies whose complexity is optimal up to an additive constant.

preprint, 2016, arXiv

On the capacity of information processing systems

We propose and analyze a family of information processing systems, where a finite set of experts or servers are employed to extract information about a stream of incoming jobs. Each job is associated with a hidden label drawn from some prior distribution. An inspection by an expert produces a noisy outcome that depends both on the job's hidden label and the type of the expert, and occupies the expert for a finite time duration. A decision maker's task is to dynamically assign inspections so that the resulting outcomes can be used to accurately recover the labels of all jobs, while keeping the system stable. Among our chief motivations are applications in crowd-sourcing, diagnostics, and experiment designs, where one wishes to efficiently learn the nature of a large number of items, using a finite pool of computational resources or human agents. We focus on the capacity of such an information processing system. Given a level of accuracy guarantee, we ask how many experts are needed in order to stabilize the system, and through what inspection architecture. Our main result provides an adaptive inspection policy that is asymptotically optimal in the following sense: the ratio between the required number of experts under our policy and the theoretical optimal converges to one, as the probability of error in label recovery tends to zero.

preprint, 2015, arXiv

Necessity of Future Information in Admission Control

We study the necessity of predictive information in a class of queueing admission control problems, where a system manager is allowed to divert incoming jobs up to a fixed rate, in order to minimize the queueing delay experienced by the admitted jobs. Spencer et al. (2014) show that the system's delay performance can be significantly improved by having access to future information in the form of a lookahead window, during which the times of future arrivals and services are revealed. They prove that, while delay under an optimal online policy diverges to infinity in the heavy-traffic regime, it can stay bounded by making use of future information. However, the diversion policies of Spencer et al. (2014) require the length of the lookahead window to grow to infinity at a non-trivial rate in the heavy-traffic regime, and it remained open whether substantial performance improvement could still be achieved with less future information. We resolve this question to a large extent by establishing an asymptotically tight lower bound on how much future information is necessary to achieve superior performance, which matches the upper bound of Spencer et al. (2014) up to a constant multiplicative factor. Our result hence demonstrates that the system's heavy-traffic delay performance is highly sensitive to the amount of future information available. Our proof is based on analyzing certain excursion probabilities of the input sample paths, and exploiting a connection between a policy's diversion decisions and subsequent server idling, which may be of independent interest for related dynamic resource allocation problems.

preprint, 2014, arXiv

Queuing with future information

We study an admissions control problem, where a queue with service rate $1-p$ receives incoming jobs at rate $λ\in(1-p,1)$, and the decision maker is allowed to redirect away jobs up to a rate of $p$, with the objective of minimizing the time-average queue length. We show that the amount of information about the future has a significant impact on system performance, in the heavy-traffic regime. When the future is unknown, the optimal average queue length diverges at rate $\sim\log_{1/(1-p)}\frac{1}{1-λ}$, as $λ\to 1$. In sharp contrast, when all future arrival and service times are revealed beforehand, the optimal average queue length converges to a finite constant, $(1-p)/p$, as $λ\to1$. We further show that the finite limit of $(1-p)/p$ can be achieved using only a finite lookahead window starting from the current time frame, whose length scales as $\mathcal{O}(\log\frac{1}{1-λ})$, as $λ\to1$. This leads to the conjecture of an interesting duality between queuing delay and the amount of information about the future.
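To give a feel for the two regimes in this abstract, the short sketch below evaluates the stated leading-order expressions numerically. The function names and the chosen values of $p$ and $λ$ are hypothetical. The online expression is only an asymptotic growth rate as $λ\to1$, so the printed numbers indicate scale and are not exact queue lengths.

```python
import math


def online_queue_scaling(p, lam):
    """Leading-order term log_{1/(1-p)}(1/(1-lam)) for the no-lookahead case."""
    return math.log(1.0 / (1.0 - lam)) / math.log(1.0 / (1.0 - p))


def offline_queue_limit(p):
    """Limiting average queue length (1-p)/p when the whole future is known."""
    return (1.0 - p) / p


p = 0.5  # illustrative diversion rate; the model requires lam in (1-p, 1)
for lam in (0.9, 0.99, 0.999):
    print(lam, round(online_queue_scaling(p, lam), 3), offline_queue_limit(p))
```

With $p$ held fixed, the first column of results keeps growing as $λ$ approaches 1, while the second stays at $(1-p)/p$. This is the contrast between the online and full-information settings that the abstract describes.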

preprint, 2013, arXiv

Go Viral, or Not: Rate-Optimal Control for Resource-Constrained Branching Processes

We propose and analyze a new class of controlled multi-type branching processes with a per-step linear resource constraint, motivated by potential applications in viral marketing and cancer treatment. We show that the optimal exponential growth rate of the population can be achieved by maintaining a fixed proportion among the species, for both deterministic and stochastic branching processes. In the special case of a two-type population and with a symmetric reward structure, the optimal proportion is obtained in closed-form. In addition to revealing structural properties of controlled branching processes, our results are intended to provide the practitioners with an easy-to-interpret benchmark for best practices, if not exact policies. As a proof of concept, the methodology is applied to the linkage structure of the 2004 US Presidential Election blogosphere, where the optimal growth rate demonstrates sizable gains over a uniform selection strategy, and to a two-compartment cell-cycle kinetics model for cancer growth, with realistic parameters, where the robust estimate for minimal treatment intensity under a worst-case growth rate is noticeably more conservative compared to that obtained using more optimistic assumptions.

preprint, 2012, arXiv

On the Power of Centralization in Distributed Processing

In this thesis, we propose and analyze a multi-server model that captures a performance trade-off between centralized and distributed processing. In our model, a fraction $p$ of an available resource is deployed in a centralized manner (e.g., to serve a most-loaded station) while the remaining fraction $1-p$ is allocated to local servers that can only serve requests addressed specifically to their respective stations. Using a fluid model approach, we demonstrate a surprising phase transition in the steady-state delay, as $p$ changes: in the limit of a large number of stations, and when any amount of centralization is available ($p>0$), the average queue length in steady state scales as $\log_{1/(1-p)} 1/(1-λ)$ when the traffic intensity $λ$ goes to 1. This is exponentially smaller than the usual M/M/1-queue delay scaling of $1/(1-λ)$, obtained when all resources are fully allocated to local stations ($p=0$). This indicates a strong qualitative impact of even a small degree of centralization. We prove convergence to a fluid limit, and characterize both the transient and steady-state behavior of the finite system, in the limit as the number of stations $N$ goes to infinity. We show that the sequence of queue-length processes converges to a unique fluid trajectory (over any finite time interval, as $N$ approaches infinity), and that this fluid trajectory converges to a unique invariant state $v^I$, for which a simple closed-form expression is obtained. We also show that the steady-state distribution of the $N$-server system concentrates on $v^I$ as $N$ goes to infinity.
