Source author record

R. Srikant

R. Srikant appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Machine Learning; Networking and Internet Architecture; math.PR; Performance; Information Theory; math.IT; math.OC; Social and Information Networks; Multimedia; Systems and Control; Computer Science and Game Theory; Cryptography and Security; Data Structures and Algorithms; Discrete Mathematics; Distributed, Parallel, and Cluster Computing; eess.SY

Catalog footprint

What is connected

38 works

16 topics

4 close collaborators

preprint 2022 arXiv

A Policy Gradient Algorithm for the Risk-Sensitive Exponential Cost MDP

We study the risk-sensitive exponential cost MDP formulation and develop a trajectory-based gradient algorithm to find the stationary point of the cost associated with a set of parameterized policies. We derive a formula that can be used to compute the policy gradient from (state, action, cost) information collected from sample paths of the MDP for each fixed parameterized policy. Unlike the traditional average-cost problem, standard stochastic approximation theory cannot be used to exploit this formula. To address the issue, we introduce a truncated and smooth version of the risk-sensitive cost and show that this new cost criterion can be used to approximate the risk-sensitive cost and its gradient uniformly under some mild assumptions. We then develop a trajectory-based gradient algorithm to minimize the smooth truncated estimation of the risk-sensitive cost and derive conditions under which a sequence of truncations can be used to solve the original, untruncated cost problem.

preprint 2022 arXiv

Finite-Time Analysis of Entropy-Regularized Neural Natural Actor-Critic Algorithm

Natural actor-critic (NAC) and its variants, equipped with the representation power of neural networks, have demonstrated impressive empirical success in solving Markov decision problems with large state spaces. In this paper, we present a finite-time analysis of NAC with neural network approximation, and identify the roles of neural networks, regularization and optimization techniques (e.g., gradient clipping and averaging) to achieve provably good performance in terms of sample complexity, iteration complexity and overparametrization bounds for the actor and the critic. In particular, we prove that (i) entropy regularization and averaging ensure stability by providing sufficient exploration to avoid near-deterministic and strictly suboptimal policies and (ii) regularization leads to sharp sample complexity and network width bounds in the regularized MDPs, yielding a favorable bias-variance tradeoff in policy optimization. In the process, we identify the importance of uniform approximation power of the actor neural network to achieve global optimality in policy optimization due to distributional shift.

preprint 2022 arXiv

The Mean-Squared Error of Double Q-Learning

In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning. Our result builds upon an analysis for linear stochastic approximation based on Lyapunov equations and applies both to the tabular setting and to linear function approximation, provided that the optimal policy is unique and the algorithms converge. We show that the asymptotic mean-squared error of Double Q-learning is exactly equal to that of Q-learning if Double Q-learning uses twice the learning rate of Q-learning and outputs the average of its two estimators. We also present some practical implications of this theoretical observation using simulations.
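The two update rules being compared can be sketched in tabular form as follows. This is a minimal illustration of the textbook Q-learning and Double Q-learning updates, not the authors' code; the function names, the nested-list table layout, and the `rng` argument are assumptions made for the example.

```python
import random

def q_learning_step(Q, s, a, r, s_next, alpha, gamma):
    """One tabular Q-learning update on the table Q[state][action]."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

def double_q_learning_step(QA, QB, s, a, r, s_next, alpha, gamma, rng=random):
    """One tabular Double Q-learning update; a fair coin picks the table to update."""
    if rng.random() < 0.5:
        updated, other = QA, QB
    else:
        updated, other = QB, QA
    # Pick the greedy next action with the table being updated,
    # then evaluate that action with the other table.
    a_star = max(range(len(updated[s_next])), key=lambda b: updated[s_next][b])
    target = r + gamma * other[s_next][a_star]
    updated[s][a] += alpha * (target - updated[s][a])

def averaged_estimate(QA, QB):
    """Entrywise average of the two estimators."""
    return [[(x + y) / 2 for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(QA, QB)]
```

Under the setting described in the abstract, one would call `double_q_learning_step` with `alpha` set to twice the value passed to `q_learning_step` and report `averaged_estimate(QA, QB)`.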

preprint 2021 arXiv

Heavy-Traffic Insensitive Bounds for Weighted Proportionally Fair Bandwidth Sharing Policies

We consider a connection-level model proposed by Massoulié and Roberts for bandwidth sharing among file transfer flows in a communication network. We study weighted proportionally fair sharing policies and establish explicit-form bounds on the weighted sum of the expected numbers of flows on different routes in heavy traffic. The bounds are linear in the number of critically loaded links in the network, and they hold for a class of phase-type file-size distributions; i.e., the bounds are heavy-traffic insensitive to the distributions in this class. Our approach is Lyapunov-drift based, which is different from the widely used diffusion approximation approach. A key technique we develop is to construct a novel inner product in the state space, which then allows us to obtain a multiplicative type of state-space collapse in steady state. Furthermore, this state-space collapse result implies the interchange of limits as a by-product for the diffusion approximation of the equal-weight case under phase-type file-size distributions, demonstrating the heavy-traffic insensitivity of the stationary distribution.

preprint 2021 arXiv

Improved Algorithms for Misspecified Linear Markov Decision Processes

For the misspecified linear Markov decision process (MLMDP) model of Jin et al. [2020], we propose an algorithm with three desirable properties. (P1) Its regret after $K$ episodes scales as $K \max \{ \varepsilon_{\text{mis}}, \varepsilon_{\text{tol}} \}$, where $\varepsilon_{\text{mis}}$ is the degree of misspecification and $\varepsilon_{\text{tol}}$ is a user-specified error tolerance. (P2) Its space and per-episode time complexities remain bounded as $K \rightarrow \infty$. (P3) It does not require $\varepsilon_{\text{mis}}$ as input. To our knowledge, this is the first algorithm satisfying all three properties. For concrete choices of $\varepsilon_{\text{tol}}$, we also improve existing regret bounds (up to log factors) while achieving either (P2) or (P3) (existing algorithms satisfy neither). At a high level, our algorithm generalizes (to MLMDPs) and refines the Sup-Lin-UCB algorithm, which Takemura et al. [2021] recently showed satisfies (P3) for contextual bandits. We also provide an intuitive interpretation of their result, which informs the design of our algorithm.

preprint 2021 arXiv

On Concentration Inequalities for Vector-Valued Lipschitz Functions

We derive two upper bounds for the probability of deviation of a vector-valued Lipschitz function of a collection of random variables from its expected value. The resulting upper bounds can be tighter than bounds obtained by a direct application of a classical theorem due to Bobkov and Götze.

preprint 2021 arXiv

Optimistic Policy Iteration for MDPs with Acyclic Transient State Structure

We consider Markov Decision Processes (MDPs) in which every stationary policy induces the same graph structure for the underlying Markov chain and further, the graph has the following property: if we replace each recurrent class by a node, then the resulting graph is acyclic. For such MDPs, we prove the convergence of the stochastic dynamics associated with a version of optimistic policy iteration (OPI), suggested in Tsitsiklis (2002), in which the values associated with all the nodes visited during each iteration of the OPI are updated.

preprint 2021 arXiv

Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

We propose an algorithm that uses linear function approximation (LFA) for stochastic shortest path (SSP). Under minimal assumptions, it obtains sublinear regret, is computationally efficient, and uses stationary policies. To our knowledge, this is the first such algorithm in the LFA literature (for SSP or other formulations). Our algorithm is a special case of a more general one, which achieves regret square root in the number of episodes given access to a certain computation oracle.

preprint 2021 arXiv

Robust Multi-Agent Multi-Armed Bandits

Recent works have shown that agents facing independent instances of a stochastic $K$-armed bandit can collaborate to decrease regret. However, these works assume that each agent always recommends their individual best-arm estimates to other agents, which is unrealistic in envisioned applications (machine faults in distributed computing or spam in social recommendation systems). Hence, we generalize the setting to include $n$ honest and $m$ malicious agents who recommend best-arm estimates and arbitrary arms, respectively. We first show that even with a single malicious agent, existing collaboration-based algorithms fail to improve regret guarantees over a single-agent baseline. We propose a scheme where honest agents learn who is malicious and dynamically reduce communication with (i.e., "block") them. We show that collaboration indeed decreases regret for this algorithm, assuming $m$ is small compared to $K$ but without assumptions on malicious agents' behavior, thus ensuring that our algorithm is robust against any malicious recommendation strategy.

preprint 2020 arXiv

Budget-Constrained Bandits over General Cost and Reward Distributions

We consider a budget-constrained bandit problem where each arm pull incurs a random cost, and yields a random reward in return. The objective is to maximize the total expected reward under a budget constraint on the total cost. The model is general in the sense that it allows correlated and potentially heavy-tailed cost-reward pairs that can take on negative values as required by many applications. We show that if moments of order $(2+γ)$ for some $γ> 0$ exist for all cost-reward pairs, $O(\log B)$ regret is achievable for a budget $B>0$. In order to achieve tight regret bounds, we propose algorithms that exploit the correlation between the cost and reward of each arm by extracting the common information via linear minimum mean-square error estimation. We prove a regret lower bound for this problem, and show that the proposed algorithms achieve tight problem-dependent regret bounds, which are optimal up to a universal constant factor in the case of jointly Gaussian cost and reward pairs.

preprint 2020 arXiv

Continuous-Time Multi-Armed Bandits with Controlled Restarts

Time-constrained decision processes have been ubiquitous in many fundamental applications in physics, biology and computer science. Recently, restart strategies have gained significant attention for boosting the efficiency of time-constrained processes by expediting the completion times. In this work, we investigate the bandit problem with controlled restarts for time-constrained decision processes, and develop provably good learning algorithms. In particular, we consider a bandit setting where each decision takes a random completion time, and yields a random and correlated reward at the end, with unknown values at the time of decision. The goal of the decision-maker is to maximize the expected total reward subject to a time constraint $τ$. As an additional control, we allow the decision-maker to interrupt an ongoing task and forgo its reward for a potentially more rewarding alternative. For this problem, we develop efficient online learning algorithms with $O(\log(τ))$ and $O(\sqrt{τ\log(τ)})$ regret in a finite and continuous action space of restart strategies, respectively. We demonstrate the applicability of our algorithm by using it to boost the performance of SAT solvers.

preprint 2020 arXiv

Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks

We consider the problem of detecting out-of-distribution images in neural networks. We propose ODIN, a simple and effective method that does not require any change to a pre-trained neural network. Our method is based on the observation that using temperature scaling and adding small perturbations to the input can separate the softmax score distributions between in- and out-of-distribution images, allowing for more effective detection. We show in a series of experiments that ODIN is compatible with diverse network architectures and datasets. It consistently outperforms the baseline approach by a large margin, establishing a new state-of-the-art performance on this task. For example, ODIN reduces the false positive rate from the baseline 34.7% to 4.3% on the DenseNet (applied to CIFAR-10) when the true positive rate is 95%.
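The two ingredients named in the abstract, temperature scaling and a small input perturbation, can be sketched on a toy model. The example below assumes a linear softmax classifier with weight matrix `W` so that the input gradient has a closed form; the function names and the default values of `temperature` and `epsilon` are illustrative choices for this sketch, not settings taken from this page.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax of a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def odin_score(W, x, temperature=1000.0, epsilon=0.001):
    """Max softmax score after temperature scaling and an input perturbation.

    Assumes a linear classifier with logits W @ x (hypothetical toy model).
    """
    p = softmax(W @ x / temperature)
    y_hat = int(np.argmax(p))
    # Gradient of log p[y_hat] with respect to x for linear logits.
    grad = (W[y_hat] - p @ W) / temperature
    # Nudge the input in the direction that raises the predicted-class score.
    x_perturbed = x + epsilon * np.sign(grad)
    return float(np.max(softmax(W @ x_perturbed / temperature)))
```

An input would then be flagged as out-of-distribution when `odin_score` falls below a threshold chosen on held-out data; with a real network the gradient would come from automatic differentiation rather than the closed form used here.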

preprint 2020 arXiv

Optimal Load Balancing in Bipartite Graphs

Applications in cloud platforms motivate the study of efficient load balancing under job-server constraints and server heterogeneity. In this paper, we study load balancing on a bipartite graph where left nodes correspond to job types and right nodes correspond to servers, with each edge indicating that a job type can be served by a server. Thus edges represent locality constraints, i.e., each job can be served only at servers that contain certain data and/or machine learning (ML) models. Servers in this system can have heterogeneous service rates. In this setting, we investigate the performance of two policies named Join-the-Fastest-of-the-Shortest-Queue (JFSQ) and Join-the-Fastest-of-the-Idle-Queue (JFIQ), which are simple variants of Join-the-Shortest-Queue and Join-the-Idle-Queue, where ties are broken in favor of the fastest servers. Under a "well-connected" graph condition, we show that JFSQ and JFIQ are asymptotically optimal in the mean response time when the number of servers goes to infinity. In addition to asymptotic optimality, we also obtain upper bounds on the mean response time for finite-size systems. We further show that the well-connectedness condition can be satisfied by a random bipartite graph construction with relatively sparse connectivity.
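The JFSQ dispatch rule described above can be sketched in a few lines. The data layout (a dictionary mapping each job type to its compatible servers, plus per-server queue lengths and service rates) and the function name are assumptions made for this illustration.

```python
def jfsq_dispatch(job_type, compatible, queue_len, service_rate):
    """Join-the-Fastest-of-the-Shortest-Queue for one arriving job.

    compatible[job_type] lists the servers joined to this job type by an
    edge of the bipartite graph (hypothetical layout).
    """
    servers = compatible[job_type]
    shortest = min(queue_len[s] for s in servers)
    # Among the compatible servers with the shortest queue, pick the fastest.
    tied = [s for s in servers if queue_len[s] == shortest]
    return max(tied, key=lambda s: service_rate[s])
```

For example, with `compatible = {"a": [0, 1, 2]}`, queue lengths `[2, 1, 1]` and service rates `[3.0, 1.0, 2.0]`, a type-`"a"` job goes to server 2: servers 1 and 2 tie for the shortest queue and server 2 is faster.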

preprint 2019 arXiv

Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity

Traditional landscape analysis of deep neural networks aims to show that no sub-optimal local minima exist in some appropriate sense. From this, one may be tempted to conclude that descent algorithms which escape saddle points will reach a good local minimum. However, basic optimization theory tells us that it is also possible for a descent algorithm to diverge to infinity if there are paths leading to infinity along which the loss function decreases. It is not clear whether, for non-linear neural networks, there exists a setting in which the absence of bad local minima and the absence of decreasing paths to infinity can be achieved simultaneously. In this paper, we give the first positive answer to this question. More specifically, for a large class of over-parameterized deep neural networks with appropriate regularizers, the loss function has no bad local minima and no decreasing paths to infinity. The key mathematical trick is to show that the set of regularizers which may be undesirable can be viewed as the image of a Lipschitz continuous mapping from a lower-dimensional Euclidean space to a higher-dimensional Euclidean space, and thus has zero measure.

preprint 2016 arXiv

Distributed Learning Algorithms for Spectrum Sharing in Spatial Random Access Wireless Networks

We consider distributed optimization over orthogonal collision channels in spatial random access networks. Users are spatially distributed and each user is in the interference range of a few other users. Each user is allowed to transmit over a subset of the shared channels with a certain attempt probability. We study both the non-cooperative and cooperative settings. In the former, the goal of each user is to maximize its own rate irrespective of the utilities of other users. In the latter, the goal is to achieve proportionally fair rates among users. Simple distributed learning algorithms are developed to solve these problems. The efficiencies of the proposed algorithms are demonstrated via both theoretical analysis and simulation results.

preprint 2016 arXiv

Mixing Times and Structural Inference for Bernoulli Autoregressive Processes

We introduce a novel multivariate random process producing Bernoulli outputs per dimension, that can possibly formalize binary interactions in various graphical structures and can be used to model opinion dynamics, epidemics, financial and biological time series data, etc. We call this a Bernoulli Autoregressive Process (BAR). A BAR process models a discrete-time vector random sequence of $p$ scalar Bernoulli processes with autoregressive dynamics and corresponds to a particular Markov Chain. The benefit from the autoregressive dynamics is the description of a $2^p\times 2^p$ transition matrix by at most $pd$ effective parameters for some $d\ll p$ or by two sparse matrices of dimensions $p\times p^2$ and $p\times p$, respectively, parameterizing the transitions. Additionally, we show that the BAR process mixes rapidly, by proving that the mixing time is $O(\log p)$. The hidden constant in the previous mixing time bound depends explicitly on the values of the chain parameters and implicitly on the maximum allowed in-degree of a node in the corresponding graph. For a network with $p$ nodes, where each node has in-degree at most $d$ and corresponds to a scalar Bernoulli process generated by a BAR, we provide a greedy algorithm that can efficiently learn the structure of the underlying directed graph with a sample complexity proportional to the mixing time of the BAR process. The sample complexity of the proposed algorithm is nearly order-optimal as it is only a $\log p$ factor away from an information-theoretic lower bound. We present simulation results illustrating the performance of our algorithm in various setups, including a model for a biological signaling network.

preprint 2016 arXiv

On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression

The problem of least squares regression of a $d$-dimensional unknown parameter is considered. A stochastic gradient descent based algorithm with weighted iterate-averaging that uses a single pass over the data is studied and its convergence rate is analyzed. We first consider a bounded constraint set of the unknown parameter. Under some standard regularity assumptions, we provide an explicit $O(1/k)$ upper bound on the convergence rate, depending on the variance (due to the additive noise in the measurements) and the size of the constraint set. We show that the variance term dominates the error and decreases with rate $1/k$, while the term which is related to the size of the constraint set decreases with rate $\log k/k^2$. We then compare the asymptotic ratio $ρ$ between the convergence rate of the proposed scheme and the empirical risk minimizer (ERM) as the number of iterations approaches infinity. We show that $ρ\leq 4$ under some mild conditions for all $d\geq 1$. We further improve the upper bound by showing that $ρ\leq 4/3$ for the case of $d=1$ and unbounded parameter set. Simulation results demonstrate strong performance of the algorithm as compared to existing methods, and coincide with $ρ\leq 4/3$ even for large $d$ in practice.
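A single-pass projected SGD loop with weighted iterate averaging can be sketched as below. The abstract does not state the step sizes, the averaging weights, or the shape of the constraint set, so this sketch assumes a Euclidean ball of radius `radius`, a step size of `1/(k+1)`, and weights proportional to the iteration index; all three are illustrative assumptions, as are the function names.

```python
import numpy as np

def project_ball(theta, radius):
    """Euclidean projection onto the ball of the given radius (assumed constraint set)."""
    norm = np.linalg.norm(theta)
    return theta if norm <= radius else theta * (radius / norm)

def projected_sgd_weighted_avg(X, y, radius, step=lambda k: 1.0 / (k + 1)):
    """One pass of projected SGD for least squares with a weighted running average."""
    n, d = X.shape
    theta = np.zeros(d)
    avg = np.zeros(d)
    weight_sum = 0.0
    for k in range(n):  # each sample is used exactly once
        grad = (X[k] @ theta - y[k]) * X[k]
        theta = project_ball(theta - step(k) * grad, radius)
        w = k + 1  # assumed weight: proportional to the iteration index
        weight_sum += w
        avg += (w / weight_sum) * (theta - avg)  # running weighted mean
    return avg
```

Because every iterate lies in the ball and the output is a convex combination of iterates, the returned estimate also satisfies the constraint.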

preprint 2016 arXiv

Optimal Heavy-Traffic Queue Length Scaling in an Incompletely Saturated Switch

We consider an input queued switch operating under the MaxWeight scheduling algorithm. This system is interesting to study because it is a model for Internet routers and data center networks. Recently, it was shown that the MaxWeight algorithm has optimal heavy-traffic queue length scaling when all ports are uniformly saturated. Here we consider the case when an arbitrary number of ports are saturated (which we call the incompletely saturated case), and each port is allowed to saturate at a different rate. We use a recently developed drift technique to show that the heavy-traffic queue length under the MaxWeight scheduling algorithm has optimal scaling with respect to the switch size even in these cases.
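For readers unfamiliar with MaxWeight, the scheduling decision in an input-queued switch can be sketched as follows: in each time slot, choose the input-to-output matching with the largest total queue length. The brute-force search over permutations is an assumption made for brevity and is only practical for small switches; the function name and the matrix layout are likewise illustrative.

```python
from itertools import permutations

def maxweight_schedule(queues):
    """Return perm where input i is matched to output perm[i].

    queues[i][j] is the backlog at input i destined for output j.
    Brute force over all n! matchings, so only suitable for small n.
    """
    n = len(queues)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(queues[i][perm[i]] for i in range(n)),
    )
    return list(best)
```

A production scheduler would replace the enumeration with a maximum-weight bipartite matching routine.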

preprint 2015 arXiv

Algorithms with Logarithmic or Sublinear Regret for Constrained Contextual Bandits

We study contextual bandits with budget and time constraints, referred to as constrained contextual bandits. The time and budget constraints significantly complicate the exploration and exploitation tradeoff because they introduce complex coupling among contexts over time. Such coupling effects make it difficult to obtain oracle solutions that assume known statistics of bandits. To gain insight, we first study unit-cost systems with known context distribution. When the expected rewards are known, we develop an approximation of the oracle, referred to as Adaptive-Linear-Programming (ALP), which achieves near-optimality and only requires the ordering of expected rewards. With these highly desirable features, we then combine ALP with the upper-confidence-bound (UCB) method in the general case where the expected rewards are unknown a priori. We show that the proposed UCB-ALP algorithm achieves logarithmic regret except for certain boundary cases. Further, we design algorithms and obtain similar regret analysis results for more general systems with unknown context distribution and heterogeneous costs. To the best of our knowledge, this is the first work that shows how to achieve logarithmic regret in constrained contextual bandits. Moreover, this work also sheds light on the study of computationally efficient algorithms for general constrained contextual bandits.

preprint 2015 arXiv

Clustering and Inference From Pairwise Comparisons

Given a set of pairwise comparisons, the classical ranking problem computes a single ranking that best represents the preferences of all users. In this paper, we study the problem of inferring individual preferences, arising in the context of making personalized recommendations. In particular, we assume that there are $n$ users of $r$ types; users of the same type provide similar pairwise comparisons for $m$ items according to the Bradley-Terry model. We propose an efficient algorithm that accurately estimates the individual preferences for almost all users, if there are $r \max \{m, n\}\log m \log^2 n$ pairwise comparisons per type, which is near optimal in sample complexity when $r$ only grows logarithmically with $m$ or $n$. Our algorithm has three steps: first, for each user, compute the net-win vector, which is a projection of its $\binom{m}{2}$-dimensional vector of pairwise comparisons onto an $m$-dimensional linear subspace; second, cluster the users based on the net-win vectors; third, estimate a single preference for each cluster separately. The net-win vectors are much less noisy than the high dimensional vectors of pairwise comparisons and clustering is more accurate after the projection as confirmed by numerical experiments. Moreover, we show that, when a cluster is only approximately correct, the maximum likelihood estimation for the Bradley-Terry model is still close to the true preference.
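The first step of the three-step procedure can be sketched as follows. The abstract does not spell out the projection, so this example assumes the net-win entry for an item is the number of comparisons it won minus the number it lost for that user; the `(winner, loser)` pair encoding and the function name are also assumptions.

```python
def net_win_vector(comparisons, m):
    """Collapse one user's pairwise comparisons into an m-dimensional vector.

    comparisons is a list of (winner, loser) item-index pairs (assumed encoding).
    """
    net = [0] * m
    for winner, loser in comparisons:
        net[winner] += 1  # one more win for the preferred item
        net[loser] -= 1   # one more loss for the other item
    return net
```

For a user who prefers item 0 over items 1 and 2 and item 2 over item 1, `net_win_vector([(0, 1), (0, 2), (2, 1)], 3)` gives `[2, -2, 0]`. The second step would then cluster users on these vectors.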

preprint 2015 arXiv

Queue Length Behavior in a Switch under the MaxWeight Algorithm

We consider a switch operating under the MaxWeight scheduling algorithm, under any traffic pattern such that all the ports are loaded. This system is interesting to study since the queue lengths exhibit a multi-dimensional state-space collapse in the heavy-traffic regime. We use a Lyapunov-type drift technique to characterize the heavy-traffic behavior of the expectation of the sum queue lengths in steady-state, under the assumption that all ports are saturated and all queues receive non-zero traffic. Under these conditions, we show that the heavy-traffic scaled queue length is given by $\left(1-\frac{1}{2n}\right)||σ||^2$, where $σ$ is the vector of the standard deviations of arrivals to each port in the heavy-traffic limit. In the special case of uniform Bernoulli arrivals, the corresponding formula is given by $\left(n-\frac{3}{2}+\frac{1}{2n}\right)$. The result shows that the heavy-traffic scaled queue length has optimal scaling with respect to $n,$ thus settling one version of an open conjecture; in fact, it is shown that the heavy-traffic queue length is at most within a factor of two from the optimal. We then consider certain asymptotic regimes where the load of the system scales simultaneously with the number of ports. We show that the MaxWeight algorithm has optimal queue length scaling behavior provided that the arrival rate approaches capacity sufficiently fast.

preprint 2014 arXiv

Collaborative Filtering with Information-Rich and Information-Sparse Entities

In this paper, we consider a popular model for collaborative filtering in recommender systems where some users of a website rate some items, such as movies, and the goal is to recover the ratings of some or all of the unrated items of each user. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with $ω(MK \log M)$ noisy entries while $MK$ entries are necessary, where $K$ is the number of clusters and $M$ is the number of items. In the case of co-clustering, we prove that $K^2$ entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when $K$ is sufficiently large. We compare our algorithms with a well-known algorithm called alternating minimization (AM), and a similarity score-based algorithm known as the popularity-among-friends (PAF) algorithm by applying all three to the MovieLens and Netflix data sets. Our co-clustering algorithm and AM have similar overall error rates when recovering the rating matrix, both of which are lower than the error rate under PAF. But more importantly, the error rate of our co-clustering algorithm is significantly lower than AM and PAF in the scenarios of interest in recommender systems: when recommending a few items to each user or when recommending items to users who only rated a few items (these users are the majority of the total user population). The performance difference increases even more when noise is added to the datasets.

preprint 2014 arXiv

Jointly Clustering Rows and Columns of Binary Matrices: Algorithms and Trade-offs

In standard clustering problems, data points are represented by vectors, and by stacking them together, one forms a data matrix with row or column cluster structure. In this paper, we consider a class of binary matrices, arising in many applications, which exhibit both row and column cluster structure, and our goal is to exactly recover the underlying row and column clusters by observing only a small fraction of noisy entries. We first derive a lower bound on the minimum number of observations needed for exact cluster recovery. Then, we propose three algorithms with different running time and compare the number of observations needed by them for successful cluster recovery. Our analytical results show smooth time-data trade-offs: one can gradually reduce the computational complexity when increasingly more observations are available.

preprint 2014 arXiv

Learning Loosely Connected Markov Random Fields

We consider the structure learning problem for graphical models that we call loosely connected Markov random fields, in which the number of short paths between any pair of nodes is small, and present a new conditional independence test based algorithm for learning the underlying graph structure. The novel maximization step in our algorithm ensures that the true edges are detected correctly even when there are short cycles in the graph. The number of samples required by our algorithm is C*log p, where p is the size of the graph and the constant C depends on the parameters of the model. We show that several previously studied models are examples of loosely connected Markov random fields, and our algorithm achieves the same or lower computational complexity than the previously designed algorithms for individual cases. We also get new results for more general graphical models, in particular, our algorithm learns general Ising models on the Erdos-Renyi random graph G(p, c/p) correctly with running time O(np^5).

preprint 2013 arXiv

Achieving the Optimal Streaming Capacity and Delay Using Random Regular Digraphs in P2P Networks

In earlier work, we showed that it is possible to achieve $O(\log N)$ streaming delay with high probability in a peer-to-peer network, where each peer has as few as four neighbors, while achieving any arbitrary fraction of the maximum possible streaming rate. However, the constant in the $O(\log N)$ delay term becomes rather large as we get closer to the maximum streaming rate. In this paper, we design an alternative pairing and chunk dissemination algorithm that allows us to transmit at the maximum streaming rate while ensuring that all but a negligible fraction of the peers receive the data stream with $O(\log N)$ delay with high probability. The result is established by examining the properties of the graph formed by the union of two or more random 1-regular digraphs, i.e., directed graphs in which each node has an incoming and an outgoing node degree both equal to one.

preprint 2013 arXiv

Asymptotically Tight Steady-State Queue Length Bounds Implied By Drift Conditions

The Foster-Lyapunov theorem and its variants serve as the primary tools for studying the stability of queueing systems. In addition, it is well known that setting the drift of the Lyapunov function equal to zero in steady-state provides bounds on the expected queue lengths. However, such bounds are often very loose due to the fact that they fail to capture resource pooling effects. The main contribution of this paper is to show that the approach of "setting the drift of a Lyapunov function equal to zero" can be used to obtain bounds on the steady-state queue lengths which are tight in the heavy-traffic limit. The key is to establish an appropriate notion of state-space collapse in terms of steady-state moments of weighted queue length differences, and use this state-space collapse result when setting the Lyapunov drift equal to zero. As an application of the methodology, we prove the steady-state equivalent of the heavy-traffic optimality result of Stolyar for wireless networks operating under the MaxWeight scheduling policy.

preprint 2013 arXiv

Real-Time Peer-to-Peer Streaming Over Multiple Random Hamiltonian Cycles

We are motivated by the problem of designing a simple distributed algorithm for Peer-to-Peer streaming applications that can achieve high throughput and low delay, while allowing the neighbor set maintained by each peer to be small. While previous works have mostly used tree structures, our algorithm constructs multiple random directed Hamiltonian cycles and disseminates content over the superposed graph of the cycles. We show that it is possible to achieve the maximum streaming capacity even when each peer only transmits to and receives from Theta(1) neighbors. Further, we show that the proposed algorithm achieves the streaming delay of Theta(log N) when the streaming rate is less than (1-1/K) of the maximum capacity for any fixed integer K>1, where N denotes the number of peers in the network. The key theoretical contribution is to characterize the distance between peers in a graph formed by the superposition of directed random Hamiltonian cycles, in which edges from one of the cycles may be dropped at random. We use Doob martingales and graph expansion ideas to characterize this distance as a function of N, with high probability.
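The overlay construction described above, several independent random directed Hamiltonian cycles superposed on the same peers, can be sketched as follows. This covers only the graph construction, not the content dissemination or edge-dropping analysis; the function names and the `seed` parameter are assumptions for the example.

```python
import random

def random_hamiltonian_cycle(n, rng):
    """Return a successor map {peer: next_peer} forming one directed cycle over all n peers."""
    order = list(range(n))
    rng.shuffle(order)
    return {order[i]: order[(i + 1) % n] for i in range(n)}

def superposed_out_neighbors(n, num_cycles, seed=0):
    """Superpose num_cycles random Hamiltonian cycles; return each peer's out-neighbor set."""
    rng = random.Random(seed)
    neighbors = {v: set() for v in range(n)}
    for _ in range(num_cycles):
        for u, v in random_hamiltonian_cycle(n, rng).items():
            neighbors[u].add(v)
    return neighbors
```

Each peer ends up with at most `num_cycles` out-neighbors, which matches the constant-size neighbor sets mentioned in the abstract.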

preprint 2012 arXiv

Flow-Level Stability of Wireless Networks: Separation of Congestion Control and Packet Scheduling

It is by now well-known that wireless networks with file arrivals and departures are stable if one uses alpha-fair congestion control and back-pressure based scheduling and routing. In this paper, we examine whether alpha-fair congestion control is necessary for flow-level stability. We show that stability can be ensured even with very simple congestion control mechanisms, such as a fixed window size scheme which limits the maximum number of packets that are allowed into the ingress queue of a flow. A key ingredient of our result is the use of the difference between the logarithms of queue lengths as the link weights. This result is reminiscent of results in the context of CSMA algorithms, but for entirely different reasons.

preprint2012arXiv

Online Advertisement, Optimization and Stochastic Networks

In this paper, we propose a stochastic model to describe how search service providers charge client companies based on users' queries for the keywords related to these companies' ads by using certain advertisement assignment strategies. We formulate an optimization problem to maximize the long-term average revenue for the service provider under each client's long-term average budget constraint, and design an online algorithm which captures the stochastic properties of users' queries and click-through behaviors. We solve the optimization problem by making connections to scheduling problems in wireless networks, queueing theory and stochastic networks. Unlike prior models, we do not assume that the number of query arrivals is known. Due to the stochastic nature of the arrival process considered here, either temporary "free" service (i.e., service above the specified budget) or under-utilization of the budget is unavoidable. We prove that our online algorithm can achieve a revenue that is within $O(ε)$ of the optimal revenue while ensuring that the overdraft or underdraft is $O(1/ε)$, where $ε$ can be arbitrarily small. With a view towards practice, we show that one can always operate strictly under the budget. In addition, we extend our results to a click-through rate maximization model, and also show how our algorithm can be modified to handle non-stationary query arrival processes and clients with short-term contracts. Our algorithm allows us to quantify the effect of errors in click-through rate estimation on the achieved revenue. We also show that in the long run, an expected overdraft level of $Ω(\log(1/ε))$ is unavoidable (a universal lower bound) under any stationary ad assignment algorithm which achieves a long-term average revenue within $O(ε)$ of the offline optimum.

preprint2012arXiv

Opinion Dynamics in Social Networks: A Local Interaction Game with Stubborn Agents

The process by which new ideas, innovations, and behaviors spread through a large social network can be thought of as a networked interaction game: each agent obtains information from a certain number of agents in his friendship neighborhood, and adapts his idea or behavior to increase his benefit. In this paper, we are interested in how opinions about a certain topic form in social networks. We model opinions as continuous scalars ranging from 0 to 1, with 1 (0) representing an extremely positive (negative) opinion. Each agent has an initial opinion and incurs some cost depending on the opinions of his neighbors, his initial opinion, and his stubbornness about his initial opinion. Agents iteratively update their opinions based on their own initial opinions and on the observed opinions of their neighbors. The iterative update of an agent can be viewed as a myopic cost-minimization response (i.e., the so-called best response) to the others' actions. We study whether an equilibrium can emerge as a result of such local interactions and how such an equilibrium depends on the network structure, the initial opinions of the agents, the location of stubborn agents, and the extent of their stubbornness. We also study the speed of convergence to such an equilibrium and characterize the convergence time as a function of the aforementioned factors. We also discuss the implications of these results for a few well-known graphs such as Erdős-Rényi random graphs and small-world graphs.
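A best-response update of the kind described in this abstract can be sketched as follows. The abstract does not state the cost function, so the sketch assumes a quadratic cost: each agent pays the squared disagreement with every neighbor plus a stubbornness weight times the squared distance from his initial opinion. Under that assumption the myopic best response is a weighted average of the neighbors' opinions and the agent's own initial opinion. The synchronous update order, the function names, and the three-agent example are illustrative assumptions as well.

```python
def best_response_step(opinions, initial, stubbornness, neighbors):
    """One synchronous round of best responses under the assumed quadratic cost.

    Agent i minimizes
        sum_j (x_i - x_j)^2 + stubbornness[i] * (x_i - initial[i])^2
    over x_i, with j ranging over neighbors[i]. The minimizer is the weighted
    average computed below. Each agent needs at least one neighbor or a
    positive stubbornness, otherwise the denominator is zero.
    """
    updated = []
    for i, nbrs in enumerate(neighbors):
        k = stubbornness[i]
        total = sum(opinions[j] for j in nbrs) + k * initial[i]
        updated.append(total / (len(nbrs) + k))
    return updated

def run_to_equilibrium(initial, stubbornness, neighbors, tol=1e-10, max_iters=10000):
    """Iterate best responses until no opinion moves by more than tol."""
    x = list(initial)
    for t in range(max_iters):
        nxt = best_response_step(x, initial, stubbornness, neighbors)
        if max(abs(a - b) for a, b in zip(x, nxt)) < tol:
            return nxt, t + 1
        x = nxt
    return x, max_iters

# Hypothetical path graph 0 - 1 - 2: the two end agents are stubborn and
# start at opposite extremes; the middle agent is not stubborn at all.
initial = [1.0, 0.5, 0.0]
stubbornness = [1.0, 0.0, 1.0]
neighbors = [[1], [0, 2], [1]]
equilibrium, rounds = run_to_equilibrium(initial, stubbornness, neighbors)
print(equilibrium, rounds)
```

In this small example the opinions settle near 0.75, 0.5 and 0.25: the stubborn end agents are pulled toward the middle but do not fully agree, which shows how stubbornness and position in the graph shape the equilibrium.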

preprint2012arXiv

Towards a Theory of Anonymous Networking

The problem of anonymous networking when an eavesdropper observes packet timings in a communication network is considered. The goal is to hide the identities of source-destination nodes, and paths of information flow in the network. One way to achieve such anonymity is to use mixers. Mixers are nodes that receive packets from multiple sources and change the timing of packets, by mixing packets at the output links, to prevent the eavesdropper from finding sources of outgoing packets. In this paper, we consider two simple but fundamental scenarios: double input-single output mixer and double input-double output mixer. For the first case, we use the information-theoretic definition of anonymity, based on average entropy per packet, and find an optimal mixing strategy under a strict latency constraint. For the second case, perfect anonymity is considered, and maximal throughput strategies with perfect anonymity are found under a strict latency constraint and an average queue length constraint.

preprint2011arXiv

Parametrized Stochastic Multi-armed Bandits with Binary Rewards

In this paper, we consider the problem of multi-armed bandits with a large, possibly infinite number of correlated arms. We assume that the arms have Bernoulli distributed rewards, independent across time, where the probabilities of success are parametrized by known attribute vectors for each arm, as well as an unknown preference vector, each of dimension $n$. For this model, we seek an algorithm with a total regret that is sub-linear in time and independent of the number of arms. We present such an algorithm, which we call the Two-Phase Algorithm, and analyze its performance. We show upper bounds on the total regret which apply uniformly in time, for both the finite and infinite arm cases. The asymptotics of the finite arm bound show that for any $f \in ω(\log(T))$, the total regret can be made to be $O(n \cdot f(T))$. In the infinite arm case, the total regret is $O(\sqrt{n^3 T})$.

preprint2010arXiv

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks

Backpressure-based adaptive routing algorithms where each packet is routed along a possibly different path have been extensively studied in the literature. However, such algorithms typically result in poor delay performance and involve high implementation complexity. In this paper, we develop a new adaptive routing algorithm built upon the widely-studied back-pressure algorithm. We decouple the routing and scheduling components of the algorithm by designing a probabilistic routing table which is used to route packets to per-destination queues. The scheduling decisions in the case of wireless networks are made using counters called shadow queues. The results are also extended to the case of networks which employ simple forms of network coding. In that case, our algorithm provides a low-complexity solution to optimally exploit the routing-coding tradeoff.

preprint2010arXiv

Mixing Time of Glauber Dynamics With Parallel Updates and Heterogeneous Fugacities

Glauber dynamics is a powerful tool to generate randomized, approximate solutions to combinatorially difficult problems. Applications include Markov Chain Monte Carlo (MCMC) simulation and distributed scheduling for wireless networks. In this paper, we derive bounds on the mixing time of a generalization of Glauber dynamics where multiple vertices are allowed to update their states in parallel and the fugacity of each vertex can be different. The results can be used to obtain various conditions on the system parameters such as fugacities, vertex degrees and update probabilities, under which the mixing time grows polynomially in the number of vertices.

preprint2010arXiv

Novel Architectures and Algorithms for Delay Reduction in Back-pressure Scheduling and Routing

The back-pressure algorithm is a well-known throughput-optimal algorithm. However, its delay performance may be quite poor even when the traffic load is not close to network capacity due to the following two reasons. First, each node has to maintain a separate queue for each commodity in the network, and only one queue is served at a time. Second, the back-pressure routing algorithm may route some packets along very long routes. In this paper, we present solutions to address both of the above issues, and hence, improve the delay performance of the back-pressure algorithm. One of the suggested solutions also decreases the complexity of the queueing data structures to be maintained at each node.

preprint2010arXiv

On the Design of Efficient CSMA Algorithms for Wireless Networks

Recently, it has been shown that CSMA algorithms which use queue length-based link weights can achieve throughput optimality in wireless networks. In particular, a key result by Rajagopalan, Shah, and Shin (2009) shows that, if the link weights are chosen to be of the form log(log(q)) (where q is the queue-length), then throughput optimality is achieved. In this paper, we tighten their result by showing that throughput optimality is preserved even with weight functions of the form log(q)/g(q), where g(q) can be a function that increases arbitrarily slowly. The significance of the result is due to the fact that weight functions of the form log(q)/g(q) seem to achieve the best delay performance in practice.
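The two families of weight functions compared in this abstract can be put side by side in a short sketch. The abstract only says that g(q) may increase arbitrarily slowly, so the particular `slow_g` below, the offsets added inside the logarithms to keep them defined at q = 0, and the logistic activation rule that maps a weight to a transmission probability are all assumptions chosen for illustration, not the paper's definitions.

```python
import math

def weight_loglog(q):
    """Weight of the form log(log(q)); q is shifted by e (an assumed offset)
    so that the weight is defined and equal to 0 for an empty queue."""
    return math.log(math.log(q + math.e))

def slow_g(q):
    """A hypothetical slowly increasing g(q), with g(0) = 1."""
    return math.log(math.e + math.log(1 + q))

def weight_log_over_g(q, g=slow_g):
    """Weight of the form log(q)/g(q), using log(1 + q) to handle q = 0."""
    return math.log(1 + q) / g(q)

def activation_probability(weight):
    """Assumed CSMA rule: an idle link with weight w attempts transmission
    with probability e^w / (1 + e^w)."""
    return 1.0 / (1.0 + math.exp(-weight))

for q in (0, 10, 100, 1000, 10000):
    w1, w2 = weight_loglog(q), weight_log_over_g(q)
    print(q, round(w1, 3), round(w2, 3),
          round(activation_probability(w1), 3),
          round(activation_probability(w2), 3))
```

Under these assumptions the log(q)/g(q) weight grows much faster with the queue length than log(log(q)), so long queues are served more aggressively, which is consistent with the abstract's remark about delay performance.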

preprint2010arXiv

Scheduling for Optimal Rate Allocation in Ad Hoc Networks With Heterogeneous Delay Constraints

This paper studies the problem of scheduling in single-hop wireless networks with real-time traffic, where every packet arrival has an associated deadline and a minimum fraction of packets must be transmitted before the end of the deadline. Using optimization and stochastic network theory we propose a framework to model the quality of service (QoS) requirements under delay constraints. The model allows for fairly general arrival models with heterogeneous constraints. The framework results in an optimal scheduling algorithm which fairly allocates data rates to all flows while meeting long-term delay demands. We also prove that under a simplified scenario our solution translates into a greedy strategy that makes optimal decisions with low complexity.

preprint2010arXiv

Throughput-Optimal Opportunistic Scheduling in the Presence of Flow-Level Dynamics

We consider multiuser scheduling in wireless networks with channel variations and flow-level dynamics. Recently, it has been shown that the MaxWeight algorithm, which is throughput-optimal in networks with a fixed number of users, fails to achieve the maximum throughput in the presence of flow-level dynamics. In this paper, we propose a new algorithm, called workload-based scheduling with learning, which is provably throughput-optimal, requires no prior knowledge of channels and user demands, and performs significantly better than previously suggested algorithms.

38 published item(s)

A Policy Gradient Algorithm for the Risk-Sensitive Exponential Cost MDP

Finite-Time Analysis of Entropy-Regularized Neural Natural Actor-Critic Algorithm

The Mean-Squared Error of Double Q-Learning

Heavy-Traffic Insensitive Bounds for Weighted Proportionally Fair Bandwidth Sharing Policies

Improved Algorithms for Misspecified Linear Markov Decision Processes

On Concentration Inequalities for Vector-Valued Lipschitz Functions

Optimistic Policy Iteration for MDPs with Acyclic Transient State Structure

Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

Robust Multi-Agent Multi-Armed Bandits

Budget-Constrained Bandits over General Cost and Reward Distributions

Continuous-Time Multi-Armed Bandits with Controlled Restarts

Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks

Optimal Load Balancing in Bipartite Graphs

Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity

Distributed Learning Algorithms for Spectrum Sharing in Spatial Random Access Wireless Networks

Mixing Times and Structural Inference for Bernoulli Autoregressive Processes

On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression

Optimal Heavy-Traffic Queue Length Scaling in an Incompletely Saturated Switch

Algorithms with Logarithmic or Sublinear Regret for Constrained Contextual Bandits

Clustering and Inference From Pairwise Comparisons

Queue Length Behavior in a Switch under the MaxWeight Algorithm

Collaborative Filtering with Information-Rich and Information-Sparse Entities

Jointly Clustering Rows and Columns of Binary Matrices: Algorithms and Trade-offs

Learning Loosely Connected Markov Random Fields

Achieving the Optimal Streaming Capacity and Delay Using Random Regular Digraphs in P2P Networks

Asymptotically Tight Steady-State Queue Length Bounds Implied By Drift Conditions

Real-Time Peer-to-Peer Streaming Over Multiple Random Hamiltonian Cycles

Flow-Level Stability of Wireless Networks: Separation of Congestion Control and Packet Scheduling

Online Advertisement, Optimization and Stochastic Networks

Opinion Dynamics in Social Networks: A Local Interaction Game with Stubborn Agents

Towards a Theory of Anonymous Networking

Parametrized Stochastic Multi-armed Bandits with Binary Rewards

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks

Mixing Time of Glauber Dynamics With Parallel Updates and Heterogeneous Fugacities

Novel Architectures and Algorithms for Delay Reduction in Back-pressure Scheduling and Routing

On the Design of Efficient CSMA Algorithms for Wireless Networks

Scheduling for Optimal Rate Allocation in Ad Hoc Networks With Heterogeneous Delay Constraints

Throughput-Optimal Opportunistic Scheduling in the Presence of Flow-Level Dynamics