Source author record

Sharayu Moharir

Sharayu Moharir appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Systems and Control eess.SY Information Theory math.IT Networking and Internet Architecture Machine Learning Data Structures and Algorithms Social and Information Networks Artificial Intelligence Computer Science and Game Theory Distributed, Parallel, and Cluster Computing math.OC

Catalog footprint

What is connected

16works

12topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

On-line Learning in Tree MDPs by Treating Policies as Bandit Arms

A Tree Markov Decision Problem (T-MDP) is a finite-horizon MDP with a starting state $s_{1}$, in which every state is reachable from $s_{1}$ through exactly one state-action trajectory. T-MDPs arise naturally as abstractions of decision making in sequential games with perfect recall, against stationary opponents. We consider the problem of on-line learning in T-MDPs, both in the PAC and the regret-minimisation regimes. We show that well-known bandit algorithms -- \textsc{Lucb} and \textsc{Ucb} -- can be applied on T-MDPs by treating each policy as an arm. The apparent technical challenge in this approach is that the number of policies is exponential in the number of states. Our main innovation is in the design of confidence bounds based on data shared by the policies, so that the bandit algorithms can yet be implemented with polynomial memory and per-step computation. We obtain instance-dependent upper bounds on sample complexity and regret that sum a ``gap term'' from every terminal state, rather than every policy. Empirically, our algorithms consistently outperform available alternatives on a suite of hidden-information games.

preprint2022arXiv

Greedy k-Center from Noisy Distance Samples

We study a variant of the canonical k-center problem over a set of vertices in a metric space, where the underlying distances are apriori unknown. Instead, we can query an oracle which provides noisy/incomplete estimates of the distance between any pair of vertices. We consider two oracle models: Dimension Sampling where each query to the oracle returns the distance between a pair of points in one dimension; and Noisy Distance Sampling where the oracle returns the true distance corrupted by noise. We propose active algorithms, based on ideas such as UCB, Thompson Sampling and Track-and-Stop developed in the closely related Multi-Armed Bandit problem, which adaptively decide which queries to send to the oracle and are able to solve the k-center problem within an approximation ratio of two with high probability. We analytically characterize instance-dependent query complexity of our algorithms and also demonstrate significant improvements over naive implementations via numerical evaluations on two real-world datasets (Tiny ImageNet and UT Zappos50K).

preprint2022arXiv

Multi-Model Federated Learning

Federated learning is a form of distributed learning with the key challenge being the non-identically distributed nature of the data in the participating clients. In this paper, we extend federated learning to the setting where multiple unrelated models are trained simultaneously. Specifically, every client is able to train any one of M models at a time and the server maintains a model for each of the M models which is typically a suitably averaged version of the model computed by the clients. We propose multiple policies for assigning learning tasks to clients over time. In the first policy, we extend the widely studied FedAvg to multi-model learning by allotting models to clients in an i.i.d. stochastic manner. In addition, we propose two new policies for client selection in a multi-model federated setting which make decisions based on current local losses for each client-model pair. We compare the performance of the policies on tasks involving synthetic and real-world data and characterize the performance of the proposed policies. The key take-away from our work is that the proposed multi-model policies perform better or at least as good as single model training using FedAvg.

preprint2022arXiv

Online Partial Service Hosting at the Edge

We consider the problem of service hosting where an application provider can dynamically rent edge computing resources and serve user requests from the edge to deliver a better quality of service. A key novelty of this work is that we allow the service to be hosted partially at the edge which enables a fraction of the user query to be served by the edge. We model the total cost for (partially) hosting a service at the edge as a combination of the latency in serving requests, the bandwidth consumption, and the time-varying cost for renting edge resources. We propose an online policy called $α$-RetroRenting ($α$-RR) which dynamically determines the fraction of the service to be hosted at the edge in any time-slot, based on the history of the request arrivals and the rent cost sequence. As our main result, we derive an upper bound on $α$-RR's competitive ratio with respect to the offline optimal policy that knows the entire request arrival and rent cost sequence in advance. We conduct extensive numerical evaluations to compare the performance of $α$-RR with various benchmarks for synthetic and trace-based request arrival and rent cost processes, and find several parameter regimes where $α$-RR's ability to store the service partially greatly improves cost-efficiency.

preprint2021arXiv

Decentralized Age-of-Information Bandits

Age-of-Information (AoI) is a performance metric for scheduling systems that measures the freshness of the data available at the intended destination. AoI is formally defined as the time elapsed since the destination received the recent most update from the source. We consider the problem of scheduling to minimize the cumulative AoI in a multi-source multi-channel setting. Our focus is on the setting where channel statistics are unknown and we model the problem as a distributed multi-armed bandit problem. For an appropriately defined AoI regret metric, we provide analytical performance guarantees of an existing UCB-based policy for the distributed multi-armed bandit problem. In addition, we propose a novel policy based on Thomson Sampling and a hybrid policy that tries to balance the trade-off between the aforementioned policies. Further, we develop AoI-aware variants of these policies in which each source takes its current AoI into account while making decisions. We compare the performance of various policies via simulations.

preprint2021arXiv

On Minimizing Channel-Aware Age of Information in a Multi-Sensor Setting

We propose a variant of the Age of Information (AoI) metric called Channel-Aware Age of Information (CA-AoI). Unlike AoI, CA-AoI takes into account the channel conditions between the source and the intended destination to compute the "age" of the recent most update received by the destination. This new metric ensures that the resource allocation is not heavily tilted towards the sources with poor channel conditions. We design scheduling policies for multi-sensor systems in which sensors report their measurements to a central monitoring station via a shared unreliable communication channel with the goal of minimizing the time-average of the weighted sum of CA-AoIs. We initially derive universal lower bounds for the freshness objective. We show that the scheduling problem is indexable and derive low complexity Whittle index based scheduling policies. We also design stationary randomized scheduling algorithms and give optimization procedures to find the optimal parameters of the policy. Via simulations, we show that our proposed policies surpass the greedy policy in several settings. Moreover the Whittle Index based scheduling policies outperform other policies in all the settings considered.

preprint2020arXiv

Coded Caching for Heterogeneous Wireless Networks

This chapter provides an overview of coded caching in the context of heterogeneous wireless networks. We begin by briefly describing the key idea behind coded caching and then discuss in detail the impact of various aspects such as non-uniform content popularity, multiple cache access, and interference.

preprint2020arXiv

Correlated Age-of-Information Bandits

We consider a system composed of a sensor node tracking a time varying quantity. In every discretized time slot, the node attempts to send an update to a central monitoring station through one of K communication channels. We consider the setting where channel realizations are correlated across channels. This is motivated by mmWave based 5G systems where line-of-sight which is critical for successful communication is common across all frequency channels while the effect of other factors like humidity is frequency dependent. The metric of interest is the Age-of-Information (AoI) which is a measure of the freshness of the data available at the monitoring station. In the setting where channel statistics are unknown but stationary across time and correlated across channels, the algorithmic challenge is to determine which channel to use in each time-slot for communication. We model the problem as a Multi-Armed bandit (MAB) with channels as arms. We characterize the fundamental limits on the performance of any policy. In addition, via analysis and simulations, we characterize the performance of variants of the UCB and Thompson Sampling policies that exploit correlation.

preprint2020arXiv

Duopolistic platform competition for revenue and throughput

We consider two competing platforms operating in a two-sided market and offering identical services to their customers at potentially different prices. The objective of each platform is to maximize its throughput or revenue by suitably pricing its services. We assume that customers have a preference or loyalty to the platforms while the workers freelance for the two platforms. Assuming that the resulting interaction between the users is such that their aggregate utility is maximized, we show that for each value of the loyalty, there exists a pure strategy Nash equilibrium for both the throughput and revenue competition game and characterize it.

preprint2020arXiv

Influencing Opinion Dynamics in Networks with Limited Interaction

The focus of this work is on designing influencing strategies to shape the collective opinion of a network of individuals. We consider a variant of the voter model where opinions evolve in one of two ways. In the absence of external influence, opinions evolve via interactions between individuals in the network, while, in the presence of external influence, opinions shift in the direction preferred by the influencer. We focus on a finite time-horizon and an influencing strategy is characterized by when it exerts influence in this time-horizon given its budget constraints. Prior work on this opinion dynamics model assumes that individuals take into account the opinion of all individuals in the network. We generalize this and consider the setting where the opinion evolution of an individual depends on a limited collection of opinions from the network. We characterize the nature of optimal influencing strategies as a function of the way in which this collection of opinions is formed.

preprint2020arXiv

Influencing Opinions of Heterogeneous Populations over Finite Time Horizons

In this work, we focus on strategies to influence the opinion dynamics of a well-connected society. We propose a generalization of the popular voter model. This variant of the voter model can capture a wide range of individuals including strong-willed individuals whose opinion evolution is independent of their neighbors as well as conformist/rebel individuals who tend to adopt the opinion of the majority/minority. Motivated by political campaigns which aim to influence opinion dynamics by the end of a fixed deadline, we focus on influencing strategies for finite time horizons. We characterize the nature of optimal influencing strategies as a function of the nature of individuals forming the society. Using this, we show that for a society consisting of predominantly strong-willed/rebel individuals, the optimal strategy is to influence towards the end of the finite time horizon, whereas, for a society predominantly consisting of conformist individuals who try to adopt the opinion of the majority, it could be optimal to influence in the initial phase of the finite time horizon.

preprint2020arXiv

Regret of Age-of-Information Bandits

We consider a system with a single source that measures/tracks a time-varying quantity and periodically attempts to report these measurements to a monitoring station. Each update from the source has to be scheduled on one of K available communication channels. The probability of success of each attempted communication is a function of the channel used. This function is unknown to the scheduler. The metric of interest is the Age-of-Information (AoI), formally defined as the time elapsed since the destination received the recent most update from the source. We model our scheduling problem as a variant of the multi-arm bandit problem with communication channels as arms. We characterize a lower bound on the AoI regret achievable by any policy and characterize the performance of UCB, Thompson Sampling, and their variants. Our analytical results show that UCB and Thompson sampling are order-optimal for AoI bandits. In addition, we propose novel policies which, unlike UCB and Thompson Sampling, use the current AoI to make scheduling decisions. Via simulations, we show the proposed AoI-aware policies outperform existing AoI-agnostic policies.

preprint2020arXiv

Resource Pooling in Large-Scale Content Delivery Systems

Content delivery networks are a key infrastructure component used by Video on Demand (VoD) services to deliver content over the Internet. We study a content delivery system consisting of a central server and multiple co-located caches, each with limited storage and service capabilities. This work evaluates the performance of such a system as a function of the storage capacity of the caches, the content replication strategy, and the service policy. This analysis can be used for a system-level optimization of these design choices. The focus of this work is on understanding the benefits of allowing caches to pool their resources to serve user requests. We show that the benefits of resource pooling depend on the popularity profile of the contents offered by the VoD service. More specifically, if the popularity does not vary drastically across contents, resource pooling can lead to significant improvements in the system performance. In contrast, if the content popularity is uneven, the benefits of resource pooling are negligible.

preprint2016arXiv

Content Replication in Large Distributed Caches

In this paper, we consider the algorithmic task of content replication and request routing in a distributed caching system consisting of a central server and a large number of caches, each with limited storage and service capabilities. We study a time-slotted system where in each time-slot, a large batch of requests has to be matched to a large number of caches, where each request can be served by any cache which stores the requested content. All requests which cannot be served by the caches are served by fetching the requested content from the central server. The goal is to minimize the transmission rate from the central server. We use a novel mapping between our content replication problem and the Knapsack problem to prove a lower bound on the transmission rate for any content replication policy. Using insights obtained from the mapping, we propose a content replication policy - Knapsack Storage - which achieves this lower bound. While it intuitively makes sense to replicate the more popular contents on a larger number of caches, surprisingly, in certain cases, the Knapsack Storage policy chooses not to replicate the most popular contents on the caches at all.

preprint2016arXiv

Paging with Multiple Caches

Modern content delivery networks consist of one or more back-end servers which store the entire content catalog, assisted by multiple front-end servers with limited storage and service capacities located near the end-users. Appropriate replication of content on the front-end servers is key to maximize the fraction of requests served by the front-end servers. Motivated by this, a multiple cache variant of the classical single cache paging problem is studied, which is referred to as the Multiple Cache Paging (MCP) problem. In each time-slot, a batch of content requests arrive that have to be served by a bank of caches, and each cache can serve exactly one request. If a content is not found in the bank, it is fetched from the back-end server, and one currently stored content is ejected, and counted as fault. As in the classical paging problem, the goal is to minimize the total number of faults. The competitive ratio of any online algorithm for the MCP problem is shown to be unbounded for arbitrary input, thus concluding that the MCP problem is fundamentally different from the classical paging problem. Consequently, stochastic arrivals setting is considered, where requests arrive according to a known/unknown stochastic process. It is shown that near optimal performance can be achieved with simple policies that require no co-ordination across the caches.

preprint2014arXiv

Serving Content with Unknown Demand:the High-Dimensional Regime

In this paper we look at content placement in the high-dimensional regime: there are n servers, and O(n) distinct types of content. Each server can store and serve O(1) types at any given time. Demands for these content types arrive, and have to be served in an online fashion; over time, there are a total of O(n) of these demands. We consider the algorithmic task of content placement: determining which types of content should be on which server at any given time, in the setting where the demand statistics (i.e. the relative popularity of each type of content) are not known a-priori, but have to be inferred from the very demands we are trying to satisfy. This is the high- dimensional regime because this scaling (everything being O(n)) prevents consistent estimation of demand statistics; it models many modern settings where large numbers of users, servers and videos/webpages interact in this way. We characterize the performance of any scheme that separates learning and placement (i.e. which use a portion of the demands to gain some estimate of the demand statistics, and then uses the same for the remaining demands), showing it is order-wise strictly suboptimal. We then study a simple adaptive scheme - which myopically attempts to store the most recently requested content on idle servers - and show it outperforms schemes that separate learning and placement. Our results also generalize to the setting where the demand statistics change with time. Overall, our results demonstrate that separating the estimation of demand, and the subsequent use of the same, is strictly suboptimal.

Sharayu Moharir

What is connected

Connect this record

See the researcher in context

Building this map preview

16 published item(s)

On-line Learning in Tree MDPs by Treating Policies as Bandit Arms

Greedy k-Center from Noisy Distance Samples

Multi-Model Federated Learning

Online Partial Service Hosting at the Edge

Decentralized Age-of-Information Bandits

On Minimizing Channel-Aware Age of Information in a Multi-Sensor Setting

Coded Caching for Heterogeneous Wireless Networks

Correlated Age-of-Information Bandits

Duopolistic platform competition for revenue and throughput

Influencing Opinion Dynamics in Networks with Limited Interaction

Influencing Opinions of Heterogeneous Populations over Finite Time Horizons

Regret of Age-of-Information Bandits

Resource Pooling in Large-Scale Content Delivery Systems

Content Replication in Large Distributed Caches

Paging with Multiple Caches

Serving Content with Unknown Demand:the High-Dimensional Regime