Source author record

Debankur Mukherjee

Debankur Mukherjee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.PR Performance Data Structures and Algorithms Machine Learning math.OC Networking and Internet Architecture

Catalog footprint

What is connected

11works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Higher-Order Approximations of Sojourn Times in M/G/1 Queues via Stein's Method

We study the stationary sojourn time distribution in an M/G/1 queue operating under heavy traffic. It is known that the sojourn time converges to an exponential distribution in the limit. Our focus is on obtaining pre-asymptotic, higher-order approximations that go beyond the classical exponential limit. Using Stein's method, we develop an approach based on higher-order expansions of the generator of the underlying Markov process. The key technical step is to represent higher-order derivatives in terms of lower-order ones and control the resulting error via derivative bounds of the Stein equation. Under suitable moment-matching conditions on the service distribution, we show that the approximation error decays as a high-order power of the slack parameter $\varepsilon=1-ρ$. Error bounds are established in the Zolotarev metric, which further imply bounds on the Wasserstein distance as well as the moments. Our results demonstrate that the accuracy of the exponential approximation can be systematically improved by matching progressively more moments of the service distribution.

preprint2026arXiv

SCaLE: Switching Cost aware Learning and Exploration

This work addresses the fundamental problem of unbounded metric movement costs in bandit online convex optimization, by considering high-dimensional dynamic quadratic hitting costs and $\ell_2$-norm switching costs in a noisy bandit feedback model. For a general class of stochastic environments, we provide the first algorithm SCaLE that provably achieves a distribution-agnostic sub-linear dynamic regret, without the knowledge of hitting cost structure. En-route, we present a novel spectral regret analysis that separately quantifies eigenvalue-error driven regret and eigenbasis-perturbation driven regret. Extensive numerical experiments, against online-learning baselines, corroborate our claims, and highlight statistical consistency of our algorithm.

preprint2023arXiv

Mean-field Analysis for Load Balancing on Spatial Graphs

The analysis of large-scale, parallel-server load balancing systems has relied heavily on mean-field analysis. A pivotal assumption for this framework is that the servers are exchangeable. However, modern data-centers have data locality constraints, where tasks of a particular type can only be routed to a small subset of servers. An emerging line of research, therefore, considers load balancing algorithms on bipartite graphs where vertices in the two partitions represent the task types and servers, respectively, and an edge represents the server's ability to process the corresponding task type. Due to the lack of exchangeability in this model, the mean-field techniques fundamentally break down. Recent progress has been made by considering graphs with strong edge-expansion properties, i.e., where any two large subsets of vertices are well-connected. However, data locality often leads to spatial constraints, where edges are local. As a result, these bipartite graphs do not have strong expansion properties. In this paper, we consider the power-of-$d$ choices algorithm and develop a novel coupling-based approach to establish mean-field approximation for a large class of graphs that includes spatial graphs. As a corollary, we also show that, as the system size becomes large, the steady-state occupancy process for arbitrary regular bipartite graphs with diverging degrees, is indistinguishable from a fully flexible system on a complete bipartite graph. The method extends the scope of mean-field analysis far beyond the classical full-flexibility setup. En route, we prove that, starting from suitable states, the occupancy process becomes close to its steady state in a time that is independent of $N$. Such a large-scale mixing-time result might be of independent interest. Numerical experiments are conducted, which positively support the theoretical results.

preprint2022arXiv

Online Capacity Scaling Augmented With Unreliable Machine Learning Predictions

Modern data centers suffer from immense power consumption. As a result, data center operators have heavily invested in capacity scaling solutions, which dynamically deactivate servers if the demand is low and activate them again when the workload increases. We analyze a continuous-time model for capacity scaling, where the goal is to minimize the weighted sum of flow-time, switching cost, and power consumption in an online fashion. We propose a novel algorithm, called Adaptive Balanced Capacity Scaling (ABCS), that has access to black-box machine learning predictions. ABCS aims to adapt to the predictions and is also robust against unpredictable surges in the workload. In particular, we prove that ABCS is $(1+\varepsilon)$-competitive if the predictions are accurate, and yet, it has a uniformly bounded competitive ratio even if the predictions are completely inaccurate. Finally, we investigate the performance of this algorithm on a real-world dataset and carry out extensive numerical experiments, which positively support the theoretical results.

preprint2021arXiv

Scalable load balancing in networked systems: A survey of recent advances

The basic load balancing scenario involves a single dispatcher where tasks arrive that must immediately be forwarded to one of $N$ single-server queues. We discuss recent advances on scalable load balancing schemes which provide favorable delay performance when $N$ grows large, and yet only require minimal implementation overhead. Join-the-Shortest-Queue (JSQ) yields vanishing delays as $N$ grows large, as in a centralized queueing arrangement, but involves a prohibitive communication burden. In contrast, power-of-$d$ or JSQ($d$) schemes that assign an incoming task to a server with the shortest queue among $d$ servers selected uniformly at random require little communication, but lead to constant delays. In order to examine this fundamental trade-off between delay performance and implementation overhead, we consider JSQ($d(N)$) schemes where the diversity parameter $d(N)$ depends on $N$ and investigate what growth rate of $d(N)$ is required to asymptotically match the optimal JSQ performance on fluid and diffusion scale. Stochastic coupling techniques and stochastic-process limits play an instrumental role in establishing the asymptotic optimality. We demonstrate how this methodology carries over to infinite-server settings, finite buffers, multiple dispatchers, servers arranged on graph topologies, and token-based load balancing including the popular Join-the-Idle-Queue (JIQ) scheme. In this way we provide a broad overview of the many recent advances in the field. This survey extends the short review presented at ICM 2018 (arXiv:1712.08555).

preprint2020arXiv

Load Balancing Under Strict Compatibility Constraints

We study large-scale systems operating under the JSQ$(d)$ policy in the presence of stringent task-server compatibility constraints. Consider a system with $N$ identical single-server queues and $M(N)$ task types, where each server is able to process only a small subset of possible task types. Each arriving task selects $d\geq 2$ random servers compatible to its type, and joins the shortest queue among them. The compatibility constraint is naturally captured by a fixed bipartite graph $G_N$ between the servers and the task types. When $G_N$ is complete bipartite, the meanfield approximation is proven to be accurate. However, such dense compatibility graphs are infeasible due to their overwhelming implementation cost and prohibitive storage capacity requirement at the servers. Our goal in this paper is to characterize the class of sparse compatibility graphs for which the meanfield approximation remains valid. To achieve this, first, we introduce a novel graph expansion-based notion, called proportional sparsity, and establish that systems with proportionally sparse compatibility graphs match the performance of a fully flexible system, asymptotically in the large-system limit. Furthermore, for any $c(N)$ satisfying $$\frac{Nc(N)}{M(N)\ln(N)}\to \infty\quad \text{and}\quad c(N)\to \infty,$$ as $N\to\infty$, we show that proportionally sparse random compatibility graphs can be designed, so that the degree of each server is at most $c(N)$. This reduces the server-degree almost by a factor $N/\ln(N)$, compared to the complete bipartite compatibility graph, while maintaining the same asymptotic performance. Extensive simulation experiments are conducted to corroborate the theoretical results.

preprint2019arXiv

Join-the-Shortest Queue Diffusion Limit in Halfin-Whitt Regime: Sensitivity on the Heavy-traffic Parameter

Consider a system of $N$ parallel single-server queues with unit-exponential service time distribution and a single dispatcher where tasks arrive as a Poisson process of rate $λ(N)$. When a task arrives, the dispatcher assigns it to one of the servers according to the Join-the-Shortest Queue (JSQ) policy. Eschenfeldt and Gamarnik (Math. Oper. Res., 43(3):867-886, 2018) identified a novel limiting diffusion process that arises as the weak-limit of the appropriately scaled occupancy measure of the system under the JSQ policy in the Halfin-Whitt regime, where $(N - λ(N)) / \sqrt{N} \to β> 0$ as $N \to \infty$. The analysis of this diffusion goes beyond the state of the art techniques, and even proving its ergodicity is non-trivial, and was left as an open question. Recently, exploiting a generator expansion framework via the Stein's method, Braverman (arXiv:1801.05121, 2018) established its exponential ergodicity, and adapting a regenerative approach, Banerjee and Mukherjee (Ann. Appl. Probab., 29(2):1262-1309, 2018) analyzed the tail properties of the stationary distribution and path fluctuations of the diffusion. However, the analysis of the bulk behavior of the stationary distribution, viz., the moments, remained intractable until this work. In this paper, we perform a thorough analysis of the bulk behavior of the stationary distribution of the diffusion process, and discover that it exhibits different qualitative behavior, depending on the value of the heavy-traffic parameter $β$. Moreover, we obtain precise asymptotic laws of the centered and scaled steady state distribution, as $β$ tends to 0 and $\infty$. Of particular interest, we also establish a certain intermittency phenomena in the $β\to \infty$ regime and a surprising distributional convergence result in the $β\to 0$ regime.

preprint2018arXiv

Join-Idle-Queue with Service Elasticity: Large-Scale Asymptotics of a Non-monotone System

We consider the model of a token-based joint auto-scaling and load balancing strategy, proposed in a recent paper by Mukherjee, Dhara, Borst, and van Leeuwaarden (SIGMETRICS '17, arXiv:1703.08373), which offers an efficient scalable implementation and yet achieves asymptotically optimal steady-state delay performance and energy consumption as the number of servers $N\to\infty$. In the above work, the asymptotic results are obtained under the assumption that the queues have fixed-size finite buffers, and therefore the fundamental question of stability of the proposed scheme with infinite buffers was left open. In this paper, we address this fundamental stability question. The system stability under the usual subcritical load assumption is not automatic. Moreover, the stability may not even hold for all $N$. The key challenge stems from the fact that the process lacks monotonicity, which has been the powerful primary tool for establishing stability in load balancing models. We develop a novel method to prove that the subcritically loaded system is stable for large enough $N$, and establish convergence of steady-state distributions to the optimal one, as $N \to \infty$. The method goes beyond the state of the art techniques -- it uses an induction-based idea and a "weak monotonicity" property of the model; this technique is of independent interest and may have broader applicability.

preprint2018arXiv

Supermarket Model on Graphs

We consider a variation of the supermarket model in which the servers can communicate with their neighbors and where the neighborhood relationships are described in terms of a suitable graph. Tasks with unit-exponential service time distributions arrive at each vertex as independent Poisson processes with rate $λ$, and each task is irrevocably assigned to the shortest queue among the one it first appears and its $d-1$ randomly selected neighbors. This model has been extensively studied when the underlying graph is a clique in which case it reduces to the well known power-of-$d$ scheme. In particular, results of Mitzenmacher (1996) and Vvedenskaya et al. (1996) show that as the size of the clique gets large, the occupancy process associated with the queue-lengths at the various servers converges to a deterministic limit described by an infinite system of ordinary differential equations (ODE). In this work, we consider settings where the underlying graph need not be a clique and is allowed to be suitably sparse. We show that if the minimum degree approaches infinity (however slowly) as the number of servers $N$ approaches infinity, and the ratio between the maximum degree and the minimum degree in each connected component approaches 1 uniformly, the occupancy process converges to the same system of ODE as the classical supermarket model. In particular, the asymptotic behavior of the occupancy process is insensitive to the precise network topology. We also study the case where the graph sequence is random, with the $N$-th graph given as an Erdős-Rényi random graph on $N$ vertices with average degree $c(N)$. Annealed convergence of the occupancy process to the same deterministic limit is established under the condition $c(N)\to\infty$, and under a stronger condition $c(N)/\ln N\to\infty$, convergence (in probability) is shown for almost every realization of the random graph.

preprint2016arXiv

Asymptotic Optimality of Power-of-$d$ Load Balancing in Large-Scale Systems

We consider a system of $N$ identical server pools and a single dispatcher where tasks arrive as a Poisson process of rate $λ(N)$. Arriving tasks cannot be queued, and must immediately be assigned to one of the server pools to start execution, or discarded. The execution times are assumed to be exponentially distributed with unit mean, and do not depend on the number of other tasks receiving service. However, the experienced performance (e.g. in terms of received throughput) does degrade with an increasing number of concurrent tasks at the same server pool. The dispatcher therefore aims to evenly distribute the tasks across the various server pools. Specifically, when a task arrives, the dispatcher assigns it to the server pool with the minimum number of tasks among $d(N)$ randomly selected server pools. This assignment strategy is called the JSQ$(d(N))$ scheme, as it resembles the power-of-$d$ version of the Join-the-Shortest-Queue (JSQ) policy, and will also be referred to as such in the special case $d(N) = N$. We construct a stochastic coupling to bound the difference in the system occupancy processes between the JSQ policy and a scheme with an arbitrary value of $d(N)$. We use the coupling to derive the fluid limit in case $d(N) \to \infty$ and $λ(N)/N \to λ$ as $N \to \infty$, along with the associated fixed point. The fluid limit turns out to be insensitive to the exact growth rate of $d(N)$, and coincides with that for the JSQ policy. We further leverage the coupling to establish that the diffusion limit corresponds to that for the JSQ policy as well, as long as $d(N)/\sqrt{N} \log(N) \to \infty$, and characterize the common limiting diffusion process. These results indicate that the JSQ optimality can be preserved at the fluid-level and diffusion-level while reducing the overhead by nearly a factor O($N$) and O($\sqrt{N}/\log(N)$), respectively.

preprint2016arXiv

Generalized random sequential adsorption on Erdős-Rényi random graphs

We investigate Random Sequential Adsorption (RSA) on a random graph via the following greedy algorithm: Order the $n$ vertices at random, and sequentially declare each vertex either active or frozen, depending on some local rule in terms of the state of the neighboring vertices. The classical RSA rule declares a vertex active if none of its neighbors is, in which case the set of active nodes forms an independent set of the graph. We generalize this nearest-neighbor blocking rule in three ways and apply it to the Erdős-Rényi random graph. We consider these generalizations in the large-graph limit $n\to\infty$ and characterize the jamming constant, the limiting proportion of active vertices in the maximal greedy set.

Debankur Mukherjee

What is connected

Connect this record

See the researcher in context

Building this map preview

11 published item(s)

Higher-Order Approximations of Sojourn Times in M/G/1 Queues via Stein's Method

SCaLE: Switching Cost aware Learning and Exploration

Mean-field Analysis for Load Balancing on Spatial Graphs

Online Capacity Scaling Augmented With Unreliable Machine Learning Predictions

Scalable load balancing in networked systems: A survey of recent advances

Load Balancing Under Strict Compatibility Constraints

Join-the-Shortest Queue Diffusion Limit in Halfin-Whitt Regime: Sensitivity on the Heavy-traffic Parameter

Join-Idle-Queue with Service Elasticity: Large-Scale Asymptotics of a Non-monotone System

Supermarket Model on Graphs

Asymptotic Optimality of Power-of-$d$ Load Balancing in Large-Scale Systems

Generalized random sequential adsorption on Erdős-Rényi random graphs