Source author record

Jiheng Zhang

Jiheng Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.PR, Machine Learning, math.OC, Artificial Intelligence, Cryptography and Security, Performance, Systems and Control

Catalog footprint

What is connected

12 works

7 topics

4 close collaborators

preprint, 2026, arXiv

Learning to Bid with Unknown Private Values in Budget-Constrained First-Price Auctions

The transition to First-Price Auctions (FPA) in digital advertising has spurred significant research, yet existing work typically assumes access to a valuation oracle, ignoring the reality that values must be inferred from censored data. While Linear Treatment Effect (LTE) models address this by learning value uplift, they have not been adapted to realistic settings with hard budget constraints or Return-on-Spend (RoS) targets, which require control of both regret and constraint violation. In this work, we propose a unified primal-dual framework for constrained FPAs that jointly learns the latent LTE valuation parameters and the competitor's bid distribution. This simultaneous learning introduces a critical technical challenge: the estimation error is dynamically scaled by the Lagrangian multiplier, potentially leading to unbounded regret. We resolve this by leveraging a strong Slater condition and a novel adaptive burn-in procedure to stabilize the dual variables. Our approach achieves near-optimal regret guarantees, providing the first theoretically grounded solution for constrained bidding with latent valuations.

preprint, 2024, arXiv

Stochastic Graph Bandit Learning with Side-Observations

In this paper, we investigate the stochastic contextual bandit with general function space and graph feedback. We propose an algorithm that addresses this problem by adapting to both the underlying graph structures and reward gaps. To the best of our knowledge, our algorithm is the first to provide a gap-dependent upper bound in this stochastic setting, bridging the research gap left by the work in [35]. In comparison to [31,33,35], our method offers improved regret upper bounds and does not require knowledge of graphical quantities. We conduct numerical experiments to demonstrate the computational efficiency and effectiveness of our approach in terms of regret upper bounds. These findings highlight the significance of our algorithm in advancing the field of stochastic contextual bandits with graph feedback, opening up avenues for practical applications in various domains.

preprint, 2022, arXiv

Dual Instrumental Method for Confounded Kernelized Bandits

The contextual bandit problem is a theoretically justified framework with wide applications in various fields. While previous studies of this problem usually require independence between noise and contexts, our work considers a more sensible setting where the noise becomes a latent confounder that affects both contexts and rewards. Such a confounded setting is more realistic and could extend to a broader range of applications. However, the unresolved confounder will cause a bias in reward function estimation and thus lead to a large regret. To deal with the challenges brought by the confounder, we apply dual instrumental variable regression, which can correctly identify the true reward function. We prove that the convergence rate of this method is near-optimal in two types of widely used reproducing kernel Hilbert spaces. Therefore, we can design computationally efficient and regret-optimal algorithms based on the theoretical guarantees for confounded bandit problems. The numerical results illustrate the efficacy of our proposed algorithms in the confounded bandit setting.

preprint, 2022, arXiv

On Private Online Convex Optimization: Optimal Algorithms in $\ell_p$-Geometry and High Dimensional Contextual Bandits

Differentially private (DP) stochastic convex optimization (SCO) is ubiquitous in trustworthy machine learning algorithm design. This paper studies the DP-SCO problem with streaming data that are sampled from a distribution and arrive sequentially. We also consider the continual release model, in which parameters related to private information are updated and released upon each new data point, a setting often referred to as online algorithms. Although numerous algorithms have been developed to achieve the optimal excess risks in different $\ell_p$ norm geometries, none of the existing ones can be adapted to the streaming and continual release setting. To address this challenge of online convex optimization with privacy protection, we propose a private variant of the online Frank-Wolfe algorithm with recursive gradients for variance reduction to update and reveal the parameters upon each data point. Combined with the adaptive differential privacy analysis, our online algorithm achieves in linear time the optimal excess risk when $1<p\leq 2$ and the state-of-the-art excess risk meeting the non-private lower ones when $2<p\leq\infty$. Our algorithm can also be extended to the case $p=1$ to achieve nearly dimension-independent excess risk. While previous variance reduction results on recursive gradients have theoretical guarantees only in the independent and identically distributed sample setting, we establish such a guarantee in a non-stationary setting. To demonstrate the virtues of our method, we design the first DP algorithm for high-dimensional generalized linear bandits with logarithmic regret. Comparative experiments with a variety of DP-SCO and DP-Bandit algorithms exhibit the efficacy and utility of the proposed algorithms.

preprint, 2016, arXiv

Instantaneous Control of Brownian Motion with a Positive Lead Time

Consider a storage system where the content is driven by a Brownian motion absent control. At any time, one may increase or decrease the content at a cost proportional to the amount of adjustment. A decrease of the content takes effect immediately, while an increase is realized after a fixed lead time $\ell$. Holding costs are incurred continuously over time and are a convex function of the content. The objective is to find a control policy that minimizes the expected present value of the total costs. Due to the positive lead time for upward adjustments, one needs to keep track of all the outstanding upward adjustments as well as the actual content at time $t$ as there may also be downward adjustments during $[t,t+\ell)$, i.e., the state of the system is a function on $[0,\ell]$. To the best of our knowledge, this is the first paper to study instantaneous control of stochastic systems in such a functional setting. We first extend the concept of $L^\natural$-convexity to function spaces and establish the $L^\natural$-convexity of the optimal cost function. We then derive various properties of the cost function and identify the structure of the optimal policy as a state-dependent two-sided reflection mapping making the minimum amount of adjustment necessary to keep the system states within a certain region.
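To make the phrase "two-sided reflection mapping making the minimum amount of adjustment" concrete, here is a minimal Python sketch on a discrete path. It is a deliberately simplified toy, not the policy from the paper: it assumes scalar state, constant barriers `lower` and `upper`, and no lead time, so upward pushes take effect immediately. The function name and all parameters are hypothetical.

```python
def two_sided_reflection(increments, lower, upper, start):
    """Keep a discrete path inside [lower, upper] using the smallest pushes.

    Toy assumption: constant barriers and zero lead time, unlike the
    functional, state-dependent setting described in the abstract.
    """
    if not lower <= start <= upper:
        raise ValueError("start must lie inside [lower, upper]")
    path, pushes_up, pushes_down = [], [], []
    level = start
    for step in increments:
        free = level + step                    # where the uncontrolled path would go
        level = min(max(free, lower), upper)   # clamping is the minimal adjustment
        pushes_up.append(max(level - free, 0.0))    # push applied at the lower barrier
        pushes_down.append(max(free - level, 0.0))  # push applied at the upper barrier
        path.append(level)
    return path, pushes_up, pushes_down


path, up, down = two_sided_reflection(
    [0.5, 0.75, -2.0, 0.25], lower=0.0, upper=1.0, start=0.5
)
print(path)  # controlled path
print(up)    # upward adjustments per step
print(down)  # downward adjustments per step
```

The control acts only when the uncontrolled path would leave the band, which is what "minimum amount of adjustment" means in this simplified picture.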

preprint, 2015, arXiv

A Unified Approach to Diffusion Analysis of Queues with General Patience-time Distributions

We propose a unified approach to establishing diffusion approximations for queues with impatient customers within a general framework of scaling customer patience time. The approach consists of two steps. The first step is to show that the diffusion-scaled abandonment process is asymptotically close to a function of the diffusion-scaled queue length process under appropriate conditions. The second step is to construct a continuous mapping not only to characterize the system dynamics using the system primitives, but also to help verify the conditions needed in the first step. The diffusion approximations can then be obtained by applying the continuous mapping theorem. The approach has two advantages: (i) it provides a unified procedure to establish the diffusion approximations regardless of the structure of the queueing model or the type of patience-time scaling; and (ii) it makes the diffusion analysis of queues with customer abandonment essentially the same as the diffusion analysis of queues without customer abandonment. We demonstrate the application of this approach via the single server system with Markov-modulated service speeds in the traditional heavy-traffic regime and the many-server system in the Halfin-Whitt regime and the non-degenerate slowdown regime.

preprint, 2015, arXiv

Insensitivity of Proportional Fairness in Critically Loaded Bandwidth Sharing Networks

Proportional fairness is a popular service allocation mechanism to describe and analyze the performance of data networks at flow level. Recently, several authors have shown that the invariant distribution of such networks admits a product form distribution under critical loading. Assuming exponential job size distributions, they leave the case of general job size distributions as an open question. In this paper we show the conjecture holds for a dense class of distributions. This yields a key example of a stochastic network in which the heavy traffic limit has an invariant distribution that does not depend on second moments. Our analysis relies on a uniform convergence result for a fluid model which may be of independent interest.

preprint, 2014, arXiv

Approximations and Optimal Control for State-dependent Limited Processor Sharing Queues

The paper studies approximations and control of a processor sharing (PS) server where the service rate depends on the number of jobs occupying the server. The control of such a system is implemented by imposing a limit on the number of jobs that can share the server concurrently, with the rest of the jobs waiting in a first-in-first-out (FIFO) buffer. A desirable control scheme should strike the right balance between efficiency (operating at a high service rate) and parallelism (preventing small jobs from getting stuck behind large ones). We employ the framework of heavy-traffic diffusion analysis to devise near optimal control heuristics for such a queueing system. However, while the literature on diffusion control of state-dependent queueing systems begins with a sequence of systems and an exogenously defined drift function, we begin with a finite discrete PS server and propose an axiomatic recipe to explicitly construct a sequence of state-dependent PS servers which then yields a drift function. We establish diffusion approximations and use them to obtain insightful and closed-form approximations for the original system under a static concurrency limit control policy. We extend our study to control policies that dynamically adjust the concurrency limit. We provide two novel numerical algorithms to solve the associated diffusion control problem. Our algorithms can be viewed as "average cost" iteration: the first algorithm uses binary search on the average cost and can find an $\varepsilon$-optimal policy in time $O\left( \log^2 \frac{1}{\varepsilon} \right)$; the second algorithm uses the Newton-Raphson method for root-finding and requires $O\left( \log \frac{1}{\varepsilon} \log\log \frac{1}{\varepsilon} \right)$ time. Numerical experiments demonstrate the accuracy of our approximation for choosing optimal or near-optimal static and dynamic concurrency control heuristics.
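The abstract contrasts a binary-search scheme with a Newton-Raphson scheme for root-finding. The Python sketch below illustrates only that generic contrast on a stand-in scalar equation, $g(c) = c^3 - 2 = 0$. It does not reproduce the paper's average-cost iteration or its complexity bounds; the function `g`, the bracket `[1, 2]`, and the starting point are assumptions chosen for illustration.

```python
def bisect_root(g, lo, hi, eps):
    """Bisection for an increasing g with g(lo) < 0 < g(hi).

    Returns (approximate root, number of halvings).
    """
    iterations = 0
    while hi - lo > eps:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
        iterations += 1
    return 0.5 * (lo + hi), iterations


def newton_root(g, dg, x0, eps, max_iter=100):
    """Newton-Raphson; stops when the update step is at most eps.

    Returns (approximate root, number of updates).
    """
    x = x0
    for iterations in range(1, max_iter + 1):
        step = g(x) / dg(x)
        x -= step
        if abs(step) <= eps:
            return x, iterations
    raise RuntimeError("Newton-Raphson did not converge")


# Stand-in problem (hypothetical): solve c**3 - 2 = 0.
g = lambda c: c**3 - 2.0
dg = lambda c: 3.0 * c**2

root_b, n_bisect = bisect_root(g, 1.0, 2.0, eps=1e-10)
root_n, n_newton = newton_root(g, dg, x0=2.0, eps=1e-10)
print(root_b, n_bisect)
print(root_n, n_newton)
```

On this smooth toy problem, Newton-Raphson needs far fewer updates than bisection for the same tolerance, which is the qualitative gap the two stated running times reflect.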

preprint, 2014, arXiv

Separation of timescales in a two-layered network

We investigate a computer network consisting of two layers occurring in, for example, application servers. The first layer incorporates the arrival of jobs at a network of multi-server nodes, which we model as a many-server Jackson network. At the second layer, active servers at these nodes now act as customers who are served by a common CPU. Our main result shows a separation of time scales in heavy traffic: the main source of randomness occurs at the (aggregate) CPU layer; the interactions between different types of nodes at the other layer are shown to converge to a fixed point at a faster time scale; this also yields a state-space collapse property. Apart from these fundamental insights, we also obtain an explicit approximation for the joint law of the number of jobs in the system, which is provably accurate for heavily loaded systems and performs numerically well for moderately loaded systems. The obtained results for the model under consideration can be applied to thread-pool dimensioning in application servers, while the technique seems applicable to other layered systems too.

preprint, 2013, arXiv

Convergence to Equilibrium States for Fluid Models of Many-server Queues with Abandonment

Fluid models have become an important tool for the study of many-server queues with general service and patience time distributions. The equilibrium state of a fluid model has been revealed by Whitt (2006) and shown to yield reasonable approximations to the steady state of the original stochastic systems. However, it remains an open question whether the solution to a fluid model converges to the equilibrium state and under what condition. We show in this paper that the convergence holds under a mild condition. Our method builds on the framework of measure-valued processes developed in Zhang (2013), which keeps track of the remaining patience and service times.
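As a rough illustration of what "convergence to the equilibrium state" means, the Python sketch below integrates a one-dimensional fluid model in the special case of exponential service and patience times, where the state reduces to the total fluid mass `x` per unit of server capacity. This is a simplifying assumption: the paper treats general distributions through measure-valued processes, which this toy does not attempt. All names and parameter values are hypothetical.

```python
def fluid_trajectory(arrival, service, abandon, x0=0.0, dt=0.01, horizon=50.0):
    """Forward-Euler path of x' = arrival - service*min(x,1) - abandon*max(x-1,0).

    Toy assumption: exponential service and patience, capacity scaled to 1.
    """
    steps = int(round(horizon / dt))
    x = x0
    path = [x]
    for _ in range(steps):
        in_service = min(x, 1.0)      # fluid currently being served
        waiting = max(x - 1.0, 0.0)   # fluid waiting, subject to abandonment
        x += dt * (arrival - service * in_service - abandon * waiting)
        path.append(x)
    return path


def equilibrium(arrival, service, abandon):
    """Stationary point of the toy ODE above."""
    if arrival <= service:
        return arrival / service                    # underloaded: no queue
    return 1.0 + (arrival - service) / abandon      # overloaded: queue balances abandonment


path = fluid_trajectory(arrival=1.2, service=1.0, abandon=0.5)
print(path[-1], equilibrium(1.2, 1.0, 0.5))
```

In this exponential special case the trajectory settles at the stationary point from any starting level; the question addressed in the abstract is whether the analogous statement holds for general service and patience time distributions.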

preprint, 2011, arXiv

Diffusion limits of limited processor sharing queues

We consider a processor sharing queue where the number of jobs served at any time is limited to $K$, with the excess jobs waiting in a buffer. We use random counting measures on the positive axis to model this system. The limit of this measure-valued process is obtained under diffusion scaling and heavy traffic conditions. As a consequence, the limit of the system size process is proved to be a piece-wise reflected Brownian motion.

preprint, 2009, arXiv

Fluid Models of Many-server Queues with Abandonment

We study many-server queues with abandonment in which customers have general service and patience time distributions. The dynamics of the system are modeled using measure-valued processes, to keep track of the residual service and patience times of each customer. Deterministic fluid models are established to provide first-order approximations for this model. The fluid model solution, which is proved to exist uniquely, serves as the fluid limit of the many-server queue as the number of servers becomes large. Based on the fluid model solution, first-order approximations for various performance quantities are proposed.