Source author record

Nian Si

Nian Si appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Machine Learning; Artificial Intelligence; math.OC; Computational Engineering, Finance, and Science; math.PR; math.ST; Statistics Theory

Catalog footprint

What is connected

5 works

7 topics

4 close collaborators

Preprint, 2026, arXiv

A Queueing-Theoretic Framework for Stability Analysis of LLM Inference with KV Cache Memory Constraints

The rapid adoption of large language models (LLMs) has created significant challenges for efficient inference at scale. Unlike traditional workloads, LLM inference is constrained by both computation and the memory overhead of key-value (KV) caching, which accelerates decoding but quickly exhausts GPU memory. In this paper, we introduce the first queueing-theoretic framework that explicitly incorporates both computation and GPU memory constraints into the analysis of LLM inference. Based on this framework, we derive rigorous stability and instability conditions that determine whether an LLM inference service can sustain incoming demand without unbounded queue growth. This result offers a powerful tool for system deployment, potentially addressing the core challenge of GPU provisioning. By combining an estimated request arrival rate with our derived stable service rate, operators can calculate the necessary cluster size to avoid both costly over-purchasing and performance-violating under-provisioning. We further validate our theoretical predictions through extensive experiments in real GPU production environments. Our results show that the predicted stability conditions are highly accurate, with deviations typically within 10%.
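The provisioning arithmetic described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes the stable service rate scales linearly with the number of GPUs, and the function name `required_gpus`, its parameters, and the optional `headroom` safety margin are all hypothetical.

```python
import math


def required_gpus(arrival_rate: float, stable_rate_per_gpu: float,
                  headroom: float = 0.0) -> int:
    """Smallest GPU count whose total stable service rate exceeds demand.

    Assumption (not from the paper): capacity adds up linearly across GPUs.
    `headroom` is a hypothetical safety margin, e.g. 0.1 reserves 10% of
    each GPU's estimated stable rate.
    """
    if arrival_rate < 0 or stable_rate_per_gpu <= 0 or not 0 <= headroom < 1:
        raise ValueError("invalid rate or headroom")
    if arrival_rate == 0:
        return 0
    usable_rate = stable_rate_per_gpu * (1.0 - headroom)
    # Stability needs arrival_rate < n * usable_rate (strict inequality),
    # so an exact multiple still requires one more GPU.
    return math.floor(arrival_rate / usable_rate) + 1


# Example: 100 requests/s of demand, 30 requests/s stable rate per GPU.
print(required_gpus(100.0, 30.0))  # 4
```

The strict inequality matters: a cluster whose capacity exactly equals the arrival rate sits on the stability boundary, so the sketch rounds up past it.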

Preprint, 2026, arXiv

FutureX-Pro: Extending Future Prediction to High-Value Vertical Domains

Building upon FutureX, which established a live benchmark for general-purpose future prediction, this report introduces FutureX-Pro, including FutureX-Finance, FutureX-Retail, FutureX-PublicHealth, FutureX-NaturalDisaster, and FutureX-Search. These together form a specialized framework extending agentic future prediction to high-value vertical domains. While generalist agents demonstrate proficiency in open-domain search, their reliability in capital-intensive and safety-critical sectors remains under-explored. FutureX-Pro targets four economically and socially pivotal verticals: Finance, Retail, Public Health, and Natural Disaster. We benchmark agentic Large Language Models (LLMs) on entry-level yet foundational prediction tasks -- ranging from forecasting market indicators and supply chain demands to tracking epidemic trends and natural disasters. By adapting the contamination-free, live-evaluation pipeline of FutureX, we assess whether current State-of-the-Art (SOTA) agentic LLMs possess the domain grounding necessary for industrial deployment. Our findings reveal the performance gap between generalist reasoning and the precision required for high-value vertical applications.

Preprint, 2021, arXiv

Confidence Regions in Wasserstein Distributionally Robust Estimation

Wasserstein distributionally robust optimization estimators are obtained as solutions of min-max problems in which the statistician selects a parameter minimizing the worst-case loss among all probability models within a certain distance (in a Wasserstein sense) from the underlying empirical measure. While motivated by the need to identify optimal model parameters or decision choices that are robust to model misspecification, these distributionally robust estimators recover a wide range of regularized estimators, including square-root lasso and support vector machines, among others, as particular cases. This paper studies the asymptotic normality of these distributionally robust estimators as well as the properties of an optimal (in a suitable sense) confidence region induced by the Wasserstein distributionally robust optimization formulation. In addition, key properties of min-max distributionally robust optimization problems are also studied; for example, we show that distributionally robust estimators regularize the loss based on its derivative, and we derive general sufficient conditions which show the equivalence between the min-max distributionally robust optimization problem and the corresponding max-min formulation.
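The min-max problem described in this abstract is commonly written in the following generic form. The notation is an assumption for illustration and may differ from the paper's: $\theta$ is the parameter, $\ell$ the loss, $P_n$ the empirical measure of the sample, $W$ a Wasserstein distance, and $\delta$ the radius of the uncertainty set.

```latex
\hat{\theta}_n \in \arg\min_{\theta \in \Theta} \;
  \sup_{P \,:\, W(P, P_n) \le \delta} \;
  \mathbb{E}_{P}\!\left[ \ell(X; \theta) \right]
```

The max-min formulation mentioned at the end of the abstract swaps the order of the two operators, taking the supremum over $P$ outside and the minimum over $\theta$ inside.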

Preprint, 2020, arXiv

Efficient Steady-state Simulation of High-dimensional Stochastic Networks

We propose and study an asymptotically optimal Monte Carlo estimator for steady-state expectations of a d-dimensional reflected Brownian motion. Our estimator is asymptotically optimal in the sense that it requires $\tilde{O}(d)$ (up to logarithmic factors in $d$) i.i.d. Gaussian random variables in order to output an estimate with a controlled error. Our construction is based on the analysis of a suitable multi-level Monte Carlo strategy which, we believe, can be applied widely. This is the first algorithm with linear complexity (under suitable regularity conditions) for steady-state estimation of RBM as the dimension increases.
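For orientation, the generic multi-level Monte Carlo idea behind such an estimator rests on a telescoping identity. The symbols are assumptions for illustration, and the paper's specific construction for reflected Brownian motion may differ: $Y_\ell$ denotes an approximation of the target quantity at accuracy level $\ell$, and $N_\ell$ the number of independent samples drawn at that level.

```latex
\mathbb{E}[Y_L] = \mathbb{E}[Y_0] + \sum_{\ell=1}^{L} \mathbb{E}\!\left[ Y_\ell - Y_{\ell-1} \right],
\qquad
\hat{Y} = \sum_{\ell=0}^{L} \frac{1}{N_\ell} \sum_{i=1}^{N_\ell} \Delta_\ell^{(i)},
\quad \Delta_0 = Y_0, \;\; \Delta_\ell = Y_\ell - Y_{\ell-1} \;\; (\ell \ge 1)
```

When successive levels are coupled so that the differences $\Delta_\ell$ have small variance, most samples can be spent on the cheap coarse levels, which is what typically drives the cost savings of this strategy.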

Preprint, 2020, arXiv

Robust Bayesian Classification Using an Optimistic Score Ratio

We build a Bayesian contextual classification model using an optimistic score ratio for robust binary classification when there is limited information on the class-conditional, or contextual, distribution. The optimistic score searches for the distribution that is most plausible to explain the observed outcomes in the testing sample among all distributions belonging to the contextual ambiguity set which is prescribed using a limited structural constraint on the mean vector and the covariance matrix of the underlying contextual distribution. We show that the Bayesian classifier using the optimistic score ratio is conceptually attractive, delivers solid statistical guarantees and is computationally tractable. We showcase the power of the proposed optimistic score ratio classifier on both synthetic and empirical data.
