Source author record

Zhongyi Huang

Zhongyi Huang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.NA; quant-ph; math-ph; math.MP; physics.soc-ph; Artificial Intelligence; Computational Engineering, Finance, and Science; cs.CY; Distributed, Parallel, and Cluster Computing; Information Theory; math.IT; Numerical Analysis; q-fin.CP

Catalog footprint

What is connected

11 works

13 topics

4 close collaborators

preprint · 2026 · arXiv

SPECTRE: Hybrid Ordinary-Parallel Speculative Serving for Resource-Efficient LLM Inference

LLM serving platforms are increasingly deployed as multi-model cloud systems, where user demand is often long-tailed: a few popular large models receive most requests, while many smaller tail models remain underutilized. We propose SPECTRE (Parallel SPECulative Decoding with a Multi-Tenant REmote Drafter), a serving framework that reuses underutilized tail-model services as remote drafters for heavily loaded large-model services through speculative decoding. SPECTRE enables draft generation and target-side verification to run in parallel, and makes such parallelism effective through three techniques: a hybrid ordinary-parallel speculative decoding strategy guided by a threshold derived from throughput analysis, speculative priority scheduling to preserve draft-target overlap under multi-tenant traffic, and draft-side prompt compression to reduce draft latency. We implement SPECTRE in SGLang and evaluate it across multiple draft-target model pairs, reasoning benchmarks, real-world long-context workloads, and a wide range of batch sizes. Results show that SPECTRE consistently improves large-model serving throughput while causing only minor interference to the native workloads of tail-model services. In large-model deployments, including Qwen3-235B-A22B with TP=8, SPECTRE achieves up to 2.28× speedup over autoregressive decoding and up to an additional 66% relative improvement over the strongest speculative decoding baselines. Talk is cheap, we show you the code: https://github.com/sgl-project/sglang/pull/22272.
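As background for the throughput analysis mentioned in the abstract, the following minimal sketch evaluates the standard back-of-the-envelope model for ordinary (draft, then verify) speculative decoding. It is not SPECTRE's threshold or its parallel mode, which the abstract does not spell out. The names `alpha` (per-token acceptance probability, assumed independent across tokens), `gamma` (draft length) and `cost_ratio` (draft step time divided by target step time) are assumptions introduced here for illustration.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens produced by one draft-then-verify step.

    Assumes each of the gamma draft tokens is accepted independently with
    probability alpha, and that verification always contributes one token.
    This equals the geometric sum 1 + alpha + ... + alpha**gamma.
    """
    if not 0.0 <= alpha <= 1.0 or gamma < 0:
        raise ValueError("need 0 <= alpha <= 1 and gamma >= 0")
    if alpha == 1.0:
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def ordinary_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Speedup over autoregressive decoding when drafting and verifying run back to back.

    cost_ratio is (time of one draft step) / (time of one target step), so one
    speculative step costs gamma * cost_ratio + 1 target-step units.
    """
    step_time = gamma * cost_ratio + 1.0
    return expected_tokens_per_step(alpha, gamma) / step_time


if __name__ == "__main__":
    # Hypothetical numbers: 80% acceptance, draft 10x cheaper than target.
    for gamma in (1, 2, 4, 8):
        print(gamma, round(ordinary_speedup(0.8, gamma, 0.1), 3))
```

Under this model a low acceptance rate gives a speedup below 1, which is one reason a serving system might switch between decoding modes based on a threshold.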

preprint · 2022 · arXiv

A Uniform Convergent Petrov-Galerkin method for a Class of Turning Point Problems

In this paper, we propose a numerical method for turning point problems in one dimension based on the Petrov-Galerkin finite element method (PGFEM). We first give an a priori estimate for the turning point problem with a single boundary turning point. Then we use PGFEM to solve it, where the test functions are the solutions to piecewise approximate dual problems. We prove that our method has a first-order convergence rate in both the $L^\infty$ norm and an energy norm when we select the exact solutions to the dual problems as test functions. Numerical results show that our scheme is efficient for turning point problems with different types of singularities, and that the observed convergence agrees with our theoretical results.

preprint · 2022 · arXiv

Two New Piggybacking Designs with Lower Repair Bandwidth

Piggybacking codes are a special class of MDS array codes that can achieve small repair bandwidth with small sub-packetization by first creating some instances of an $(n,k)$ MDS code, such as a Reed-Solomon (RS) code, and then designing the piggyback function. In this paper, we propose a new piggybacking design that defines the piggyback function over some instances of both an $(n,k)$ MDS code and an $(n,k')$ MDS code, where $k\geq k'$. We show that our new piggybacking design can significantly reduce the repair bandwidth for single-node failures. When $k=k'$, we design a piggybacking code that is an MDS code, and we show that it has lower repair bandwidth for single-node failures than all existing piggybacking codes when the number of parity nodes $r=n-k\geq8$ and the sub-packetization $α<r$. Moreover, we propose another piggybacking code by designing $n$ piggyback functions of some instances of an $(n,k)$ MDS code and adding the $n$ piggyback functions into $n$ newly created empty entries with no data symbols. We show that this code can significantly reduce repair bandwidth for single-node failures at the cost of slightly more storage overhead. In addition, we show that this code can recover any $r+1$ node failures for some parameters. We also show that it has lower repair bandwidth than locally repairable codes (LRCs) under the same fault tolerance and redundancy for some parameters.

preprint · 2021 · arXiv

Investigating the effect of expected travel distance on individual descent speed in the stairwell with super long distance

With the growing number of super high-rise buildings in cities, emergency evacuation from such buildings has come to the fore. In an evacuation experiment carried out by our group in Shanghai Tower, the evacuation speed of pedestrians evacuated from the 126th floor was always slower than that of those from the 117th floor. We therefore hypothesize that the expected evacuation distance affects pedestrians' movement speed. To verify this hypothesis, we conduct an experiment in a 12-story office building in which the evacuation distance is set for participants in advance, and study whether and how this affects speed. The results show that movement speed decreases as the expected evacuation distance increases, which confirms our hypothesis. We also give the relation between the rate of increase of evacuation distance and the rate of decrease of speed. As the expected evacuation distance increases, the speed of male participants decreases at a greater rate than that of female participants. In addition, we study the effects of actual evacuation distance, gender, and BMI on evacuation speed. Finally, we obtain the correlation between heart rate and speed during evacuation. The results in this paper are beneficial to the study of pedestrian evacuation in super high-rise buildings.

preprint · 2020 · arXiv

An iterative splitting method for pricing European options under the Heston model

In this paper, we propose an iterative splitting method to solve the partial differential equations in option pricing problems. We focus on the Heston stochastic volatility model and the derived two-dimensional partial differential equation (PDE). We take the European option as an example and conduct numerical experiments using different boundary conditions. The iterative splitting method transforms the two-dimensional equation into two quasi one-dimensional equations with the variable on the other dimension fixed, which helps to lower the computational cost. Numerical results show that the iterative splitting method together with an artificial boundary condition (ABC) based on the method by Li and Huang (2019) gives the most accurate option price and Greeks compared to the classic finite difference method with the commonly-used boundary conditions in Heston (1993).
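For orientation, the two-dimensional pricing PDE referred to above is usually written in the following form, which is Heston's (1993) equation with the market price of volatility risk set to zero (an assumption made here for brevity). The symbols are the conventional ones and are not defined in the abstract: $U(S,v,t)$ is the option price, $S$ the asset price, $v$ the variance, $r$ the risk-free rate, $\kappa$ the mean-reversion speed, $\theta$ the long-run variance, $\sigma$ the volatility of variance, and $\rho$ the correlation.

$$
\frac{\partial U}{\partial t}
+ \frac{1}{2} v S^2 \frac{\partial^2 U}{\partial S^2}
+ \rho \sigma v S \frac{\partial^2 U}{\partial S \, \partial v}
+ \frac{1}{2} \sigma^2 v \frac{\partial^2 U}{\partial v^2}
+ r S \frac{\partial U}{\partial S}
+ \kappa (\theta - v) \frac{\partial U}{\partial v}
- r U = 0
$$

For a European call with strike $K$ and maturity $T$, the terminal condition is $U(S,v,T)=\max(S-K,0)$. A splitting method of the kind described would treat the $S$-direction and $v$-direction operators in separate sub-problems; how the mixed-derivative term is handled is not stated in the abstract.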

preprint · 2020 · arXiv

How many infections of COVID-19 there will be in the "Diamond Princess"-Predicted by a virus transmission model based on the simulation of crowd flow

Objectives: To simulate the transmission process of COVID-19 on a cruise ship, to estimate how many of the 3711 people on the "Diamond Princess" will be infected, and to analyze measures that could have prevented mass transmission. Methods: Based on the crowd flow model, a virus transmission rule between pedestrians is established to simulate the spread of the virus caused by close contact during pedestrians' daily activities on the cruise ship. Measurements and main results: Three types of simulation scenarios are designed. The Basic scenario focuses on the process of virus transmission caused by a virus carrier and the effect of personal protective measures against the virus. The Self-protection scenario considers the condition that the original virus carriers disembarked partway through the voyage and that more and more people strengthened self-protection, which accords comparatively well with the actual situation on the "Diamond Princess" cruise. Control scenarios are set to simulate the effect of taking recommended or mandatory measures on virus transmission. Conclusions: With large probability, 850~1009 persons were infected with COVID-19 during the voyage of the "Diamond Princess". The crowd infection percentage would be controlled effectively if the recommended or mandatory measures were taken immediately during the alert phase of COVID-19 outbreaks.

preprint · 2016 · arXiv

A Bloch decomposition-based stochastic Galerkin method for quantum dynamics with a random external potential

In this paper, we consider the numerical solution of the one-dimensional Schrödinger equation with a periodic lattice potential and a random external potential. This is an important model in solid state physics, where randomness is introduced to describe complicated phenomena that are not exactly known. Here we generalize the Bloch decomposition-based time-splitting pseudospectral method to the stochastic setting using generalized polynomial chaos with a Galerkin procedure, so that the main effects of dispersion and periodic potential are still computed together. We prove that our method is unconditionally stable, and numerical examples show that it has other nice properties and is more efficient than the traditional method. Finally, we give some numerical evidence for the well-known phenomenon of Anderson localization.

preprint · 2014 · arXiv

Numerical simulations of X-rays Free Electron Lasers (XFEL)

We study a nonlinear Schrödinger equation which arises as an effective single particle model in X-ray Free Electron Lasers (XFEL). This equation appears as a first-principles model for the beam-matter interactions that would take place in an XFEL molecular imaging experiment in \cite{frat1}. Since XFEL is more powerful by several orders of magnitude than more conventional lasers, the systematic investigation of many of the standard assumptions and approximations has attracted increased attention. In this model the electrons move under a rapidly oscillating electromagnetic field, and the convergence of the problem to an effective time-averaged one is examined. We use an operator splitting pseudo-spectral method to investigate numerically the behaviour of the model versus its time-averaged version in complex situations, namely the energy subcritical/mass supercritical case, and in the presence of a periodic lattice. We find the time averaged model to be an effective approximation, even close to blowup, for fast enough oscillations of the external field. This work extends previous analytical results for simpler cases \cite{xfel1}.

preprint · 2012 · arXiv

A Bloch decomposition based split-step pseudo spectral method for quantum dynamics with periodic potentials

We present a new numerical method for accurate computations of solutions to (linear) one-dimensional Schrödinger equations with periodic potentials. This is a prominent model in solid state physics where we also allow for perturbations by non-periodic potentials describing external electric fields. Our approach is based on the classical Bloch decomposition method, which allows one to diagonalize the periodic part of the Hamiltonian operator. Hence, the dominant effects from dispersion and periodic lattice potential are computed together, while the non-periodic potential acts only as a perturbation. Because the split-step commutator error between the periodic and non-periodic parts is relatively small, the step size can be chosen substantially larger than for the traditional splitting of the dispersion and potential operators. Indeed, the given examples show that our method is unconditionally stable and more efficient than the traditional split-step pseudo-spectral schemes. To this end, a particular focus is on the semiclassical regime, where the new algorithm naturally incorporates the adiabatic splitting of slow and fast degrees of freedom.
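To make the comparison baseline concrete, here is a minimal sketch of the traditional splitting of the dispersion and potential operators (Strang splitting with a Fourier treatment of the kinetic part). It is not the Bloch decomposition method of the paper. The setup is an assumption chosen for brevity: the equation $i\psi_t = -\tfrac12\psi_{xx} + V(x)\psi$ on a $2\pi$-periodic grid, with a naive DFT so that the example needs only the standard library. All function names are hypothetical.

```python
import cmath
import math


def dft(values, sign):
    """Naive O(n^2) discrete Fourier transform; sign=-1 forward, sign=+1 unnormalised inverse."""
    n = len(values)
    return [
        sum(values[idx] * cmath.exp(sign * 2j * math.pi * idx * k / n) for idx in range(n))
        for k in range(n)
    ]


def wavenumber(m, n):
    """Integer wavenumber of DFT index m on a 2*pi-periodic grid with n points."""
    return m if m < n // 2 else m - n


def strang_step(psi, potential, dt):
    """One Strang step: half potential step, full kinetic step, half potential step."""
    n = len(psi)
    # Potential part: psi_t = -i V psi, solved exactly pointwise for dt/2.
    psi = [p * cmath.exp(-0.5j * dt * v) for p, v in zip(psi, potential)]
    # Kinetic part: each Fourier mode k picks up the phase exp(-i k^2 dt / 2).
    hat = dft(psi, -1)
    hat = [h * cmath.exp(-0.5j * dt * wavenumber(m, n) ** 2) for m, h in enumerate(hat)]
    psi = [z / n for z in dft(hat, +1)]
    return [p * cmath.exp(-0.5j * dt * v) for p, v in zip(psi, potential)]


def norm_squared(psi):
    """Discrete l2 norm squared (grid spacing omitted; it cancels in comparisons)."""
    return sum(abs(p) ** 2 for p in psi)


if __name__ == "__main__":
    n = 32
    grid = [2.0 * math.pi * j / n for j in range(n)]
    potential = [math.cos(x) for x in grid]
    psi = [cmath.exp(-(x - math.pi) ** 2) for x in grid]
    before = norm_squared(psi)
    for _ in range(10):
        psi = strang_step(psi, potential, dt=0.5)
    print(before, norm_squared(psi))
```

Every sub-step multiplies by a unit-modulus phase, so the discrete norm is preserved for any step size. Accuracy, however, is governed by the commutator between the kinetic and potential operators, which is the error the abstract says the Bloch-based splitting reduces.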

preprint · 2012 · arXiv

A time-splitting spectral scheme for the Maxwell-Dirac system

We present a time-splitting spectral scheme for the Maxwell-Dirac system and similar time-splitting methods for the corresponding asymptotic problems in the semi-classical and the non-relativistic regimes. The scheme for the Maxwell-Dirac system conserves the Lorentz gauge condition, is unconditionally stable and highly efficient as our numerical examples show. In particular we focus in our examples on the creation of positronic modes in the semi-classical regime and on the electron-positron interaction in the non-relativistic regime. Furthermore, in the non-relativistic regime, our numerical method exhibits uniform convergence in the small parameter $\delta$, which is the ratio of the characteristic speed and the speed of light.

preprint · 2012 · arXiv

Gaussian Beam Methods for the Dirac Equation in the Semi-classical Regime

The Dirac equation is an important model in relativistic quantum mechanics. In the semi-classical regime $ε\ll1$, even a spatially spectrally accurate time splitting method \cite{HuJi:05} requires the mesh size to be $O(ε)$, which makes the direct simulation extremely expensive. In this paper, we present the Gaussian beam method for the Dirac equation. With the help of an eigenvalue decomposition, the Gaussian beams can be independently evolved along each eigenspace and summed to construct an approximate solution of the Dirac equation. Moreover, the proposed Eulerian Gaussian beam keeps the advantages of constructing the Hessian matrices by simply using level set functions' derivatives. Finally, several numerical examples show the efficiency and accuracy of the method.
