Researcher profile

Zhihui Li

Zhihui Li contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
12 works
0 followers
9 topics
4 close collaborators

Published work

12 published item(s)

preprint | 2026 | arXiv

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

We present MindLab Toolkit (MinT), a managed infrastructure system for Low-Rank Adaptation (LoRA) post-training and online serving. MinT targets a setting where many trained policies are produced over a small number of expensive base-model deployments. Instead of materializing each policy as a merged full checkpoint, MinT keeps the base model resident and moves exported LoRA adapter revisions through rollout, update, export, evaluation, serving, and rollback, hiding distributed training, serving, scheduling, and data movement behind a service interface. MinT scales this path along three axes. Scale Up extends LoRA RL to frontier-scale dense and MoE architectures, including MLA and DSA attention paths, with training and serving validated beyond 1T total parameters. Scale Down moves only the exported LoRA adapter, which can be under 1% of base-model size in rank-1 settings; adapter-only handoff reduces the measured step by 18.3x on a 4B dense model and 2.85x on a 30B MoE, while concurrent multi-policy GRPO shortens wall time by 1.77x and 1.45x without raising peak memory. Scale Out separates durable policy addressability from CPU/GPU working sets: a tensor-parallel deployment supports 10^6-scale addressable catalogs (measured single-engine sweeps through 100K) and thousand-adapter active waves at cluster scale, with cold loading treated as scheduled service work and packed MoE LoRA tensors improving live engine loading by 8.5-8.7x. MinT thus manages million-scale LoRA policy catalogs while training and serving selected adapter revisions over shared 1T-class base models.

preprint | 2026 | arXiv

SRC-Flow: Compact Semantic Representations Enable Normalizing Flows for Image Generation

Normalizing flows (NFs) provide exact likelihoods and deterministic invertible sampling, but have historically lagged behind diffusion models for large-scale image generation. We identify a key obstacle: NFs are required to learn a single invertible transport over the full ambient space, making them highly sensitive to high-dimensional representations. This leads to a semantic-capacity mismatch in modern visual representation spaces, where semantic information is compact but encoded in overcomplete features. We propose SRC-Flow, which introduces a Semantic Representation Compressor (SRC) to compact high-dimensional RAE features into a low-dimensional semantic space before flow modeling and preserve reconstruction through the frozen RAE decoder. This compact space reduces the modeling burden of NFs and enables effective likelihood-based generation in semantic representation space. We further adopt constant noise regularization tailored to the fixed unconditional bijection learned by flows. On ImageNet $256 \times 256$ and $512 \times 512$, SRC-Flow achieves state-of-the-art generation quality among normalizing flow methods, with gFID scores of 1.65 and 2.07 under classifier-free guidance, while retaining exact likelihood computation in the compact semantic representation space and deterministic invertible sampling at the flow level. Codes and models will be available at https://github.com/longtaojiang/SRC-Flow.

preprint | 2022 | arXiv

A Comprehensive Survey of Scene Graphs: Generation and Application

Scene graph is a structured representation of a scene that can clearly express the objects, attributes, and relationships between objects in the scene. As computer vision technology continues to develop, people are no longer satisfied with simply detecting and recognizing objects in images; instead, people look forward to a higher level of understanding and reasoning about visual scenes. For example, given an image, we want to not only detect and recognize objects in the image, but also know the relationship between objects (visual relationship detection), and generate a text description (image captioning) based on the image content. Alternatively, we might want the machine to tell us what the little girl in the image is doing (Visual Question Answering (VQA)), or even remove the dog from the image and find similar images (image editing and retrieval), etc. These tasks require a higher level of understanding and reasoning for image vision tasks. The scene graph is just such a powerful tool for scene understanding. Therefore, scene graphs have attracted the attention of a large number of researchers, and related research is often cross-modal, complex, and rapidly developing. However, no relatively systematic survey of scene graphs exists at present. To this end, this survey conducts a comprehensive investigation of the current scene graph research. More specifically, we first summarized the general definition of the scene graph, then conducted a comprehensive and systematic discussion on the generation method of the scene graph (SGG) and the SGG with the aid of prior knowledge. We then investigated the main applications of scene graphs and summarized the most commonly used datasets. Finally, we provide some insights into the future development of scene graphs. We believe this will be a very helpful foundation for future research on scene graphs.

preprint | 2022 | arXiv

A physical perturbation based study on the prediction of free-fall disks with chaotic modes in the water

We report a phenomenon that physical perturbations sometimes can benefit the certainty of a free-fall motion with chaotic modes, albeit, as commonly believed, they can ruin it. We statistically compare those factors that may lead to uncertainty, by which we find that the growth of the standard deviation of the landing locations is directly determined by the physical perturbations. A significant yardstick is defined in the meantime. This temporal criterion is of big relevance to the replicability of such problems experimentally, although they are inherently chaotic. Our hypothesis is verified by experiments from other literature. This outcome also provides a practical strategy to evaluate the credible prediction time by estimating the disturbances from physical parameters as a priori.

preprint | 2022 | arXiv

Where Outflows Meet Inflows: Gas Kinematics in SSA22 Lyman-$α$ Blob 2 Decoded by Advanced Radiative Transfer Modelling

We present new spectroscopic observations of Lyman-$α$ (Ly$α$) Blob 2 ($z \sim$ 3.1). We observed extended Ly$α$ emission in three distinct regions, where the highest Ly$α$ surface brightness (SB) center is far away from the known continuum sources. We searched through the MOSFIRE slits that cover the high Ly$α$ SB regions, but were unable to detect any significant nebular emission near the highest SB center. We further mapped the flux ratio of the blue peak to the red peak and found it is anti-correlated with Ly$α$ SB with a power-law index of $\sim$ -0.4. We used radiative transfer models with both multiphase, clumpy and shell geometries and successfully reproduced the diverse Ly$α$ morphologies. We found that most spectra suggest outflow-dominated kinematics, while 4/15 spectra imply inflows. A significant correlation exists between parameter pairs, and the multiphase, clumpy model may alleviate previously reported discrepancies. We also modeled Ly$α$ spectra at different positions simultaneously and found that the variation of the inferred clump outflow velocities can be approximately explained by line-of-sight projection effects. Our results support the `central powering + scattering' scenario, i.e. the Ly$α$ photons are generated by a central powering source and then scatter with outflowing, multiphase HI gas while propagating outwards. The infalling of cool gas near the blob outskirts shapes the observed blue-dominated Ly$α$ profiles, but its energy contribution to the total Ly$α$ luminosity is less than 10%, i.e. minor compared to the photo-ionization by star-forming galaxies and/or AGNs.

preprint | 2021 | arXiv

A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions

Deep learning has made substantial breakthroughs in many fields due to its powerful automatic representation capabilities. It has been proven that neural architecture design is crucial to the feature representation of data and the final performance. However, the design of the neural architecture heavily relies on the researchers' prior knowledge and experience. And due to the limitations of humans' inherent knowledge, it is difficult for people to jump out of their original thinking paradigm and design an optimal model. Therefore, an intuitive idea would be to reduce human intervention as much as possible and let the algorithm automatically design the neural architecture. Neural Architecture Search (NAS) is just such a revolutionary algorithm, and the related research work is complicated and rich. Therefore, a comprehensive and systematic survey on the NAS is essential. Previously related surveys have begun to classify existing work mainly based on the key components of NAS: search space, search strategy, and evaluation strategy. While this classification method is more intuitive, it is difficult for readers to grasp the challenges and the landmark work involved. Therefore, in this survey, we provide a new perspective: beginning with an overview of the characteristics of the earliest NAS algorithms, summarizing the problems in these early NAS algorithms, and then providing solutions for subsequent related research work. Besides, we conduct a detailed and comprehensive analysis, comparison, and summary of these works. Finally, we provide some possible future research directions.

preprint | 2021 | arXiv

Accurate predictions of chaotic motion of a free fall disk

It is important to know the accurate trajectory of a free fall object in fluid (such as a spacecraft), whose motion might be chaotic in many cases. However, it is impossible to accurately predict its chaotic trajectory in a long enough duration by traditional numerical algorithms in double precision. In this paper, we give the accurate predictions of the same problem by a new strategy, namely the Clean Numerical Simulation (CNS). Without loss of generality, a free fall disk in water is considered, whose motion is governed by the Andersen-Pesavento-Wang model. We illustrate that convergent and reliable trajectories of a chaotic free fall disk in a long enough interval of time can be obtained by means of the CNS, but different traditional algorithms in double precision give disparate trajectories. Besides, unlike the traditional algorithms in double precision, the CNS can predict the accurate posture of the free fall disk near the vicinity of the bifurcation point of some physical parameters in a long duration. Therefore, the CNS can provide reliable prediction of chaotic systems in a long enough interval of time.

preprint | 2021 | arXiv

Deciphering the Lyman-$α$ Emission Line: Towards the Understanding of Galactic Properties Extracted from Ly$α$ Spectra via Radiative Transfer Modeling

Existing ubiquitously in the Universe with the highest luminosity, the Lyman-$α$ emission line encodes abundant physical information about the gaseous medium it interacts with. Nevertheless, the resonant nature of Ly$α$ complicates the radiative transfer (RT) modeling of the line profile, making the extraction of physical properties of the surrounding gaseous medium notoriously difficult. In this paper, we revisit the problem of deciphering the Ly$α$ emission line with RT modeling. We reveal intrinsic parameter degeneracies in the widely-used shell model in the optically thick regime for both static and outflowing cases, which suggest the limitations of the model. We have also explored the connection between the more physically realistic multiphase, clumpy model and the shell model. We find that the parameters of a ``very clumpy'' slab model and the shell model have the following correspondences: (1) the total column density of the clumpy slab model is equal to the HI column density of the shell model; (2) the effective temperature of the clumpy slab model, which incorporates the clump velocity dispersion, is equal to the effective temperature of the shell model; (3) the average radial clump outflow velocity is equal to the shell expansion velocity; (4) large intrinsic line widths are required in the shell model to reproduce the wings of the clumpy slab models; (5) adding another phase of hot inter-clump medium will increase peak separation, and the fitted shell expansion velocity lies between the outflow velocities of two phases of gas. Our results provide a viable solution to the major discrepancies associated with Ly$α$ fitting reported in previous literature, and emphasize the importance of utilizing information from additional observations to break the intrinsic degeneracies as well as interpreting the model parameters in a more physically realistic context.

preprint | 2021 | arXiv

Projective robustness for quantum channels and measurements and their operational significance

Recently, the projective robustness of quantum states has been introduced in [arXiv:2109.04481 (2021)]. It shows that the projective robustness is a useful resource monotone and can comprehensively characterize capabilities and limitations of probabilistic protocols manipulating quantum resources deterministically. In this paper, we will extend the projective robustness to any convex resource theories of quantum channels and measurements. First, we introduce the projective robustness of quantum channels and prove that it satisfies some good properties, especially sub- or supermultiplicativity under any free quantum process. Moreover, we use the projective robustness of channels to give lower bounds on the errors and overheads in any channel resource distillation. Meanwhile, we show that the projective robustness of channels quantifies the maximal advantage that a given channel outperforms all free channels in simultaneous discrimination and exclusion of a fixed state ensemble. Second, we define the projective robustness of quantum measurements and prove that it exactly quantifies the maximal advantage that a given measurement provides over all free measurements in simultaneous discrimination and exclusion of two fixed state ensembles. Finally, within a specific channel resource setting based on measurement incompatibility, we show that the projective robustness of quantum channels coincides with the projective robustness of measurement incompatibility.

preprint | 2021 | arXiv

Revisiting the Gas Kinematics in SSA22 Lyman-$α$ Blob 1 with Radiative Transfer Modeling in a Multiphase, Clumpy Medium

We present new observations of Lyman-$α$ (Ly$α$) Blob 1 (LAB1) in the SSA22 protocluster region ($z$ = 3.09) using the Keck Cosmic Web Imager (KCWI) and the Keck Multi-object Spectrometer for Infrared Exploration (MOSFIRE). By applying matched filtering to the KCWI datacube, we have created a narrow-band Ly$α$ image and identified several prominent features. By comparing the spatial distributions and intensities of Ly$α$ and H$β$, we find that recombination of photo-ionized HI gas followed by resonant scattering is sufficient to explain all the observed Ly$α$/H$β$ ratios. We further decode the spatially-resolved Ly$α$ profiles using both moment maps and Monte-Carlo radiative transfer (MCRT) modeling. By fitting a set of multiphase, 'clumpy' models to the observed Ly$α$ profiles, we are able to reasonably constrain many parameters, namely the HI number density in the inter-clump medium (ICM), the cloud volume filling factor, the random velocity and outflow velocity of the clumps, the HI outflow velocity of the ICM and the local systemic redshift. Our model has successfully reproduced the diverse Ly$α$ morphologies at different locations, and the main results are: (1) The observed Ly$α$ spectra require relatively few clumps per line-of-sight as they have significant fluxes at the line center; (2) The velocity dispersion of the clumps yields a significant broadening of the spectra as observed; (3) The clump bulk outflow can also cause additional broadening if the HI in the ICM is optically thick; (4) The HI in the ICM is responsible for the absorption feature close to the Ly$α$ line center.

preprint | 2020 | arXiv

SN2019dge: a Helium-rich Ultra-Stripped Envelope Supernova

We present observations of ZTF18abfcmjw (SN2019dge), a helium-rich supernova with a fast-evolving light curve indicating an extremely low ejecta mass ($\approx 0.3\,M_\odot$) and low kinetic energy ($\approx 1.2\times 10^{50}\,{\rm erg}$). Early-time (<4 d after explosion) photometry reveals evidence of shock cooling from an extended helium-rich envelope of $\sim0.1\,M_\odot$ located at $\sim 3\times 10^{12}\,{\rm cm}$ from the progenitor. Early-time He II line emission and subsequent spectra show signatures of interaction with helium-rich circumstellar material, which extends from $\gtrsim 5\times 10^{13}\,{\rm cm}$ to $\gtrsim 2\times 10^{16}\,{\rm cm}$. We interpret SN2019dge as a helium-rich supernova from an ultra-stripped progenitor, which originates from a close binary system consisting of a mass-losing helium star and a low-mass main sequence star or a compact object (i.e., a white dwarf, a neutron star, or a black hole). We infer that the local volumetric birth rate of 19dge-like ultra-stripped SNe is in the range of 1400--8200$\,{\rm Gpc^{-3}\, yr^{-1}}$ (i.e., 2--12% of core-collapse supernova rate). This can be compared to the observed coalescence rate of compact neutron star binaries that are not formed by dynamical capture.

preprint | 2019 | arXiv

On the Survival of Cool Clouds in the Circum-Galactic Medium

We explore the survival of cool clouds in multi-phase circum-galactic media. We revisit the "cloud crushing problem" in a large survey of simulations including radiative cooling, self-shielding, self-gravity, magnetic fields, and anisotropic Braginskii conduction and viscosity (with saturation). We explore a wide range of parameters including cloud size, velocity, ambient temperature and density, as well as a variety of magnetic field configurations and cloud turbulence. We find that realistic magnetic fields and turbulence have weaker effects on cloud survival; the most important physics is radiative cooling and conduction. Self-gravity and self-shielding are important for clouds which are initially Jeans-unstable, but largely irrelevant otherwise. Non-self-gravitating, realistically magnetized clouds separate into four regimes: (1) At low column densities, clouds evaporate rapidly via conduction. (2) A "failed pressure confinement" regime, where the ambient hot gas cools too rapidly to provide pressure confinement for the cloud. (3) An "infinitely long-lived" regime, in which the cloud lifetime becomes longer than the cooling time of gas swept up in the leading bow shock, so the cloud begins to accrete and grow. (4) A "classical cloud destruction" regime, where clouds are eventually destroyed by instabilities. In the final regime, the cloud lifetime can exceed the naive cloud-crushing time owing to conduction-induced compression. However, small and/or slow-moving clouds can also evaporate more rapidly than the cloud-crushing time. We develop simple analytic models that explain the simulated cloud destruction times in this regime.