Researcher profile

Ziwei Li

Ziwei Li contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
5 topics
4 close collaborators

Published work

5 published item(s)

preprint | 2026 | arXiv

Gradient Starvation in Binary-Reward GRPO: Why Group-Mean Centering Fails and Why the Simplest Fix Works

Group Relative Policy Optimization (GRPO) is a standard algorithm for reinforcement learning from verifiable rewards, but its group-mean-centered advantage can fail under binary rewards. The failure mode is gradient starvation: when every response in a group is correct or every response is wrong, the centered advantage is exactly zero and the policy receives no learning signal. We prove that the true degeneracy rate always exceeds the i.i.d. Bernoulli prediction by Jensen's inequality, and observe a 0.69 degeneracy rate at group size four in logged Qwen3.5-9B GSM8K training. We then show that the fixed-reference Sign advantage, $A=2r-1$, performs pass@$G$ failure descent by increasing the probability that at least one sample in the group succeeds. On the full GSM8K test set across seven seeds, Sign reaches 73.8% accuracy versus 28.4% for standard normalized group-mean DrGRPO at group size four, a 45.4 point gain with $p<0.0001$. The effect is directionally consistent on Llama-3.1-8B and positive but underpowered on a MATH-500 transfer check. Pass@$k$ analysis indicates that the main benefit is search compression rather than large capacity expansion, aligning the empirical gains with recent RLVR ceiling observations.
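The starvation mechanism described in this abstract can be illustrated with a minimal sketch. It assumes plain group-mean centering with no standard-deviation normalization, and it assumes that the Jensen argument refers to per-prompt success probabilities that vary across prompts. The function names and the example probabilities are hypothetical and are not taken from the paper.

```python
def group_mean_advantage(rewards):
    """Group-mean-centered advantage: A_i = r_i - mean(r)."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]


def sign_advantage(rewards):
    """Fixed-reference Sign advantage from the abstract: A = 2r - 1."""
    return [2 * r - 1 for r in rewards]


def degeneracy_rate(p, group_size):
    """Probability that all i.i.d. Bernoulli(p) rewards in a group agree."""
    return p ** group_size + (1 - p) ** group_size


def mixed_degeneracy_rate(success_probs, group_size):
    """Average degeneracy rate over prompts with different success rates."""
    rates = [degeneracy_rate(p, group_size) for p in success_probs]
    return sum(rates) / len(rates)


# An all-correct group gives zero centered advantage, so no learning signal.
all_correct = [1, 1, 1, 1]
print(group_mean_advantage(all_correct))  # every entry is 0.0
print(sign_advantage(all_correct))        # every entry is 1

# Hypothetical prompt mix: half easy (p = 0.9), half hard (p = 0.1).
# The average success rate is 0.5, but p**G + (1 - p)**G is convex in p,
# so the mixed rate is at least the rate predicted from the average p.
print(degeneracy_rate(0.5, 4))
print(mixed_degeneracy_rate([0.1, 0.9], 4))
```

With these hypothetical numbers the pooled prediction at group size four is 0.125, while the mixed rate is roughly 0.656, which shows the direction of the gap the abstract attributes to Jensen's inequality.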

preprint | 2021 | arXiv

Tropical precipitation clusters as islands on a rough water-vapor topography

Tropical precipitation clusters exhibit power-law frequency distributions in area and volume (integrated precipitation), implying a lack of characteristic scale in tropical convective organization. However, it remains unknown what gives rise to the power laws and how the power-law exponents for area and volume are related to one another. Here, we explore the perspective that precipitation clusters are islands above a convective threshold on a rough column-water-vapor (CWV) topography. This perspective is supported by the agreement between the precipitation clusters and CWV islands in their frequency distributions as well as fractal dimensions. Power laws exist for CWV islands at different thresholds through the CWV topography, suggesting that the existence of power laws is not specifically related to local precipitation dynamics, but is rather a general feature of CWV islands. Furthermore, the frequency distributions and fractal dimensions of the clusters can be reproduced when the CWV field is modeled to be self-affine with a roughness exponent of 0.3. Self-affine scaling theory relates the statistics of precipitation clusters to the roughness exponent; it also relates the power-law slopes for area and volume without involving the roughness exponent. Thus, the perspective of precipitation clusters as CWV islands provides a useful framework to consider many statistical properties of the precipitation clusters, particularly given that CWV is well-observed over a wide range of length scales in the tropics. However, the statistics of CWV islands at the convective threshold imply a smaller roughness than is inferred from the power spectrum of the bulk CWV field, and further work is needed to understand the scaling of the CWV field.
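The "islands above a threshold" picture can be sketched as connected-component labeling on a gridded field. This is only an illustration under stated assumptions: the grid values and the threshold are made up, connectivity is taken as four-neighbor, and an island's "volume" is defined here as the summed excess above the threshold, which may differ from the definition used in the paper.

```python
def find_islands(field, threshold):
    """Return (area, volume) for each connected region above the threshold.

    Area is the number of grid cells in the island. Volume is the summed
    excess of the field above the threshold (an assumed definition).
    """
    rows, cols = len(field), len(field[0])
    seen = [[False] * cols for _ in range(rows)]
    islands = []
    for i in range(rows):
        for j in range(cols):
            if seen[i][j] or field[i][j] <= threshold:
                continue
            # Flood fill from this cell using an explicit stack.
            stack = [(i, j)]
            seen[i][j] = True
            area, volume = 0, 0.0
            while stack:
                y, x = stack.pop()
                area += 1
                volume += field[y][x] - threshold
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and field[ny][nx] > threshold:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            islands.append((area, volume))
    return islands


# Hypothetical column-water-vapor values on a tiny grid (illustrative only).
cwv = [
    [60, 62, 40, 30],
    [61, 45, 35, 58],
    [30, 33, 36, 59],
]
print(find_islands(cwv, threshold=55))
```

Repeating this over many thresholds on a large field and histogramming the areas and volumes is the kind of analysis in which the power-law distributions described above would appear.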

preprint | 2020 | arXiv

Deep Learning for Strong Lensing Search: Tests of the Convolutional Neural Networks and New Candidates from KiDS DR3

Convolutional Neural Networks have been successfully applied in searching for strong lensing systems, leading to discoveries of new candidates from large surveys. On the other hand, systematic investigations about their robustness are still lacking. In this paper, we first construct a neural network, and apply it to $r$-band images of Luminous Red Galaxies (LRGs) of the Kilo Degree Survey (KiDS) Data Release 3 to search for strong lensing systems. We build two sets of training samples, one fully from simulations, and the other one using the LRG stamps from KiDS observations as the foreground lens images. With the former training sample, we find 48 high probability candidates after human inspection, and among them, 27 are newly identified. Using the latter training set, about 67% of the aforementioned 48 candidates are also found, and there are 11 more new strong lensing candidates identified. We then carry out tests on the robustness of the network performance with respect to the variation of PSF. With the testing samples constructed using PSF in the range of 0.4 to 2 times the median PSF of the training sample, we find that our network performs rather stably, and the degradation is small. We also investigate how the volume of the training set can affect our network performance by varying it from 0.1 million to 0.8 million. The output results are rather stable, showing that within the considered range, our network performance is not very sensitive to the size of the training set.

preprint | 2020 | arXiv

Response of Vertical Velocities in Extratropical Precipitation Extremes to Climate Change

Precipitation extremes intensify in most regions in climate-model projections. Changes in vertical velocities contribute to the changes in intensity of precipitation extremes but remain poorly understood. Here, we find that mid-tropospheric vertical velocities in extratropical precipitation extremes strengthen overall in simulations of 21st-century climate change. For each extreme event, we solve the quasi-geostrophic omega equation to decompose this strengthening into different physical contributions. We first consider a dry decomposition in which latent heating is treated as an external forcing of upward motion. Much of the positive contribution to upward motion from increased latent heating is offset by negative contributions from increases in dry static stability and changes in the horizontal length scale of vertical velocities. However, taking changes in latent heating as given is a limitation when the aim is to understand changes in precipitation, since latent heating and precipitation are closely linked. Therefore, we also perform a moist decomposition of the changes in vertical velocities in which latent heating is represented through a moist static stability. In the moist decomposition, changes in moist static stability play a key role and contributions from other factors such as changes in the depth of the upward motion increase in importance. While both dry and moist decompositions are self-consistent, the moist dynamical perspective has greater potential to give insights into the causes of the dynamical contributions to changes in precipitation extremes in different regions.