Researcher profile

Liu Kang

Liu Kang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
4 topics
4 close collaborators

Published work

2 published items

Preprint | 2026 | arXiv

Outcome-Grounded Advantage Reshaping for Fine-Grained Credit Assignment in Mathematical Reasoning

Group Relative Policy Optimization (GRPO) has emerged as a promising critic-free reinforcement learning paradigm for reasoning tasks. However, standard GRPO employs a coarse-grained credit assignment mechanism that propagates group-level rewards uniformly to every token in a sequence, neglecting the varying contribution of individual reasoning steps. We address this limitation by introducing Outcome-grounded Advantage Reshaping (OAR), a fine-grained credit assignment mechanism that redistributes advantages based on how much each token influences the model's final answer. We instantiate OAR via two complementary strategies: (1) OAR-P, which estimates outcome sensitivity through counterfactual token perturbations, serving as a high-fidelity attribution signal; (2) OAR-G, which uses an input-gradient sensitivity proxy to approximate the influence signal with a single backward pass. These importance signals are integrated with a conservative Bi-Level advantage reshaping scheme that suppresses low-impact tokens and boosts pivotal ones while preserving the overall advantage mass. Empirical results on extensive mathematical reasoning benchmarks demonstrate that while OAR-P sets the performance upper bound, OAR-G achieves comparable gains with negligible computational overhead, both significantly outperforming a strong GRPO baseline, pushing the boundaries of critic-free LLM reasoning.
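The reshaping idea in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: the abstract gives no formula, so the two-level weights (`low_scale`, `high_scale`), the mean-importance threshold, and the function name `reshape_advantages` are all assumptions made for illustration. It only shows one way to damp low-importance tokens and boost high-importance ones while keeping the summed advantage unchanged.

```python
def reshape_advantages(advantage, importance, low_scale=0.5, high_scale=1.5):
    """Redistribute one sequence-level advantage across tokens.

    advantage  : scalar advantage that plain GRPO would give every token
    importance : non-negative per-token importance scores (hypothetical input,
                 e.g. from a perturbation or gradient-based estimate)
    low_scale, high_scale : illustrative two-level weights, not from the paper
    """
    n = len(importance)
    if n == 0:
        return []
    mean_imp = sum(importance) / n
    # Two levels: tokens at or above mean importance are boosted,
    # all others are damped.
    weights = [high_scale if s >= mean_imp else low_scale for s in importance]
    # Rescale so the weights average to 1; the per-token advantages then
    # sum to n * advantage, the same total as the uniform assignment.
    norm = sum(weights) / n
    return [advantage * w / norm for w in weights]


# Made-up importance scores for a four-token sequence.
token_adv = reshape_advantages(1.0, [0.1, 0.9, 0.2, 0.8])
```

Because the weights are renormalised to average 1, the sign of the advantage is never flipped and its total is conserved; only its distribution over tokens changes.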

Preprint | 2020 | arXiv

Detection and characterisation of oscillating red giants: first results from the TESS satellite

Since the onset of the 'space revolution' of high-precision high-cadence photometry, asteroseismology has been demonstrated as a powerful tool for informing Galactic archaeology investigations. The launch of the NASA TESS mission has enabled seismic-based inferences to go full sky -- providing a clear advantage for large ensemble studies of the different Milky Way components. Here we demonstrate its potential for investigating the Galaxy by carrying out the first asteroseismic ensemble study of red giant stars observed by TESS. We use a sample of 25 stars for which we measure their global asteroseismic observables and estimate their fundamental stellar properties, such as radius, mass, and age. Significant improvements are seen in the uncertainties of our estimates when combining seismic observables from TESS with astrometric measurements from the Gaia mission compared to when the seismology and astrometry are applied separately. Specifically, when combined we show that stellar radii can be determined to a precision of a few percent, masses to 5-10% and ages to the 20% level. This is comparable to the precision typically obtained using end-of-mission Kepler data.
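As a rough illustration of how global asteroseismic observables map to stellar properties, the sketch below applies the standard asteroseismic scaling relations for radius and mass. The abstract does not say which method the authors used, so this is an assumption for illustration only. The solar reference values are commonly quoted ones and vary slightly between studies, and the function name `scaling_radius_mass` is hypothetical.

```python
# Solar reference values (assumed; different studies adopt slightly different ones).
NU_MAX_SUN = 3090.0    # frequency of maximum oscillation power, microhertz
DELTA_NU_SUN = 135.1   # large frequency separation, microhertz
TEFF_SUN = 5777.0      # effective temperature, kelvin


def scaling_radius_mass(nu_max, delta_nu, teff):
    """Return (radius, mass) in solar units from the scaling relations.

    R/Rsun = (nu_max/nu_max_sun) * (delta_nu/delta_nu_sun)**-2 * (Teff/Teff_sun)**0.5
    M/Msun = (nu_max/nu_max_sun)**3 * (delta_nu/delta_nu_sun)**-4 * (Teff/Teff_sun)**1.5
    """
    a = nu_max / NU_MAX_SUN
    b = delta_nu / DELTA_NU_SUN
    c = teff / TEFF_SUN
    radius = a * b ** -2 * c ** 0.5
    mass = a ** 3 * b ** -4 * c ** 1.5
    return radius, mass


# Made-up inputs in the range typical of a red giant, not values from the paper.
radius, mass = scaling_radius_mass(nu_max=30.0, delta_nu=4.0, teff=4800.0)
```

Mass depends on the third and fourth powers of the observables, so its relative uncertainty is inflated compared with that of radius. An independent radius constraint, such as one derived from a Gaia parallax, is one way such estimates can be tightened.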