Researcher profile

Wu Li

Wu Li contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
4 topics
4 close collaborators

Published work

2 published items

Preprint · 2026 · arXiv

Team-Based Self-Play With Dual Adaptive Weighting for Fine-Tuning LLMs

While recent self-training approaches have reduced reliance on human-labeled data for aligning LLMs, they still face critical limitations: (i) sensitivity to synthetic data quality, leading to instability and bias amplification in iterative training; (ii) ineffective optimization due to a diminishing gap between positive and negative responses over successive training iterations. In this paper, we propose Team-based self-Play with dual Adaptive Weighting (TPAW), a novel self-play algorithm designed to improve alignment in a fully self-supervised setting. TPAW adopts a team-based framework in which the current policy model both collaborates with and competes against historical checkpoints, promoting more stable and efficient optimization. To further enhance learning, we design two adaptive weighting mechanisms: (i) a response reweighting scheme that adjusts the importance of target responses, and (ii) a player weighting strategy that dynamically modulates each team member's contribution during training. Initialized from an SFT model, TPAW iteratively refines alignment without requiring additional human supervision. Experimental results demonstrate that TPAW consistently outperforms existing baselines across various base models and LLM benchmarks. Our code is publicly available at https://github.com/lab-klc/TPAW.

Preprint · 2021 · arXiv

Raman Linewidth Contributions from Four-Phonon and Electron-Phonon Interactions in Graphene

The Raman peak position and linewidth provide insight into phonon anharmonicity and electron-phonon interactions (EPI) in materials. For monolayer graphene, prior first-principles calculations have yielded decreasing linewidth with increasing temperature, which is opposite to measurement results. Here, we explicitly consider four-phonon anharmonicity, phonon renormalization, and electron-phonon coupling, and find all to be important to successfully explain both the $G$ peak frequency shift and linewidths in our suspended graphene sample over a wide temperature range. Four-phonon scattering contributes a prominent linewidth that increases with temperature, while temperature dependence from EPI is found to be reversed above a doping threshold ($\hbar\omega_G/2$, with $\omega_G$ being the frequency of the $G$ phonon).