Researcher profile

Yuan Sui

Yuan Sui contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2026arXiv

TACT: Mitigating Overthinking and Overacting in Coding Agents via Activation Steering

When language model agents tackle complex software engineering tasks, they often degrade over long trajectories, which we define as *agent drift*. We focus on two recurring failure modes *overthinking* and *overacting*, i.e., where the agent repeatedly reasons over information it already has, and where it issues tool calls without integrating recent observations or acquiring new evidence. In this paper, we introduce TACT (Think-Act Calibration via activation Steering), to detect and mitigate agent drift in the residual stream before it surfaces as a behavioral failure. In specific, we label trajectory steps as overthinking, overacting, or calibrated, and find that their hidden states can separate linearly along two *drift axes*, pointing from calibrated behavior toward each failure mode (AUC $\approx$ 0.9). To mitigate agent drift, we project each step's activation onto these axes at test time and pull drifted ones back toward the calibrated region. Experiments show that TACT outperforms unsteered baselines across SWE-bench Verified, Terminal-Bench 2.0, and CLAW-Eval, lifting average resolve rate by $+5.8$ pp on Qwen3.5-27B and $+4.8$ pp on Gemma-4-26B-A4B-it while cutting steps-to-resolve by up to $26\%$. These gains frame agent drift as a steerable direction in the residual stream, and position TACT as a viable handle for reliable long-horizon agents.

preprint2026arXiv

What-If Analysis of Large Language Models: Explore the Game World Using Proactive Thinking

LLMs struggle with decision-making in high-stakes environments like MOBA games, primarily due to a lack of proactive reasoning and limited understanding of complex game dynamics. To address this, we propose What-if Analysis LLM (WiA-LLM), a framework that trains an LLM as an explicit, language-based world model. Instead of representing the environment in latent vectors, WiA-LLM uses natural language to simulate how the game state evolves over time in response to candidate actions, and provides textual justifications for these predicted outcomes. WiA-LLM is trained in two stages: supervised fine-tuning on human-like reasoning traces, followed by reinforcement learning with outcome-based rewards based on the alignment between predicted and actual future states. In the Honor of Kings (HoK) environment, WiA-LLM attains 74.2\% accuracy (27\%$\uparrow$ vs. base model) in forecasting game-state changes. In addition, WiA-LLM demonstrate strategic behavior more closely aligned with expert players than purely reactive LLMs, indicating enhanced foresight and expert-like decision-making.

preprint2022arXiv

Optimization simulation of reflow welding based on prediction of regional center temperature field

Before reflow soldering of integrated electronic products, the numerical simulation of temperature control curve of reflow furnace is crucial for selecting proper parameters and improving the overall efficiency of reflow soldering process and product quality. According to the heat conduction law and the specific heat capacity formula, the first-order ordinary differential equation of the central temperature curve of the welding area with respect to the temperature distribution function in the furnace on the conveyor belt displacement is obtained. For the gap with small temperature difference, the sigmoid function is used to obtain a smooth interval temperature transition curve; For the gap with large temperature difference, the linear combination of exponential function and primary function is used to approach the actual concave function, so as to obtain the complete temperature distribution function in the furnace. The welding parameters are obtained by solving the ordinary differential equation, and a set of optimal process parameters consistent with the process boundary are obtained by calculating the mean square error between the predicted temperature field and the real temperature distribution. At the same time, a set of reflow optimization strategies are designed for speed interval prediction strategy, minimum parameter interval prediction strategy, and the most symmetrical parameter interval prediction of solder paste melting reflow area. The simulation results show that the temperature field prediction results obtained by this method are highly consistent with the actual sensor data, and have strong correlation. This method can help to select appropriate process parameters, optimize the production process, reduce equipment commissioning practice and optimize the solder joint quality of production products.