Beneficial Reasoning Behaviors in Agentic Search and Effective Post-training to Obtain Them

Agentic search requires large language models (LLMs) to perform multi-step search to solve complex information-seeking tasks, imposing unique challenges on their reasoning capabilities. However, what constitutes effective reasoning for agentic search, and how it can be learned, remains unclear. In this work, we first investigate the reasoning behaviors that enable success in agentic search. By comparing successful and failed trajectories via an LLM-based analysis pipeline, we identify four beneficial behaviors: Information Verification, Authority Evaluation, Adaptive Search, and Error Recovery. Building on this, we propose Behavior Priming, a training approach that equips agentic search models with these reasoning behaviors before reinforcement learning (RL). Specifically, it first performs supervised fine-tuning (SFT) on collected trajectories that exhibit the identified behaviors, and then applies standard RL to further improve task performance. Experiments on Qwen3-1.7B and Llama3.2-3B-Instruct show that Behavior Priming yields relative improvements over direct RL of 37.2% on three web benchmarks and 6.2% on seven multi-hop QA benchmarks, and outperforms the SFT-then-RL baseline that uses outcome-correct trajectories for fine-tuning. Crucially, we show that these reasoning behaviors matter more than outcome correctness in the priming stage prior to RL. Further analysis reveals that Behavior Priming enhances exploration (pass@8) and test-time scaling (search step number), providing a robust foundation for RL. Our code is available at https://github.com/cxcscmu/Behavior-Priming-for-Agentic-Search.
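The contrast the abstract draws between priming on behavior-exhibiting trajectories and priming on outcome-correct trajectories can be illustrated with a minimal sketch. This is not the authors' code: the `Trajectory` record, the behavior label strings, the assumption that an LLM judge has already attached behavior labels, and the rule that a trajectory must show all four behaviors are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass

# The four behaviors named in the abstract; the label strings are hypothetical.
BEHAVIORS = frozenset({
    "information_verification",
    "authority_evaluation",
    "adaptive_search",
    "error_recovery",
})


@dataclass
class Trajectory:
    """Hypothetical record for one search trajectory (not the authors' schema)."""
    question: str
    answer_correct: bool
    # Behavior labels, assumed to have been assigned by an LLM-based judge.
    behaviors: frozenset = frozenset()


def select_behavior_primed(trajectories):
    """Keep trajectories showing all four behaviors, regardless of correctness.

    Requiring all four is an assumption of this sketch.
    """
    return [t for t in trajectories if BEHAVIORS <= t.behaviors]


def select_outcome_correct(trajectories):
    """Baseline selection: keep trajectories whose final answer is correct."""
    return [t for t in trajectories if t.answer_correct]


# Toy pool: q3 is wrong but shows every behavior; q2 is right but shows one.
pool = [
    Trajectory("q1", True, BEHAVIORS),
    Trajectory("q2", True, frozenset({"adaptive_search"})),
    Trajectory("q3", False, BEHAVIORS),
]

primed_sft_set = select_behavior_primed(pool)
baseline_sft_set = select_outcome_correct(pool)
```

In this toy pool the behavior-based selection keeps `q1` and `q3`, while the outcome-based baseline keeps `q1` and `q2`. Either set would then be used for SFT before the RL stage.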

Preprint, 2026, arXiv, open access

3 authors, 2 topics