Source author record

Yong Mao

Yong Mao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence, physics.class-ph, Biological Physics, Computation and Language, Computer Vision, cond-mat.soft, cond-mat.stat-mech, Machine Learning, Multiagent Systems, nlin.AO, physics.flu-dyn, physics.soc-ph

Catalog footprint

What is connected

6 works

12 topics

4 close collaborators

preprint, 2026, arXiv

SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents

Agentic reinforcement learning (RL) holds great promise for the development of autonomous agents under complex GUI tasks, but its scalability remains severely hampered by the verification of task completion. Existing task verification is treated as a passive, post-hoc process: a verifier (i.e., rule-based scoring script, reward or critic model, and LLM-as-a-Judge) analyzes the agent's entire interaction trajectory to determine if the agent succeeds. Such processing of verbose context that contains irrelevant, noisy history poses challenges to the verification protocols and therefore leads to prohibitive cost and low reliability. To overcome this bottleneck, we propose SmartSnap, a paradigm shift from this passive, post-hoc verification to proactive, in-situ self-verification by the agent itself. We introduce the Self-Verifying Agent, a new type of agent designed with dual missions: to not only complete a task but also to prove its accomplishment with curated snapshot evidences. Guided by our proposed 3C Principles (Completeness, Conciseness, and Creativity), the agent leverages its accessibility to the online environment to perform self-verification on a minimal, decisive set of snapshots. Such evidences are provided as the sole materials for a general LLM-as-a-Judge verifier to determine their validity and relevance. Experiments on mobile tasks across model families and scales demonstrate that our SmartSnap paradigm allows training LLM-driven agents in a scalable manner, bringing performance gains up to 26.08% and 16.66% respectively to 8B and 30B models. The synergizing between solution finding and evidence seeking facilitates the cultivation of efficient, self-verifying agents with competitive performance against DeepSeek V3.1 and Qwen3-235B-A22B. Code is available at: https://github.com/TencentYoutuResearch/SmartSnap

preprint, 2025, arXiv

Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

Existing Large Language Model (LLM) agent frameworks face two significant challenges: high configuration costs and static capabilities. Building a high-quality agent often requires extensive manual effort in tool integration and prompt engineering, while deployed agents struggle to adapt to dynamic environments without expensive fine-tuning. To address these issues, we propose Youtu-Agent, a modular framework designed for the automated generation and continuous evolution of LLM agents. Youtu-Agent features a structured configuration system that decouples execution environments, toolkits, and context management, enabling flexible reuse and automated synthesis. We introduce two generation paradigms: a Workflow mode for standard tasks and a Meta-Agent mode for complex, non-standard requirements, capable of automatically generating tool code, prompts, and configurations. Furthermore, Youtu-Agent establishes a hybrid policy optimization system: (1) an Agent Practice module that enables agents to accumulate experience and improve performance through in-context optimization without parameter updates; and (2) an Agent RL module that integrates with distributed training frameworks to enable scalable and stable reinforcement learning of any Youtu-Agents in an end-to-end, large-scale manner. Experiments demonstrate that Youtu-Agent achieves state-of-the-art performance on WebWalkerQA (71.47%) and GAIA (72.8%) using open-weight models. Our automated generation pipeline achieves over 81% tool synthesis success rate, while the Practice module improves performance on AIME 2024/2025 by +2.7% and +5.4% respectively. Moreover, our Agent RL training achieves 40% speedup with steady performance improvement on 7B LLMs, enhancing coding/reasoning and searching capabilities respectively up to 35% and 21% on Maths and general/multi-hop QA benchmarks.

preprint, 2016, arXiv

Optimal counter-current exchange networks

We present a general analysis of exchange devices linking their efficiency to the geometry of the exchange surface and supply network. For certain parameter ranges, we show that the optimal exchanger consists of densely packed pipes which can span a thin sheet of large area (an 'active layer'), which may be crumpled into a fractal surface and supplied with a fractal network of pipes. We derive the efficiencies of such exchangers, showing the potential for significant gains compared to regular exchangers (where the active layer is flat), using parameters relevant for biological systems.

preprint, 2015, arXiv

Does good memory help you win games?

We present a simple game model where agents with different memory lengths compete for finite resources. We show by simulation and analytically that an instability exists at a critical memory length, and as a result, different memory lengths can compete and co-exist in a dynamical equilibrium. Our analytical formulation makes a connection to statistical urn models, and we show that temperature is mirrored by the agent's memory. Our analysis is easily generalisable to many other game models with implications that we briefly discuss.

preprint, 2013, arXiv

Optimisation of fractal spaceframes under gentle compressive load

The principle of hierarchical design is a prominent theme in many natural systems where mechanical efficiency is of importance. Here we establish the properties of a particular hierarchical structure, showing that high mechanical efficiency is found in certain loading regimes. We show that in the limit of gentle loading, the optimal hierarchical order increases without bound. We show that the scaling of material required for stability against loading to be withstood can be altered in a systematic, beneficial manner through manipulation of the number of structural length scales optimised upon. We establish the relationship between the Hausdorff dimension of the optimal structure and loading for which the structure is optimised. Practicalities of fabrication are discussed and examples of hierarchical frames of the same geometry constructed from solid beams are shown.

preprint, 2010, arXiv

Bifurcations in the optimal elastic foundation for a buckling column

We investigate the buckling under compression of a slender beam with a distributed lateral elastic support, for which there is an associated cost. For a given cost, we study the optimal choice of support to protect against Euler buckling. We show that with only weak lateral support, the optimum distribution is a delta-function at the centre of the beam. When more support is allowed, we find numerically that the optimal distribution undergoes a series of bifurcations. We obtain analytical expressions for the buckling load around the first bifurcation point and corresponding expansions for the optimal position of support. Our theoretical predictions, including the critical exponent of the bifurcation, are confirmed by computer simulations.