Source author record

Qingyun Wu

Qingyun Wu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, Unclaimed source record

cond-mat.mtrl-sci; cond-mat.mes-hall; Artificial Intelligence; Machine Learning; physics.app-ph; Distributed, Parallel, and Cluster Computing; Information Retrieval; Social and Information Networks

Catalog footprint

What is connected

13 works

8 topics

4 close collaborators

preprint, 2026, arXiv

A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence

Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse tasks but remain fundamentally static, unable to adapt their internal parameters to novel tasks, evolving knowledge domains, or dynamic interaction contexts. As LLMs are increasingly deployed in open-ended, interactive environments, this static nature has become a critical bottleneck, necessitating agents that can adaptively reason, act, and evolve in real time. This paradigm shift -- from scaling static models to developing self-evolving agents -- has sparked growing interest in architectures and methods enabling continual learning and adaptation from data, interactions, and experiences. This survey provides the first systematic and comprehensive review of self-evolving agents, organizing the field around three foundational dimensions: what, when, and how to evolve. We examine evolutionary mechanisms across agent components (e.g., models, memory, tools, architecture), categorize adaptation methods by stages (e.g., intra-test-time, inter-test-time), and analyze the algorithmic and architectural designs that guide evolutionary adaptation (e.g., scalar rewards, textual feedback, single-agent and multi-agent systems). Additionally, we analyze evaluation metrics and benchmarks tailored for self-evolving agents, highlight applications in domains such as coding, education, and healthcare, and identify critical challenges and research directions in safety, scalability, and co-evolutionary dynamics. By providing a structured framework for understanding and designing self-evolving agents, this survey establishes a roadmap for advancing more adaptive, robust, and versatile agentic systems in both research and real-world deployments, and ultimately sheds light on the realization of Artificial Super Intelligence (ASI) where agents evolve autonomously and perform beyond human-level intelligence across tasks.

preprint, 2026, arXiv

EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales

We argue that multi-agent test-time evolution is not single-agent evolution replicated N times. A single-agent learner can only evolve its own context and memory. A multi-agent system additionally evolves who collaborates, how they collaborate, and how knowledge flows across the population. These components have no single-agent counterpart and can produce phenomena such as emergent specialization. Yet prior test-time methods either confine experiences to individual agents, forfeiting cross-agent learning, or broadcast symmetrically to all agents, erasing the specialization that makes collaboration valuable. We present EVOCHAMBER, a training-free framework that instantiates test-time evolution at three levels over a coevolving agent pool. At its core is CODREAM (Collaborative Dreaming), a post-task protocol triggered on team failure or disagreement, in which agents collaboratively reflect, distill insights, and route them asymmetrically from strong to weak agents on the failed niche, preserving specialization while filling knowledge gaps. Team-level operators assemble niche-conditioned teams and select collaboration structures online. Population-level lifecycle operators fork, merge, prune, and seed agents under performance pressure. On three heterogeneous task streams with Qwen3-8B, EVOCHAMBER reaches 63.9% on competition math, 75.7% on code, and 87.1% on multi-domain reasoning, outperforming the best baseline by 32% relative on math and confirming asymmetric cross-agent transfer as the primary driver in ablation. Starting from several identically initialized agents, four to five stable niche specialists spontaneously emerge, a structural signature of multi-agent evolution that no single-agent learner can express. See our code at: https://github.com/Mercury7353/EvoChamber

preprint, 2026, arXiv

MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning

Automatic multi-agent systems (MAS) aim to instantiate agent workflows without relying on manually designed or fixed orchestration. However, existing automatic MAS approaches remain only partially adaptive: they either perform training-free test-time search or optimize the meta-level designer while keeping downstream execution agents frozen, creating a frozen-executor ceiling and leaving the end-to-end training of self-designing and self-executing agentic models unexplored. To address this, we introduce MetaAgent-X, an end-to-end reinforcement learning framework that jointly optimizes automatic MAS design and execution. MetaAgent-X enables script-based MAS generation, execution rollout collection, and credit assignment for both designer and executor trajectories. To support stable and scalable optimization, we propose Executor Designer Hierarchical Rollout and Stagewise Co-evolution to improve training stability and expose the dynamics of designer-executor co-evolution. MetaAgent-X consistently outperforms existing automatic MAS baselines, achieving up to 21.7% gains. Comprehensive ablations show that both designer and executor improve throughout training, and that effective automatic MAS learning follows a stagewise co-evolution process. These results establish end-to-end trainable automatic MAS as a practical paradigm for building self-designing and self-executing agentic models.

preprint, 2022, arXiv

Cataloguing MoSi$_2$N$_4$ and WSi$_2$N$_4$ van der Waals Heterostructures: An Exceptional Material Platform for Excitonic Solar Cell Applications

Two-dimensional (2D) material van der Waals heterostructures (vdWHs) provide a revolutionary route towards high-performance solar energy conversion devices beyond the conventional silicon-based pn junction solar cells. Despite tremendous research progress accomplished in recent years, the search for vdWHs with exceptional excitonic solar cell conversion efficiency and optical properties remains an open theoretical and experimental quest. Here we show that the vdWH family composed of MoSi$_2$N$_4$ and WSi$_2$N$_4$ monolayers provides a compelling material platform for developing high-performance ultrathin excitonic solar cells and photonic devices. Using first-principles calculations, we construct and classify 51 types of MoSi$_2$N$_4$ and WSi$_2$N$_4$-based [(Mo,W)Si$_2$N$_4$] vdWHs composed of various metallic, semimetallic, semiconducting, insulating and topological 2D materials. Intriguingly, MoSi$_2$N$_4$/(InSe, WSe$_2$) are identified as Type-II vdWHs with exceptional excitonic solar cell power conversion efficiency reaching well over 20%, which is competitive with state-of-the-art silicon solar cells. The (Mo,W)Si$_2$N$_4$ vdWH family exhibits strong optical absorption in both the visible and ultraviolet regimes. Exceedingly large peak ultraviolet absorptions over 40%, approaching the maximum absorption limit of a free-standing 2D material, can be achieved in (Mo,W)Si$_2$N$_4$/$\alpha_2$-(Mo,W)Ge$_2$P$_4$ vdWHs. Our findings unravel the enormous potential of (Mo,W)Si$_2$N$_4$ vdWHs in designing ultimately compact excitonic solar cell device technology.

preprint, 2022, arXiv

Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization

Online influence maximization aims to maximize the influence spread of a piece of content in a social network with an unknown network model by selecting a few seed nodes. Recent studies followed a non-adaptive setting, where the seed nodes are selected before the start of the diffusion process and network parameters are updated when the diffusion stops. We consider an adaptive version of the content-dependent online influence maximization problem, where the seed nodes are sequentially activated based on real-time feedback. In this paper, we formulate the problem as an infinite-horizon discounted MDP under a linear diffusion process and present a model-based reinforcement learning solution. Our algorithm maintains a network model estimate and selects seed users adaptively, exploring the social network while improving the optimal policy optimistically. We establish a $\widetilde O(\sqrt{T})$ regret bound for our algorithm. Empirical evaluations on a synthetic network demonstrate the efficiency of our algorithm.

preprint, 2022, arXiv

Tunable electronic properties and band alignments of MoSi$_2$N$_4$/GaN and MoSi$_2$N$_4$/ZnO van der Waals heterostructures

Van der Waals heterostructures (VDWH) are an emerging strategy to engineer the electronic properties of two-dimensional (2D) material systems. Motivated by the recent discovery of MoSi$_2$N$_4$, a synthetic septuple-layered 2D semiconductor with exceptional mechanical and electronic properties, we investigate the synergy of MoSi$_2$N$_4$ with wide band gap (WBG) 2D monolayers of GaN and ZnO using first-principles calculations. We find that MoSi$_2$N$_4$/GaN is a direct band gap Type-I VDWH while MoSi$_2$N$_4$/ZnO is an indirect band gap Type-II VDWH. Intriguingly, by applying an electric field or mechanical strain along the out-of-plane direction, the band structures of MoSi$_2$N$_4$/GaN and MoSi$_2$N$_4$/ZnO can be substantially modified, exhibiting rich transitional behaviors, such as the Type-I-to-Type-II band alignment and the direct-to-indirect band gap transitions. These findings reveal the potential of MoSi$_2$N$_4$-based WBG VDWH as tunable hybrid materials with enormous design flexibility in ultracompact optoelectronic applications.

preprint, 2020, arXiv

Electrical Contact between an Ultrathin Topological Dirac Semimetal and a Two-Dimensional Material

Ultrathin films of the topological Dirac semimetal Na$_3$Bi have recently been revealed as unusual electronic materials with field-tunable topological phases. Here we investigate the electronic and transport properties of ultrathin Na$_3$Bi as an electrical contact to a two-dimensional (2D) metal, i.e. graphene, and 2D semiconductors, i.e. MoS$_2$ and WS$_2$ monolayers. Using combined first-principles density functional theory and nonequilibrium Green's function simulation, we show that the electrical coupling between Na$_3$Bi bilayer thin film and graphene results in a notable interlayer charge transfer, thus inducing sizable $n$-type doping in the Na$_3$Bi/graphene heterostructures. In the case of MoS$_2$ and WS$_2$ monolayers, the lateral Schottky transport barrier is significantly lower than that of many commonly studied bulk metals, thus unraveling Na$_3$Bi bilayer as a high-efficiency electrical contact material for 2D semiconductors. These findings open up an avenue of utilizing topological semimetal thin film as electrical contact to 2D materials, and further expand the family of 2D heterostructure devices into the realm of topological materials.

preprint, 2020, arXiv

Estimation-Action-Reflection: Towards Deep Interaction Between Conversational and Recommender Systems

Recommender systems are embracing conversational technologies to obtain user preferences dynamically, and to overcome inherent limitations of their static models. A successful Conversational Recommender System (CRS) requires proper handling of interactions between conversation and recommendation. We argue that three fundamental problems need to be solved: 1) what questions to ask regarding item attributes, 2) when to recommend items, and 3) how to adapt to the users' online feedback. To the best of our knowledge, a unified framework that addresses these problems is lacking. In this work, we fill this gap by proposing a new CRS framework named Estimation-Action-Reflection, or EAR, which consists of three stages to better converse with users: (1) Estimation, which builds predictive models to estimate user preference on both items and item attributes; (2) Action, which learns a dialogue policy to determine whether to ask attributes or recommend items, based on the Estimation stage and conversation history; and (3) Reflection, which updates the recommender model when a user rejects the recommendations made by the Action stage. We present two conversation scenarios on binary and enumerated questions, and conduct extensive experiments on two datasets from Yelp and LastFM, for each scenario, respectively. Our experiments demonstrate significant improvements over the state-of-the-art method CRM [32], corresponding to fewer conversation turns and a higher level of recommendation hits.

preprint, 2020, arXiv

Fast Distributed Bandits for Online Recommendation Systems

Contextual bandit algorithms are commonly used in recommender systems, where content popularity can change rapidly. These algorithms continuously learn latent mappings between users and items, based on contexts associated with them both. Recent recommendation algorithms that learn clustering or social structures between users have exhibited higher recommendation accuracy. However, as the number of users and items in the environment increases, the time required to generate recommendations grows significantly. As a result, these algorithms cannot be deployed in practice. The state-of-the-art distributed bandit algorithm, DCCB, relies on a peer-to-peer network to share information among distributed workers. However, this approach does not scale well with the increasing number of users. Furthermore, it suffers from slow discovery of clusters, resulting in accuracy degradation. To address the above issues, this paper proposes a novel distributed bandit-based algorithm called DistCLUB. This algorithm lazily creates clusters in a distributed manner, and dramatically reduces the network data sharing requirement, achieving high scalability. Additionally, DistCLUB finds clusters much faster, achieving better accuracy than the state-of-the-art algorithm. Evaluation over both real-world benchmarks and synthetic datasets shows that DistCLUB is on average 8.87x faster than DCCB, and achieves 14.5% higher normalized prediction performance.

preprint, 2020, arXiv

Unifying Clustered and Non-stationary Bandits

Non-stationary bandits and online clustering of bandits lift the restrictive assumptions in contextual bandits and provide solutions to many important real-world scenarios. Though the essence in solving these two problems overlaps considerably, they have been studied independently. In this paper, we connect these two strands of bandit research under the notion of a test of homogeneity, which seamlessly addresses change detection for non-stationary bandits and cluster identification for online clustering of bandits in a unified solution framework. Rigorous regret analysis and extensive empirical evaluations demonstrate the value of our proposed solution, especially its flexibility in handling various environment assumptions.

preprint, 2015, arXiv

Electronic and Transport Property of Phosphorene Nanoribbon

By combining density functional theory and nonequilibrium Green's function, we study the electronic and transport properties of monolayer black phosphorus nanoribbons (PNRs). First, we investigate the band gap of PNRs and its modulation by the ribbon width and an external transverse electric field. Our calculations indicate a giant Stark effect in PNRs, which can switch on transport channels of semiconducting PNRs under low bias, inducing an insulator-metal transition. Next, we study the transport channels in PNRs via the calculations of the current density and local electron transmission pathway. In contrast to graphene and MoS$_2$ nanoribbons, the carrier transport channels under low bias are mainly located in the interior of both armchair and zigzag PNRs, and are immune to a small amount of edge defects. Lastly, a device of the PNR-based dual-gate field-effect transistor, with a high on/off ratio of $10^3$, is proposed based on the giant electric field tuning effect.

preprint, 2014, arXiv

Efficient spin injection into graphene through a tunnel barrier: overcoming the spin conductance mismatch

Employing first-principles calculations, we investigate the efficiency of spin injection from a ferromagnetic (FM) electrode (Ni) into graphene and its possible enhancement by using a barrier between the electrode and graphene. Three types of barriers, h-BN, Cu(111), and graphite, of various thicknesses (0-3 layers) are considered and the electrically biased conductance of the Ni/Barrier/Graphene junction is calculated. It is found that the minority spin transport channel of graphene can be strongly suppressed by the insulating h-BN barrier, resulting in a high spin injection efficiency. On the other hand, the calculated spin injection efficiencies of Ni/Cu/Graphene and Ni/Graphite/Graphene junctions are low, due to the spin conductance mismatch. Further examination of the electronic structure of the system reveals that the high spin injection efficiency in the presence of a tunnel barrier is due to its asymmetric effects on the two spin states of graphene.

preprint, 2013, arXiv

Thermal Stability and Electrical Control of Magnetization of Heusler/Oxide Interface and Non-collinear Spin Transport of Its Junction

Towards next-generation spintronics devices, such as computer memories and logic chips, it is necessary to satisfy high thermal stability, low-power consumption and high spin-polarization simultaneously. Here, from first-principles, we investigate thermal stability (both structure and magnetization) and the electric field control of magnetic anisotropy on Co$_2$FeAl (CFA)/MgO. A phase diagram of structural thermal stability of the CFA/MgO interface is illustrated. An interfacial perpendicular-anisotropy, coming from the Fe-O orbital hybridization, provides high magnetic thermal stability and a low stray field. We find an electric-field-induced giant modification of such perpendicular-anisotropy via a great magnetoelectric effect (the anisotropy energy coefficient $\beta \sim 10^{-7}$ erg/V cm). Our spin electronic-structure and non-collinear transport calculations indicate high spin-polarized interfacial states and good magnetoresistance properties of CFA/MgO/CFA perpendicular magnetic tunnel junctions.
