Researcher profile

Yu Lei

Yu Lei contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
7 topics
4 close collaborators

Published work

7 published items

preprint, 2026, arXiv

From Context to Skills: Can Language Models Learn from Context Skillfully?

Many real-world tasks require language models (LMs) to reason over complex contexts that exceed their parametric knowledge. This calls for context learning, where LMs directly learn relevant knowledge from the given context. An intuitive solution is inference-time skill augmentation: extracting the rules and procedures from context into natural-language skills. However, constructing such skills for context learning scenarios faces two challenges: the prohibitive cost of manual skill annotation for long, technically dense contexts, and the lack of external feedback for automated skill construction. In this paper, we propose Ctx2Skill, a self-evolving framework that autonomously discovers, refines, and selects context-specific skills without human supervision or external feedback. At its core, a multi-agent self-play loop has a Challenger that generates probing tasks and rubrics, a Reasoner that attempts to solve them guided by an evolving skill set, and a neutral Judge that provides binary feedback. Crucially, both the Challenger and the Reasoner evolve through accumulated skills: dedicated Proposer and Generator agents analyze failure cases and synthesize them into targeted skill updates for both sides, enabling automated skill discovery and refinement. To prevent adversarial collapse caused by increasingly extreme task generation and over-specialized skill accumulation, we further introduce a Cross-time Replay mechanism that identifies the skill set achieving the best balance across representative cases for the Reasoner side, ensuring robust and generalizable skill evolution. The resulting skills can be plugged into any language model to obtain better context learning capability. Evaluated on four context learning tasks from CL-bench, Ctx2Skill consistently improves solving rates across backbone models.

preprint, 2023, arXiv

Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles

Reinforcement learning from human feedback (RLHF) emerges as a promising paradigm for aligning large language models (LLMs). However, a notable challenge in RLHF is overoptimization, where beyond a certain threshold, the pursuit of higher rewards leads to a decline in human preferences. In this paper, we observe the weakness of KL regularization which is commonly employed in existing RLHF methods to address overoptimization. To mitigate this limitation, we scrutinize the RLHF objective in the offline dataset and propose uncertainty-penalized RLHF (UP-RLHF), which incorporates uncertainty regularization during RL-finetuning. To enhance the uncertainty quantification abilities for reward models, we first propose a diverse low-rank adaptation (LoRA) ensemble by maximizing the nuclear norm of LoRA matrix concatenations. Then we optimize policy models utilizing penalized rewards, determined by both rewards and uncertainties provided by the diverse reward LoRA ensembles. Our experimental results, based on two real human preference datasets, showcase the effectiveness of diverse reward LoRA ensembles in quantifying reward uncertainty. Additionally, uncertainty regularization in UP-RLHF proves to be pivotal in mitigating overoptimization, thereby contributing to the overall performance.
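The abstract says the policy is optimized with penalized rewards determined by both the rewards and the uncertainties from the reward ensemble, but it does not give the formula. A minimal sketch, assuming the penalized reward is the ensemble mean minus a weighted ensemble standard deviation; the function name `penalized_reward` and the parameter `penalty_weight` are hypothetical and not taken from the paper:

```python
from statistics import mean, pstdev


def penalized_reward(ensemble_rewards, penalty_weight=1.0):
    """Combine per-member rewards for one response into a single penalized reward.

    ASSUMPTION: uncertainty is the population standard deviation across
    ensemble members, and the penalty is linear in it. The paper may use
    a different uncertainty estimate or combination rule.
    """
    reward = mean(ensemble_rewards)          # ensemble-average reward
    uncertainty = pstdev(ensemble_rewards)   # disagreement among members
    return reward - penalty_weight * uncertainty


# Members that agree incur no penalty; members that disagree are penalized.
agreeing = penalized_reward([1.0, 1.0, 1.0])
disagreeing = penalized_reward([0.0, 2.0], penalty_weight=0.5)
```

Under this assumption, a response that only some ensemble members score highly ends up with a lower training reward than one all members agree on, which is the intuition behind using uncertainty to discourage overoptimization.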

preprint, 2021, arXiv

Capsule Graph Neural Networks with EM Routing

To effectively classify graph instances, graph neural networks need to capture the part-whole relationships that exist in a graph. A capsule is a group of neurons representing complicated properties of entities, and it has shown its advantages in traditional convolutional neural networks. This paper proposes novel Capsule Graph Neural Networks that use the EM routing mechanism (CapsGNNEM) to generate high-quality graph embeddings. Experimental results on a number of real-world graph datasets demonstrate that the proposed CapsGNNEM outperforms nine state-of-the-art models in graph classification tasks.

preprint, 2020, arXiv

A 20-Gbps Beam-steered Infrared Wireless Link Enabled by a Passively Field-programmable Metasurface

Beam steering is one of the main challenges in energy-efficient and high-speed infrared light communication. To date, active beam-steering schemes based on a spatial light modulator (SLM) or micro-electro-mechanical system (MEMS) mirror, as well as passive ones based on diffractive gratings, have been demonstrated for infrared light communication. Here, for the first time to our knowledge, an infrared beam is steered by 35° on one side using a passively field-programmable metasurface. By combining the centralized control of wavelength and polarization, a remote passive metasurface can steer the infrared beam in a remote access point. The proposed system keeps scalability to support multiple beams, flexibility to steer the beam, high optical efficiency, simple and cheap devices on remote sides, and centralized control (low maintenance cost), while it avoids disadvantages such as grating loss, a small coverage area, and a bulky size. Based on the proposed beam-steering technology, we also demonstrated a proof-of-concept experiment system with a data rate of 20 Gbps.

preprint, 2020, arXiv

Geom-GCN: Geometric Graph Convolutional Networks

Message-passing neural networks (MPNNs) have been successfully applied to representation learning on graphs in a variety of real-world applications. However, two fundamental weaknesses of MPNNs' aggregators limit their ability to represent graph-structured data: losing the structural information of nodes in neighborhoods and lacking the ability to capture long-range dependencies in disassortative graphs. Few studies have noticed the weaknesses from different perspectives. Drawing on observations from classical neural networks and network geometry, we propose a novel geometric aggregation scheme for graph neural networks to overcome the two weaknesses. The basic idea is that aggregation on a graph can benefit from a continuous space underlying the graph. The proposed aggregation scheme is permutation-invariant and consists of three modules: node embedding, structural neighborhood, and bi-level aggregation. We also present an implementation of the scheme in graph convolutional networks, termed Geom-GCN (Geometric Graph Convolutional Networks), to perform transductive learning on graphs. Experimental results show that the proposed Geom-GCN achieves state-of-the-art performance on a wide range of open datasets of graphs. Code is available at https://github.com/graphdml-uiuc-jlu/geom-gcn.
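The abstract names a permutation-invariant, bi-level aggregation but does not spell out its steps. A rough sketch of one way such a scheme can work, assuming neighbors are first averaged within groups that share a geometric relation and the group results are then concatenated in a fixed relation order; the function name, the relation labels, and the use of a plain mean are all illustrative assumptions, not the authors' implementation:

```python
from collections import defaultdict


def bi_level_aggregate(neighbors, relations, dim):
    """Aggregate neighbor features in two levels.

    neighbors: list of (relation_label, feature_vector) pairs.
    relations: fixed, ordered list of all relation labels.
    dim:       length of every feature vector.

    ASSUMPTION: low level = mean over neighbors sharing a relation,
    high level = concatenation in the fixed order of `relations`.
    """
    groups = defaultdict(list)
    for relation, features in neighbors:
        groups[relation].append(features)

    output = []
    for relation in relations:  # fixed order, independent of input order
        members = groups.get(relation, [])
        if members:
            # Low level: the mean does not depend on neighbor order.
            output.extend(
                sum(vec[i] for vec in members) / len(members) for i in range(dim)
            )
        else:
            # Relations with no neighbors contribute a zero block.
            output.extend([0.0] * dim)
    return output


relations = ["upper-left", "lower-right", "upper-right"]  # hypothetical labels
neighbors = [
    ("upper-left", [1.0, 3.0]),
    ("upper-left", [3.0, 5.0]),
    ("lower-right", [2.0, 2.0]),
]
aggregated = bi_level_aggregate(neighbors, relations, dim=2)
```

Because each group is reduced with an order-independent mean and the groups are concatenated in a fixed order, reordering the neighbor list leaves the result unchanged, while neighbors in different geometric relations stay in separate slots rather than being blended together.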

preprint, 2020, arXiv

Monolayer Vanadium-doped Tungsten Disulfide: A Room-Temperature Dilute Magnetic Semiconductor

Dilute magnetic semiconductors, achieved through substitutional doping of spin-polarized transition metals into semiconducting systems, enable experimental modulation of spin dynamics in ways that hold great promise for novel magneto-electric or magneto-optical devices, especially for two-dimensional systems such as transition metal dichalcogenides that accentuate interactions and activate valley degrees of freedom. Practical applications of 2D magnetism will likely require room-temperature operation, air stability, and (for magnetic semiconductors) the ability to achieve optimal doping levels without dopant aggregation. Here we describe room-temperature ferromagnetic order obtained in semiconducting vanadium-doped tungsten disulfide monolayers produced by a reliable single-step film sulfidation method across an exceptionally wide range of vanadium concentrations, up to 12 at% with minimal dopant aggregation. These monolayers develop p-type transport as a function of vanadium incorporation and rapidly reach ambipolarity. Ferromagnetism peaks at an intermediate vanadium concentration of a few atomic percent and decreases for higher concentrations, which is consistent with quenching due to orbital hybridization at closer vanadium-vanadium spacings, as supported by transmission electron microscopy, magnetometry and first-principles calculations. Room-temperature two-dimensional dilute magnetic semiconductors provide a new component to expand the functional scope of van der Waals heterostructures and bring semiconducting magnetic 2D heterostructures into the realm of practical application.

preprint, 2020, arXiv

Satellite-based estimates of decline and rebound in China's CO$_2$ emissions during COVID-19 pandemic

Changes in CO$_2$ emissions during the COVID-19 pandemic have been estimated from indicators on activities like transportation and electricity generation. Here, we instead use satellite observations together with bottom-up information to track the daily dynamics of CO$_2$ emissions during the pandemic. Unlike activity data, our observation-based analysis can be independently evaluated and can provide more detailed insights into spatially-explicit changes. Specifically, we use TROPOMI observations of NO$_2$ to deduce ten-day moving averages of NO$_x$ and CO$_2$ emissions over China, differentiating emissions by sector and province. Between January and April 2020, China's CO$_2$ emissions fell by 11.5% compared to the same period in 2019, but emissions have since rebounded to pre-pandemic levels owing to the fast economic recovery in provinces where industrial activity is concentrated.