Source author record

Yuhui Wang

Yuhui Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Topics: Artificial Intelligence; Computation and Language; Machine Learning; Computational Engineering, Finance, and Science; eess.SY; Multiagent Systems; Systems and Control; Computer Science and Game Theory; Computer Vision; Cryptography and Security; Digital Libraries; Distributed, Parallel, and Cluster Computing; Information Retrieval; Networking and Internet Architecture; physics.app-ph; physics.optics

Catalog footprint

15 works, 16 topics, 4 close collaborators

preprint, 2026, arXiv

LLMEval-Logic: A Solver-Verified Chinese Benchmark for Logical Reasoning of LLMs with Adversarial Hardening

Evaluating large language models (LLMs) on natural-language logical reasoning is essential because rule-governed tasks require conclusions to follow strictly from stated premises. Many existing logical-reasoning benchmarks are generated by templating natural-language items from sampled formulas, provide only coarse or unaudited formal annotations, and are now quickly saturated by frontier reasoning models. We present LLMEval-Logic, a Chinese logical reasoning benchmark built from realistic situational scenarios. Its pipeline forward-authors and expert-audits natural-language items together with their reference formalizations, verifies annotated answers with Z3, constructs expert rubrics for natural-to-formal grading, and hardens selected items through a closed-loop adversarial workflow. The benchmark is released in two paired subsets: a 246-item Base subset shipped with 1,400 expert-developed rubric atoms, and a 190-item Hard subset with 938 multi-step sub-questions over closed model spaces. Evaluating 14 frontier LLMs on LLMEval-Logic reveals substantial gaps in current models: the best model reaches only 37.5% Hard Item Accuracy, and even with reference symbols the highest joint Z3+Rubric formalization score among evaluated models reaches only 60.16%. Our benchmark is publicly available at https://github.com/llmeval/LLMEval-Logic.
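The answer-verification step mentioned in this abstract can be illustrated with a minimal sketch. The abstract says annotated answers are verified with Z3; the sketch below swaps Z3 for brute-force truth-table enumeration so that it runs with the Python standard library alone. The function name `entails`, the symbol names, and the example premises are hypothetical and are not taken from the benchmark.

```python
from itertools import product
from typing import Callable, Dict, List

Formula = Callable[[Dict[str, bool]], bool]

def entails(premises: List[Formula], conclusion: Formula, symbols: List[str]) -> bool:
    """Return True if every assignment satisfying all premises also satisfies the conclusion.

    Stand-in for a solver call (assumption: propositional logic only).
    Enumerates all 2^n truth assignments, so it is only practical for small n.
    """
    for values in product([False, True], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if all(p(model) for p in premises) and not conclusion(model):
            return False  # found a counterexample model
    return True

# Hypothetical item: "If it rains, the ground is wet. It rains."
symbols = ["rain", "wet"]
premises = [
    lambda m: (not m["rain"]) or m["wet"],  # rain -> wet
    lambda m: m["rain"],                    # rain
]
answer_follows = entails(premises, lambda m: m["wet"], symbols)
negation_follows = entails(premises, lambda m: not m["wet"], symbols)
```

An annotated answer would count as verified here when the conclusion is entailed and its negation is not. A real solver handles richer logics and far larger symbol sets than this enumeration can.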

preprint, 2026, arXiv

MAGE: Safeguarding LLM Agents against Long-Horizon Threats via Shadow Memory

As large language model (LLM)-powered agents are increasingly deployed to perform complex, real-world tasks, they face a growing class of attacks that exploit extended user-agent-environment interactions to pursue malicious objectives improbable in single-turn settings. Such long-horizon threats pose significant risks to the safe deployment of LLM agents in critical domains. In this paper, we present MAGE (Memory As Guardrail Enforcement), a novel defensive framework designed to counter a wide range of long-horizon threats. Inspired by the "shadow stack" abstraction in systems security, MAGE maintains a dedicated, safety-focused agentic memory that distills and retains safety-critical context across the agent's full execution trajectory, leveraging this shadow memory to proactively assess the risk of pending actions prior to their execution. Extensive evaluation demonstrates that MAGE substantially outperforms existing defenses across diverse long-horizon threats in detection accuracy, achieves early-stage detection for the majority of attacks, and introduces only negligible overhead to agent utility. To the best of our knowledge, MAGE represents the first framework to detect and mitigate long-horizon threats using an agentic memory approach, establishing a new paradigm for this critical challenge and opening promising directions for future research.

preprint, 2026, arXiv

OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment

Evaluating novelty is critical yet challenging in peer review, as reviewers must assess submissions against a vast, rapidly evolving literature. This report presents OpenNovelty, an LLM-powered agentic system for transparent, evidence-based novelty analysis. The system operates through four phases: (1) extracting the core task and contribution claims to generate retrieval queries; (2) retrieving relevant prior work based on the extracted queries via a semantic search engine; (3) constructing a hierarchical taxonomy of core-task-related work and performing contribution-level full-text comparisons against each contribution; and (4) synthesizing all analyses into a structured novelty report with explicit citations and evidence snippets. Unlike naive LLM-based approaches, OpenNovelty grounds all assessments in retrieved real papers, ensuring verifiable judgments. We deploy our system on 500+ ICLR 2026 submissions with all reports publicly available on our website, and preliminary analysis suggests it can identify relevant prior work, including closely related papers that authors may overlook. OpenNovelty aims to empower the research community with a scalable tool that promotes fair, consistent, and evidence-backed peer review.

preprint, 2026, arXiv

Wearable-informed generative digital avatars predict task-conditioned post-stroke locomotion

Dynamic prediction of locomotor capacity after stroke could enable more individualized rehabilitation, yet current assessments largely provide static impairment scores and do not indicate whether patients can perform specific tasks such as slope walking or stair climbing. Here, we present a wearable-informed data-physics hybrid generative framework that reconstructs a stroke survivor's locomotor control from wearable inertial sensing and predicts task-conditioned post-stroke locomotion in new environments. From a single 20 m level-ground walking trial recorded by five IMUs, the framework personalizes a physics-based digital avatar using a healthy-motion prior and hybrid imitation learning, generating dynamically feasible, patient-specific movements for inclined walking and stair negotiation. Across 11 stroke inpatients, predicted postures reached 82.2% similarity for slopes and 69.9% for stairs, substantially exceeding a physics-only baseline. In a multicentre pilot randomized study (n = 21; 28 days), access to scenario-specific locomotion predictions to support task selection and difficulty titration was associated with larger gains in Fugl-Meyer lower-extremity scores than standard care (mean change 6.0 vs 3.7 points; $p < 0.05$). These results suggest that wearable-informed generative digital avatars may augment individualized gait rehabilitation planning and provide a pathway toward dynamically personalized post-stroke motor recovery strategies.

preprint, 2022, arXiv

A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising

In online advertising, auto-bidding has become an essential tool for advertisers to optimize their preferred ad performance metrics by simply expressing high-level campaign objectives and constraints. Previous works designed auto-bidding tools from a single-agent view, without modeling the mutual influence between agents. In this paper, we instead consider this problem from a distributed multi-agent perspective, and propose a general Multi-Agent reinforcement learning framework for Auto-Bidding, namely MAAB, to learn the auto-bidding strategies. First, we investigate the competition and cooperation relations among auto-bidding agents, and propose a temperature-regularized credit assignment to establish a mixed cooperative-competitive paradigm. By carefully making a competition and cooperation trade-off among agents, we can reach an equilibrium state that guarantees not only each individual advertiser's utility but also the system performance (i.e., social welfare). Second, to avoid the potential collusion behaviors of bidding low prices underlying the cooperation, we further propose bar agents to set a personalized bidding bar for each agent, and then alleviate the revenue degradation due to the cooperation. Third, to deploy MAAB in a large-scale advertising system with millions of advertisers, we propose a mean-field approach. By grouping advertisers with the same objective as a mean auto-bidding agent, the interactions among the large-scale advertisers are greatly simplified, making it practical to train MAAB efficiently. Extensive experiments on an offline industrial dataset and the Alibaba advertising platform demonstrate that our approach outperforms several baseline methods in terms of social welfare and revenue.

preprint, 2022, arXiv

A SBP-SAT FDTD Subgridding Method Using Staggered Yee's Grids Without Modifying Field Components

This paper proposes a summation-by-parts simultaneous approximation term (SBP-SAT) finite-difference time-domain (FDTD) subgridding method to model geometrically fine structures. Compared with our previous work, the proposed SBP-SAT FDTD method uses the staggered Yee's grid without adding or modifying any field components, relying on field extrapolation at the boundaries to make the discrete operators satisfy the SBP property. The accuracy of the extrapolation is consistent with that of the second-order finite-difference scheme near the boundaries. In addition, the SATs are used to weakly enforce the tangential boundary conditions between multiple mesh blocks with different mesh sizes. With carefully designed interpolation matrices and selected free parameters of the SATs, no dissipation occurs in the whole computational domain. Therefore, its long-time stability is theoretically guaranteed. Three numerical examples are carried out to validate its effectiveness. Results show that the proposed SBP-SAT FDTD subgridding method is stable, accurate, efficient, and easy to implement based on existing FDTD codes with only a few modifications.

preprint, 2022, arXiv

An Intellectual Property Entity Recognition Method Based on Transformer and Technological Word Information

Patent texts contain a large amount of entity information. Through named entity recognition, intellectual property entities carrying key information can be extracted from them, helping researchers understand patent content faster. However, it is difficult for existing named entity extraction methods to make full use of the word-level semantic information brought about by changes in professional vocabulary. This paper proposes a method for extracting intellectual property entities based on Transformer and technical word information, and provides accurate word vector representations in combination with the BERT language model. In the process of word vector generation, the technical word information extracted by IDCNN is added to improve the representation of intellectual property entities. Finally, a Transformer encoder with relative position encoding is used to learn the deep semantic information of the text from the sequence of word vectors and to predict entity labels. Experimental results on public datasets and annotated patent datasets show that the method improves the accuracy of entity recognition.

preprint, 2022, arXiv

Proactive and Resilient UAV Orchestration for QoS Driven Connectivity and Coverage of Ground Users

Unmanned aerial vehicles (UAVs) are being successfully used to deliver communication services in applications such as extending the coverage of 5G cellular networks in remote areas, emergency situations, and enhancing the service quality in regions of dense user populations. While optimized placement solutions have been proposed in the literature for ensuring quality of service (QoS) of users, they may not be ideal in highly mobile, autonomous, and diverse network scenarios. This paper proposes a proactive and resilient framework for distributed and dynamic orchestration of UAV small cells to provide QoS differentiation in the network. The UAV locations are tailored to end-user locations and service needs while ensuring that UAVs maintain localized backhaul connectivity. Simulation experiments show that under scenarios where physical placement can achieve service differentiation, the developed framework leads to stable configurations of UAVs satisfying above 90% of user QoS requirements.

preprint, 2022, arXiv

Research on Intellectual Property Resource Profile and Evolution Law

In the era of big data, intellectual property-oriented scientific and technological resources show a trend of large data scale, high information density and low value density, which brings severe challenges to the effective use of intellectual property resources, and the demand for mining hidden information in intellectual property is increasing. This has made intellectual property-oriented science and technology resource portraits and evolution analysis a current research hotspot. This paper surveys the construction methods for intellectual property resource portraits and their preliminary work, property entity extraction and entity completion, in terms of algorithm classification and general process, and outlines directions for improving future methods.

preprint, 2022, arXiv

Resilient UAV Formation for Coverage and Connectivity of Spatially Dispersed Users

Unmanned aerial vehicles (UAVs) are a convenient choice for carrying mobile base stations to rapidly set up communication services for ground users. Unlike terrestrial networks, UAVs do not have fiber optic backhaul connectivity except when they are tethered to the ground, which restricts their mobility. In the absence of backhaul, e.g., in remote areas, emergency situations, or in battlefields, there is a need to ensure connectivity among UAVs in addition to coverage of ground users for creating local area networks. This paper provides a distributed and dynamic approach for UAV formation-based control for coverage and connectivity of spatially dispersed users. We use flocking dynamics as a guide to constructing tailored formations of UAVs on the fly. Simulation results demonstrate that if sufficient aerial base stations are available, the proposed approach results in a strongly connected network of UAVs that is able to provide both a backhaul and fronthaul network. The approach can be further extended to create multi-tier extra-terrestrial networks to cater for large-scale applications.

preprint, 2020, arXiv

SMIX($λ$): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning

Learning a stable and generalizable centralized value function (CVF) is a crucial but challenging task in multi-agent reinforcement learning (MARL), as it has to deal with the issue that the joint action space increases exponentially with the number of agents in such scenarios. This paper proposes an approach, named SMIX($λ$), to address the issue using an efficient off-policy centralized training method within a flexible learner search space. As importance sampling for such off-policy training is both computationally costly and numerically unstable, we propose to use the $λ$-return as a proxy to compute the TD error. With this new loss function objective, we adopt a modified QMIX network structure as the base to train our model. By further connecting it with the $Q(λ)$ approach from a unified expectation correction viewpoint, we show that the proposed SMIX($λ$) is equivalent to $Q(λ)$ and hence shares its convergence properties, without suffering from the aforementioned curse of dimensionality problem inherent in MARL. Experiments on the StarCraft Multi-Agent Challenge (SMAC) benchmark demonstrate that our approach not only outperforms several state-of-the-art MARL methods by a large margin, but also can be used as a general tool to improve the overall performance of other CTDE-type algorithms by enhancing their CVFs.
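The λ-return named in this abstract can be computed with a short backward recursion. The sketch below is a generic textbook form, not necessarily the exact estimator used in SMIX($λ$). It assumes a single finite trajectory in which `next_values[t]` holds a bootstrap value estimate for the state reached after step `t`; the function and parameter names are illustrative.

```python
from typing import List

def lambda_returns(rewards: List[float], next_values: List[float],
                   gamma: float, lam: float) -> List[float]:
    """Compute lambda-returns for one trajectory.

    Recursion: G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}),
    with G_T taken to be V(s_T), the last bootstrap value.
    """
    if len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must have the same length")
    returns = [0.0] * len(rewards)
    # Beyond the final step we bootstrap from the last value estimate.
    next_return = next_values[-1] if rewards else 0.0
    for t in reversed(range(len(rewards))):
        next_return = rewards[t] + gamma * (
            (1.0 - lam) * next_values[t] + lam * next_return
        )
        returns[t] = next_return
    return returns

def td_errors(targets: List[float], q_values: List[float]) -> List[float]:
    """TD error with the lambda-return as the target: delta_t = G_t - Q_t."""
    return [g - q for g, q in zip(targets, q_values)]
```

With `lam = 0` the target reduces to the one-step TD target, and with `lam = 1` and a zero terminal value it reduces to the Monte Carlo return. Intermediate values trade bias against variance.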

preprint, 2020, arXiv

The Limit of the Batch Size

Large-batch training is an efficient approach for current distributed deep learning systems. It has enabled researchers to reduce the ImageNet/ResNet-50 training from 29 hours to around 1 minute. In this paper, we focus on studying the limit of the batch size. We think it may provide guidance to AI supercomputer and algorithm designers. We provide detailed numerical optimization instructions for step-by-step comparison. Moreover, it is important to understand the generalization and optimization performance of huge batch training. Hoffer et al. introduced "ultra-slow diffusion" theory to large-batch training. However, our experiments contradict the conclusion of Hoffer et al. We provide comprehensive experimental results and detailed analysis to study the limitations of batch size scaling and "ultra-slow diffusion" theory. For the first time we scale the batch size on ImageNet to at least an order of magnitude larger than all previous work, and provide detailed studies on the performance of many state-of-the-art optimization schemes under this setting. We propose an optimization recipe that is able to improve the top-1 test accuracy by 18% compared to the baseline.

preprint, 2020, arXiv

Truly Proximal Policy Optimization

Proximal policy optimization (PPO) is one of the most successful deep reinforcement-learning methods, achieving state-of-the-art performance across a wide range of challenging tasks. However, its optimization behavior is still far from being fully understood. In this paper, we show that PPO could neither strictly restrict the likelihood ratio as it attempts to do nor enforce a well-defined trust region constraint, which means that it may still suffer from the risk of performance instability. To address this issue, we present an enhanced PPO method, named Truly PPO. Two critical improvements are made in our method: 1) it adopts a new clipping function to support a rollback behavior to restrict the difference between the new policy and the old one; 2) the triggering condition for clipping is replaced with a trust region-based one, such that optimizing the resulting surrogate objective function provides guaranteed monotonic improvement of the ultimate policy performance. It seems that, by adhering more truly to making the algorithm proximal, that is, confining the policy within the trust region, the new algorithm improves on the original PPO in both sample efficiency and performance.
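The clipping behavior discussed in this abstract can be illustrated per sample. In the sketch below, `ppo_clip_term` is the standard PPO clipped surrogate. `rollback_term` is only an assumed illustration of what a rollback clipping function could look like: the abstract does not give its form, so the negative slope `alpha` and the piecewise-linear shape are hypothetical choices, not the authors' confirmed definition.

```python
def ppo_clip_term(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Standard PPO per-sample surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)

def rollback_term(ratio: float, advantage: float,
                  eps: float = 0.2, alpha: float = 0.3) -> float:
    """Hypothetical rollback variant (assumption, not taken from the abstract).

    Outside [1 - eps, 1 + eps] the ratio is replaced by a line with slope
    -alpha that is continuous at the clip boundary, so the objective falls
    as the ratio moves further out instead of going flat.
    """
    if ratio > 1.0 + eps:
        shaped = -alpha * ratio + (1.0 + alpha) * (1.0 + eps)
    elif ratio < 1.0 - eps:
        shaped = -alpha * ratio + (1.0 + alpha) * (1.0 - eps)
    else:
        shaped = ratio
    return min(ratio * advantage, shaped * advantage)

# With a positive advantage, standard clipping is flat beyond 1 + eps,
# so the gradient gives no incentive to pull the ratio back.
flat_a = ppo_clip_term(1.5, 1.0)
flat_b = ppo_clip_term(2.0, 1.0)
# The assumed rollback form keeps decreasing as the ratio grows.
rb_a = rollback_term(1.5, 1.0)
rb_b = rollback_term(2.0, 1.0)
```

The flat region of the standard objective is one way to see why clipping alone does not strictly bound the likelihood ratio: once a sample is clipped, it no longer contributes a gradient that would pull the policy back.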

preprint, 2020, arXiv

Z_2 Photonic topological insulators in the visible wavelength range for robust nanoscale photonics

Topological photonics provides an ideal platform for demonstrating novel band topology concepts, which are also promising for robust waveguiding, communication and computation applications. However, many challenges, such as extremely large device footprint and functionality at short wavelengths, must still be solved to make practical and useful devices that can also couple to electronic excitations in many important organic and inorganic semiconductors. In this letter, we report an experimental realization of Z_2 photonic topological insulators with their topological edge state energies spanning the visible wavelength range, including the sub-500 nm regime. The photonic structures are based on deformed hexagonal lattices with preserved six-fold rotational symmetry patterned on suspended SiNx membranes. The experimentally measured energy-momentum dispersion of the topological lattices directly shows topological band inversion by the swapping of the brightness of the bulk energy bands, and also the helical edge states when the measurement is taken near the topological interface. The robust topological transport of the helical edge modes in real space is demonstrated by successfully guiding circularly polarized light beams unidirectionally through sharp kinks without major signal loss. This work paves the way for small footprint photonic topological devices working in the short wavelength range that can also be utilized to couple to excitons for unconventional light-matter interactions at the nanoscale.

preprint, 2012, arXiv

Tacit knowledge mining algorithm based on linguistic truth-valued concept lattice

This paper is a continuation of our research on linguistic truth-valued concept lattices. In order to provide a mathematical tool for mining tacit knowledge, we establish a concrete model of a 6-ary linguistic truth-valued concept lattice and introduce a mining algorithm based on structure consistency. Specifically, we use attributes to depict knowledge, propose the 6-ary linguistic truth-valued attribute extended context and congener context to characterize tacit knowledge, and investigate the necessary and sufficient conditions for forming tacit knowledge. We give algorithms for generating the linguistic truth-valued congener context and for constructing the linguistic truth-valued concept lattice.
