Researcher profile

Wenjie Lu

Wenjie Lu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2026arXiv

D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents

Graphical User Interface (GUI) agents aim to automate a wide spectrum of human tasks by emulating user interaction. Despite rapid advancements, current approaches are hindered by several critical challenges: data bottleneck in end-to-end training, high cost of delayed error detection, and risk of contradictory guidance. Inspired by the human cognitive loop of Thinking, Alignment, and Reflection, we present D-Artemis -- a novel deliberative framework in this paper. D-Artemis leverages a fine-grained, app-specific tip retrieval mechanism to inform its decision-making process. It also employs a proactive Pre-execution Alignment stage, where Thought-Action Consistency (TAC) Check module and Action Correction Agent (ACA) work in concert to mitigate the risk of execution failures. A post-execution Status Reflection Agent (SRA) completes the cognitive loop, enabling strategic learning from experience. Crucially, D-Artemis enhances the capabilities of general-purpose Multimodal large language models (MLLMs) for GUI tasks without the need for training on complex trajectory datasets, demonstrating strong generalization. D-Artemis establishes new state-of-the-art (SOTA) results across both major benchmarks, achieving a 75.8% success rate on AndroidWorld and 96.8% on ScreenSpot-V2. Extensive ablation studies further demonstrate the significant contribution of each component to the framework.

preprint2022arXiv

Non-cascaded Control Barrier Functions for the Safe Control of Quadrotors

Researchers have developed various cascaded controllers and non-cascaded controllers for the navigation and control of quadrotors in recent years. It is vital to ensure the safety of a quadrotor both in normal state and in abnormal state if a controller tends to make the quadrotor unsafe. To this end, this paper proposes a non-cascaded Control Barrier Function (CBF) for a quadrotor controlled by either cascaded controllers or a non-cascaded controller. Incorporated with a Quadratic Programming (QP), the non-cascaded CBF can simultaneously regulate the magnitude of the total thrust and the torque of the quadrotor determined a controller, so as to ensure the safety of the quadrotor both in normal state and in abnormal state. The non-cascaded CBF establishes a non-conservative forward invariant safe region, in which the controller of a quadrotor is fully or partially effective in the navigation or the pose control of the quadrotor. The non-cascaded CBF is applied to a quadrotor performing trajectory tracking and a quadrotor performing aggressive roll maneuvers in simulations to evaluate the effectiveness of the non-cascaded CBF.

preprint2022arXiv

Safe Reinforcement Learning for a Robot Being Pursued but with Objectives Covering More Than Capture-avoidance

Reinforcement Learning (RL) algorithms show amazing performance in recent years, but placing RL in real-world applications such as self-driven vehicles may suffer safety problems. A self-driven vehicle moving to a target position following a learned policy may suffer a vehicle with unpredictable aggressive behaviors or even being pursued by a vehicle following a Nash strategy. To address the safety issue of the self-driven vehicle in this scenario, this paper conducts a preliminary study based on a system of robots. A safe RL framework with safety guarantees is developed for a robot being pursued but with objectives covering more than capture-avoidance. Simulations and experiments are conducted based on the system of robots to evaluate the effectiveness of the developed safe RL framework.

preprint2020arXiv

DOB-Net: Actively Rejecting Unknown Excessive Time-Varying Disturbances

This paper presents an observer-integrated Reinforcement Learning (RL) approach, called Disturbance OBserver Network (DOB-Net), for robots operating in environments where disturbances are unknown and time-varying, and may frequently exceed robot control capabilities. The DOB-Net integrates a disturbance dynamics observer network and a controller network. Originated from conventional DOB mechanisms, the observer is built and enhanced via Recurrent Neural Networks (RNNs), encoding estimation of past values and prediction of future values of unknown disturbances in RNN hidden state. Such encoding allows the controller generate optimal control signals to actively reject disturbances, under the constraints of robot control capabilities. The observer and the controller are jointly learned within policy optimization by advantage actor critic. Numerical simulations on position regulation tasks have demonstrated that the proposed DOB-Net significantly outperforms a conventional feedback controller and classical RL algorithms.

preprint2020arXiv

Modular Transfer Learning with Transition Mismatch Compensation for Excessive Disturbance Rejection

Underwater robots in shallow waters usually suffer from strong wave forces, which may frequently exceed robot's control constraints. Learning-based controllers are suitable for disturbance rejection control, but the excessive disturbances heavily affect the state transition in Markov Decision Process (MDP) or Partially Observable Markov Decision Process (POMDP). Also, pure learning procedures on targeted system may encounter damaging exploratory actions or unpredictable system variations, and training exclusively on a prior model usually cannot address model mismatch from the targeted system. In this paper, we propose a transfer learning framework that adapts a control policy for excessive disturbance rejection of an underwater robot under dynamics model mismatch. A modular network of learning policies is applied, composed of a Generalized Control Policy (GCP) and an Online Disturbance Identification Model (ODI). GCP is first trained over a wide array of disturbance waveforms. ODI then learns to use past states and actions of the system to predict the disturbance waveforms which are provided as input to GCP (along with the system state). A transfer reinforcement learning algorithm using Transition Mismatch Compensation (TMC) is developed based on the modular architecture, that learns an additional compensatory policy through minimizing mismatch of transitions predicted by the two dynamics models of the source and target tasks. We demonstrated on a pose regulation task in simulation that TMC is able to successfully reject the disturbances and stabilize the robot under an empirical model of the robot system, meanwhile improve sample efficiency.