Researcher profile

Zhenbo Lu

Zhenbo Lu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2022arXiv

Coordinate-Aligned Multi-Camera Collaboration for Active Multi-Object Tracking

Active Multi-Object Tracking (AMOT) is a task where cameras are controlled by a centralized system to adjust their poses automatically and collaboratively so as to maximize the coverage of targets in their shared visual field. In AMOT, each camera only receives partial information from its observation, which may mislead cameras to take locally optimal action. Besides, the global goal, i.e., maximum coverage of objects, is hard to be directly optimized. To address the above issues, we propose a coordinate-aligned multi-camera collaboration system for AMOT. In our approach, we regard each camera as an agent and address AMOT with a multi-agent reinforcement learning solution. To represent the observation of each agent, we first identify the targets in the camera view with an image detector, and then align the coordinates of the targets in 3D environment. We define the reward of each agent based on both global coverage as well as four individual reward terms. The action policy of the agents is derived with a value-based Q-network. To the best of our knowledge, we are the first to study the AMOT task. To train and evaluate the efficacy of our system, we build a virtual yet credible 3D environment, named "Soccer Court", to mimic the real-world AMOT scenario. The experimental results show that our system achieves a coverage of 71.88%, outperforming the baseline method by 8.9%.

preprint2022arXiv

Simultaneous Double Q-learning with Conservative Advantage Learning for Actor-Critic Methods

Actor-critic Reinforcement Learning (RL) algorithms have achieved impressive performance in continuous control tasks. However, they still suffer two nontrivial obstacles, i.e., low sample efficiency and overestimation bias. To this end, we propose Simultaneous Double Q-learning with Conservative Advantage Learning (SDQ-CAL). Our SDQ-CAL boosts the Double Q-learning for off-policy actor-critic RL based on a modification of the Bellman optimality operator with Advantage Learning. Specifically, SDQ-CAL improves sample efficiency by modifying the reward to facilitate the distinction from experience between the optimal actions and the others. Besides, it mitigates the overestimation issue by updating a pair of critics simultaneously upon double estimators. Extensive experiments reveal that our algorithm realizes less biased value estimation and achieves state-of-the-art performance in a range of continuous control benchmark tasks. We release the source code of our method at: \url{https://github.com/LQNew/SDQ-CAL}.