Researcher profile

Zhenbo Song

Zhenbo Song contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified · Verification L1 · Unclaimed author
4 works
0 followers
3 topics
4 close collaborators

Published work

4 published items

Preprint · 2024 · arXiv

FuRPE: Learning Full-body Reconstruction from Part Experts

In the field of full-body reconstruction, the scarcity of annotated data often impedes the efficacy of prevailing methods. To address this issue, we introduce FuRPE, a novel framework that employs part-experts and an ingenious pseudo ground-truth selection scheme to derive high-quality pseudo labels. These labels, central to our approach, equip our network with the capability to efficiently learn from the available data. Integral to FuRPE is a unique exponential moving average training strategy and expert-derived feature distillation strategy. These novel elements of FuRPE not only serve to further refine the model but also to reduce potential biases that may arise from inaccuracies in pseudo labels, thereby optimizing the network's training process and enhancing the robustness of the model. We apply FuRPE to train both two-stage and fully convolutional single-stage full-body reconstruction networks. Our exhaustive experiments on numerous benchmark datasets illustrate a substantial performance boost over existing methods, underscoring FuRPE's potential to reshape the state-of-the-art in full-body reconstruction.
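
The abstract mentions an exponential moving average (EMA) training strategy but does not say what is averaged. A minimal sketch, assuming the common mean-teacher setup in which a "teacher" copy of the parameters slowly tracks the "student" being trained; the function name, the parameter names and the decay value are illustrative and not taken from the paper:

```python
def ema_update(teacher, student, decay=0.99):
    """Blend teacher parameters toward the student (assumed mean-teacher style EMA).

    teacher, student: dicts mapping a parameter name to a float value.
    decay: fraction of the old teacher value that is kept at each step.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Hypothetical scalar "parameters", only to show the arithmetic.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 2.0, "b": 1.0}
teacher = ema_update(teacher, student, decay=0.9)
```

With a decay of 0.9 the teacher moves one tenth of the way toward the student at each update, which is why an EMA copy tends to be less sensitive to noise in individual pseudo labels.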

Preprint · 2022 · arXiv

Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation From Monocular RGB Image

Recently, RGBD-based category-level 6D object pose estimation has achieved promising improvement in performance; however, the requirement of depth information prohibits broader applications. To relieve this problem, this paper proposes a novel approach named Object Level Depth reconstruction Network (OLD-Net), which takes only RGB images as input for category-level 6D object pose estimation. We propose to directly predict object-level depth from a monocular RGB image by deforming the category-level shape prior into object-level depth and the canonical NOCS representation. Two novel modules, Normalized Global Position Hints (NGPH) and Shape-aware Decoupled Depth Reconstruction (SDDR), are introduced to learn high-fidelity object-level depth and delicate shape representations. Finally, the 6D object pose is solved by aligning the predicted canonical representation with the back-projected object-level depth. Extensive experiments on the challenging CAMERA25 and REAL275 datasets indicate that our model, though simple, achieves state-of-the-art performance.
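
The final step above relies on back-projecting predicted depth into 3D. A minimal sketch of that operation for a single pixel, assuming a standard pinhole camera model; the intrinsics (fx, fy, cx, cy) and the pixel values are hypothetical, and the alignment step itself is not shown:

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with a depth value to a 3D point in the camera frame.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all expressed in pixels.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Hypothetical intrinsics and pixel, chosen only to make the arithmetic easy to follow.
point = back_project(u=420.0, v=240.0, depth=2.0,
                     fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Applying this to every pixel of the predicted object-level depth yields a point set in the camera frame, which can then be aligned with the predicted canonical coordinates to recover the pose.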

Preprint · 2022 · arXiv

RPR-Net: A Point Cloud-based Rotation-aware Large Scale Place Recognition Network

Point cloud-based large scale place recognition is an important but challenging task for many applications such as Simultaneous Localization and Mapping (SLAM). Treating the task as a point cloud retrieval problem, previous methods have achieved promising results. However, how to deal with the catastrophic collapse caused by rotation is still under-explored. To tackle this issue, we propose a novel Point Cloud-based Rotation-aware Large Scale Place Recognition Network (RPR-Net), which learns rotation-invariant features in three steps. First, we design three kinds of novel Rotation-Invariant Features (RIFs), which are low-level features that hold the rotation-invariant property. Second, using these RIFs, we design an attentive module to learn rotation-invariant kernels. Third, we apply these kernels to previous point cloud features to generate new features, which is the well-known SO(3) mapping process. By doing so, high-level scene-specific rotation-invariant features can be learned. We call the above process an Attentive Rotation-Invariant Convolution (ARIConv). To achieve the place recognition goal, we build RPR-Net, which takes ARIConv as a basic unit to construct a dense network architecture. Powerful global descriptors for retrieval-based place recognition can then be extracted from RPR-Net. Experimental results on prevalent datasets show that our method achieves results comparable to existing state-of-the-art place recognition models and significantly outperforms other rotation-invariant baseline models when solving rotation problems.
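
The abstract does not define its three RIFs. As a generic illustration of what "rotation-invariant" means for a low-level point feature, the sketch below uses each point's distance to the cloud centroid, which is unchanged by any rotation. This feature is an assumption chosen for illustration and is not claimed to be one of the paper's RIFs; the sample cloud and angle are hypothetical:

```python
import math


def centroid(points):
    """Mean of a list of 3D points given as (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def distances_to_centroid(points):
    """One scalar per point: its Euclidean distance to the cloud centroid."""
    c = centroid(points)
    return [math.dist(p, c) for p in points]


def rotate_z(points, angle):
    """Rotate every point about the z-axis by `angle` radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(cos_a * x - sin_a * y, sin_a * x + cos_a * y, z) for x, y, z in points]


# Hypothetical toy cloud and rotation angle.
cloud = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
original = distances_to_centroid(cloud)
rotated = distances_to_centroid(rotate_z(cloud, math.radians(37.0)))

# The per-point features agree before and after the rotation.
invariant = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(original, rotated))
```

Raw coordinates would differ between the two clouds, while the distance features match point for point, which is the property a rotation-aware descriptor builds on.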

Preprint · 2020 · arXiv

End-to-end Learning for Inter-Vehicle Distance and Relative Velocity Estimation in ADAS with a Monocular Camera

Inter-vehicle distance and relative velocity estimation are two basic functions of any advanced driver-assistance system (ADAS). In this paper, we propose a monocular camera-based inter-vehicle distance and relative velocity estimation method based on end-to-end training of a deep neural network. The key novelty of our method is the integration of multiple visual cues provided by any two time-consecutive monocular frames, which include deep feature cues, scene geometry cues, and temporal optical flow cues. We also propose a vehicle-centric sampling mechanism to alleviate the effect of perspective distortion in the motion field (i.e., optical flow). We implement the method with a lightweight deep neural network. Extensive experiments confirm the superior performance of our method over other state-of-the-art methods in terms of estimation accuracy, computational speed, and memory footprint.