Researcher profile

Ye Shi

Ye Shi contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2023arXiv

HybridGait: A Benchmark for Spatial-Temporal Cloth-Changing Gait Recognition with Hybrid Explorations

Existing gait recognition benchmarks mostly include minor clothing variations in the laboratory environments, but lack persistent changes in appearance over time and space. In this paper, we propose the first in-the-wild benchmark CCGait for cloth-changing gait recognition, which incorporates diverse clothing changes, indoor and outdoor scenes, and multi-modal statistics over 92 days. To further address the coupling effect of clothing and viewpoint variations, we propose a hybrid approach HybridGait that exploits both temporal dynamics and the projected 2D information of 3D human meshes. Specifically, we introduce a Canonical Alignment Spatial-Temporal Transformer (CA-STT) module to encode human joint position-aware features, and fully exploit 3D dense priors via a Silhouette-guided Deformation with 3D-2D Appearance Projection (SilD) strategy. Our contributions are twofold: we provide a challenging benchmark CCGait that captures realistic appearance changes across an expanded and space, and we propose a hybrid framework HybridGait that outperforms prior works on CCGait and Gait3D benchmarks. Our project page is available at https://github.com/HCVLab/HybridGait.

preprint2023arXiv

Unified Optimal Transport Framework for Universal Domain Adaptation

Universal Domain Adaptation (UniDA) aims to transfer knowledge from a source domain to a target domain without any constraints on label sets. Since both domains may hold private classes, identifying target common samples for domain alignment is an essential issue in UniDA. Most existing methods require manually specified or hand-tuned threshold values to detect common samples thus they are hard to extend to more realistic UniDA because of the diverse ratios of common classes. Moreover, they cannot recognize different categories among target-private samples as these private samples are treated as a whole. In this paper, we propose to use Optimal Transport (OT) to handle these issues under a unified framework, namely UniOT. First, an OT-based partial alignment with adaptive filling is designed to detect common classes without any predefined threshold values for realistic UniDA. It can automatically discover the intrinsic difference between common and private classes based on the statistical information of the assignment matrix obtained from OT. Second, we propose an OT-based target representation learning that encourages both global discrimination and local consistency of samples to avoid the over-reliance on the source. Notably, UniOT is the first method with the capability to automatically discover and recognize private categories in the target domain for UniDA. Accordingly, we introduce a new metric H^3-score to evaluate the performance in terms of both accuracy of common samples and clustering performance of private ones. Extensive experiments clearly demonstrate the advantages of UniOT over a wide range of state-of-the-art methods in UniDA.

preprint2022arXiv

Mutual Adaptive Reasoning for Monocular 3D Multi-Person Pose Estimation

Inter-person occlusion and depth ambiguity make estimating the 3D poses of monocular multiple persons as camera-centric coordinates a challenging problem. Typical top-down frameworks suffer from high computational redundancy with an additional detection stage. By contrast, the bottom-up methods enjoy low computational costs as they are less affected by the number of humans. However, most existing bottom-up methods treat camera-centric 3D human pose estimation as two unrelated subtasks: 2.5D pose estimation and camera-centric depth estimation. In this paper, we propose a unified model that leverages the mutual benefits of both these subtasks. Within the framework, a robust structured 2.5D pose estimation is designed to recognize inter-person occlusion based on depth relationships. Additionally, we develop an end-to-end geometry-aware depth reasoning method that exploits the mutual benefits of both 2.5D pose and camera-centric root depths. This method first uses 2.5D pose and geometry information to infer camera-centric root depths in a forward pass, and then exploits the root depths to further improve representation learning of 2.5D pose estimation in a backward pass. Further, we designed an adaptive fusion scheme that leverages both visual perception and body geometry to alleviate inherent depth ambiguity issues. Extensive experiments demonstrate the superiority of our proposed model over a wide range of bottom-up methods. Our accuracy is even competitive with top-down counterparts. Notably, our model runs much faster than existing bottom-up and top-down methods.