Source author record

Shuhan Tan

Shuhan Tan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Machine Learning Artificial Intelligence Robotics

Catalog footprint

What is connected

4works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2023arXiv

EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding

Recent advances in egocentric video understanding models are promising, but their heavy computational expense is a barrier for many real-world applications. To address this challenge, we propose EgoDistill, a distillation-based approach that learns to reconstruct heavy egocentric video clip features by combining the semantics from a sparse set of video frames with the head motion from lightweight IMU readings. We further devise a novel self-supervised training strategy for IMU feature learning. Our method leads to significant improvements in efficiency, requiring 200x fewer GFLOPs than equivalent video models. We demonstrate its effectiveness on the Ego4D and EPICKitchens datasets, where our method outperforms state-of-the-art efficient video understanding methods.

preprint2021arXiv

SceneGen: Learning to Generate Realistic Traffic Scenes

We consider the problem of generating realistic traffic scenes automatically. Existing methods typically insert actors into the scene according to a set of hand-crafted heuristics and are limited in their ability to model the true complexity and diversity of real traffic scenes, thus inducing a content gap between synthesized traffic scenes versus real ones. As a result, existing simulators lack the fidelity necessary to train and test self-driving vehicles. To address this limitation, we present SceneGen, a neural autoregressive model of traffic scenes that eschews the need for rules and heuristics. In particular, given the ego-vehicle state and a high definition map of surrounding area, SceneGen inserts actors of various classes into the scene and synthesizes their sizes, orientations, and velocities. We demonstrate on two large-scale datasets SceneGen's ability to faithfully model distributions of real traffic scenes. Moreover, we show that SceneGen coupled with sensor simulation can be used to train perception models that generalize to the real world.

preprint2020arXiv

Class-imbalanced Domain Adaptation: An Empirical Odyssey

Unsupervised domain adaptation is a promising way to generalize deep models to novel domains. However, the current literature assumes that the label distribution is domain-invariant and only aligns the feature distributions or vice versa. In this work, we explore the more realistic task of Class-imbalanced Domain Adaptation: How to align feature distributions across domains while the label distributions of the two domains are also different? Taking a practical step towards this problem, we constructed the first benchmark with 22 cross-domain tasks from 6real-image datasets. We conducted comprehensive experiments on 10 recent domain adaptation methods and find most of them are very fragile in the face of coexisting feature and label distribution shift. Towards a better solution, we further proposed a feature and label distribution CO-ALignment (COAL) model with a novel combination of existing ideas. COAL is empirically shown to outperform the most recent domain adaptation methods on our benchmarks. We believe the provided benchmarks, empirical analysis results, and the COAL baseline could stimulate and facilitate future research towards this important problem.

preprint2020arXiv

LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World

We tackle the problem of producing realistic simulations of LiDAR point clouds, the sensor of preference for most self-driving vehicles. We argue that, by leveraging real data, we can simulate the complex world more realistically compared to employing virtual worlds built from CAD/procedural models. Towards this goal, we first build a large catalog of 3D static maps and 3D dynamic objects by driving around several cities with our self-driving fleet. We can then generate scenarios by selecting a scene from our catalog and "virtually" placing the self-driving vehicle (SDV) and a set of dynamic objects from the catalog in plausible locations in the scene. To produce realistic simulations, we develop a novel simulator that captures both the power of physics-based and learning-based simulation. We first utilize ray casting over the 3D scene and then use a deep neural network to produce deviations from the physics-based simulation, producing realistic LiDAR point clouds. We showcase LiDARsim's usefulness for perception algorithms-testing on long-tail events and end-to-end closed-loop evaluation on safety-critical scenarios.

Shuhan Tan

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding

SceneGen: Learning to Generate Realistic Traffic Scenes

Class-imbalanced Domain Adaptation: An Empirical Odyssey

LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World