Source author record

Qihang Zhang

Qihang Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, Machine Learning, Robotics, Artificial Intelligence, cond-mat.mes-hall, cond-mat.str-el

Catalog footprint

What is connected

4 works

6 topics

4 close collaborators

Preprint, 2022, arXiv

F3A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks

Formulated as a conditional generation problem, face animation aims at synthesizing continuous face images from a single source image driven by a set of conditional face motions. Previous works mainly model the face motion as conditions with 1D or 2D representation (e.g., action units, emotion codes, landmark), which often leads to low-quality results in some complicated scenarios such as continuous generation and large-pose transformation. To tackle this problem, the conditions are supposed to meet two requirements, i.e., motion information preserving and geometric continuity. To this end, we propose a novel representation based on a 3D geometric flow, termed facial flow, to represent the natural motion of the human face at any pose. Compared with other previous conditions, the proposed facial flow well controls the continuous changes to the face. After that, in order to utilize the facial flow for face editing, we build a synthesis framework generating continuous images with conditional facial flows. To fully take advantage of the motion information of facial flows, a hierarchical conditional framework is designed to combine the extracted multi-scale appearance features from images and motion features from flows in a hierarchical manner. The framework then decodes multiple fused features back to images progressively. Experimental results demonstrate the effectiveness of our method compared to other state-of-the-art methods.

Preprint, 2022, arXiv

Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining

Deep visuomotor policy learning, which aims to map raw visual observation to action, achieves promising results in control tasks such as robotic manipulation and autonomous driving. However, it requires a huge number of online interactions with the training environment, which limits its real-world application. Compared to the popular unsupervised feature learning for visual recognition, feature pretraining for visuomotor control tasks is much less explored. In this work, we aim to pretrain policy representations for driving tasks by watching hours-long uncurated YouTube videos. Specifically, we train an inverse dynamic model with a small amount of labeled data and use it to predict action labels for all the YouTube video frames. A new contrastive policy pretraining method is then developed to learn action-conditioned features from the video frames with pseudo action labels. Experiments show that the resulting action-conditioned features obtain substantial improvements for the downstream reinforcement learning and imitation learning tasks, outperforming the weights pretrained from previous unsupervised learning methods and ImageNet-pretrained weights. Code, model weights, and data are available at: https://metadriverse.github.io/ACO.

Preprint, 2022, arXiv

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

Driving safely requires multiple capabilities from human and intelligent agents, such as the generalizability to unseen environments, the safety awareness of the surrounding traffic, and the decision-making in complex multi-agent settings. Despite the great success of Reinforcement Learning (RL), most of the RL research works investigate each capability separately due to the lack of integrated environments. In this work, we develop a new driving simulation platform called MetaDrive to support the research of generalizable reinforcement learning algorithms for machine autonomy. MetaDrive is highly compositional, which can generate an infinite number of diverse driving scenarios from both the procedural generation and the real data importing. Based on MetaDrive, we construct a variety of RL tasks and baselines in both single-agent and multi-agent settings, including benchmarking generalizability across unseen scenes, safe exploration, and learning multi-agent traffic. The generalization experiments conducted on both procedurally generated scenarios and real-world scenarios show that increasing the diversity and the size of the training set leads to the improvement of the RL agent's generalizability. We further evaluate various safe reinforcement learning and multi-agent reinforcement learning algorithms in MetaDrive environments and provide the benchmarks. Source code, documentation, and demo video are available at https://metadriverse.github.io/metadrive.

Preprint, 2022, arXiv

Spectroscopy Signatures of Electron Correlations in a Trilayer Graphene/hBN Moiré Superlattice

ABC-stacked trilayer graphene/hBN moiré superlattice (TLG/hBN) has emerged as a playground for correlated electron physics. We report spectroscopy measurements of dual-gated TLG/hBN using Fourier transformed infrared photocurrent spectroscopy. We observed a strong optical transition between moiré mini-bands that narrows continuously as a bandgap is opened by gating, indicating a reduction of the single particle bandwidth. At half-filling of the valence flat band, a broad absorption peak emerges at ~18 meV, indicating direct optical excitation across an emerging Mott gap. Similar photocurrent spectra are observed in two other correlated insulating states at quarter- and half-filling of the first conduction band. Our findings provide key parameters of the Hubbard model for the understanding of electron correlation in TLG/hBN.
