Researcher profile

Huijing Zhao

Huijing Zhao contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
12works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

12 published item(s)

preprint2022arXiv

An Active and Contrastive Learning Framework for Fine-Grained Off-Road Semantic Segmentation

Off-road semantic segmentation with fine-grained labels is necessary for autonomous vehicles to understand driving scenes, as the coarse-grained road detection can not satisfy off-road vehicles with various mechanical properties. Fine-grained semantic segmentation in off-road scenes usually has no unified category definition due to ambiguous nature environments, and the cost of pixel-wise labeling is extremely high. Furthermore, semantic properties of off-road scenes can be very changeable due to various precipitations, temperature, defoliation, etc. To address these challenges, this research proposes an active and contrastive learning-based method that does not rely on pixel-wise labels, but only on patch-based weak annotations for model learning. There is no need for predefined semantic categories, the contrastive learning-based feature representation and adaptive clustering will discover the category model from scene data. In order to actively adapt to new scenes, a risk evaluation method is proposed to discover and select hard frames with high-risk predictions for supplemental labeling, so as to update the model efficiently. Experiments conducted on our self-developed off-road dataset and DeepScene dataset demonstrate that fine-grained semantic segmentation can be learned with only dozens of weakly labeled frames, and the model can efficiently adapt across scenes by weak supervision, while achieving almost the same level of performance as typical fully supervised baselines.

preprint2022arXiv

Understanding the Challenges When 3D Semantic Segmentation Faces Class Imbalanced and OOD Data

3D semantic segmentation (3DSS) is an essential process in the creation of a safe autonomous driving system. However, deep learning models for 3D semantic segmentation often suffer from the class imbalance problem and out-of-distribution (OOD) data. In this study, we explore how the class imbalance problem affects 3DSS performance and whether the model can detect the category prediction correctness, or whether data is ID (in-distribution) or OOD. For these purposes, we conduct two experiments using three representative 3DSS models and five trust scoring methods, and conduct both a confusion and feature analysis of each class. Furthermore, a data augmentation method for the 3D LiDAR dataset is proposed to create a new dataset based on SemanticKITTI and SemanticPOSS, called AugKITTI. We propose the wPre metric and TSD for a more in-depth analysis of the results, and follow are proposals with an insightful discussion. Based on the experimental results, we find that: (1) the classes are not only imbalanced in their data size but also in the basic properties of each semantic category. (2) The intraclass diversity and interclass ambiguity make class learning difficult and greatly limit the models' performance, creating the challenges of semantic and data gaps. (3) The trust scores are unreliable for classes whose features are confused with other classes. For 3DSS models, those misclassified ID classes and OODs may also be given high trust scores, making the 3DSS predictions unreliable, and leading to the challenges in judging 3DSS result trustworthiness. All of these outcomes point to several research directions for improving the performance and reliability of the 3DSS models used for real-world applications.

preprint2021arXiv

A Survey of Deep RL and IL for Autonomous Driving Policy Learning

Autonomous driving (AD) agents generate driving policies based on online perception results, which are obtained at multiple levels of abstraction, e.g., behavior planning, motion planning and control. Driving policies are crucial to the realization of safe, efficient and harmonious driving behaviors, where AD agents still face substantial challenges in complex scenarios. Due to their successful application in fields such as robotics and video games, the use of deep reinforcement learning (DRL) and deep imitation learning (DIL) techniques to derive AD policies have witnessed vast research efforts in recent years. This paper is a comprehensive survey of this body of work, which is conducted at three levels: First, a taxonomy of the literature studies is constructed from the system perspective, among which five modes of integration of DRL/DIL models into an AD architecture are identified. Second, the formulations of DRL/DIL models for conducting specified AD tasks are comprehensively reviewed, where various designs on the model state and action spaces and the reinforcement learning rewards are covered. Finally, an in-depth review is conducted on how the critical issues of AD applications regarding driving safety, interaction with other traffic participants and uncertainty of the environment are addressed by the DRL/DIL models. To the best of our knowledge, this is the first survey to focus on AD policy learning using DRL/DIL, which is addressed simultaneously from the system, task-driven and problem-driven perspectives. We share and discuss findings, which may lead to the investigation of various topics in the future.

preprint2021arXiv

Fine-Grained Off-Road Semantic Segmentation and Mapping via Contrastive Learning

Road detection or traversability analysis has been a key technique for a mobile robot to traverse complex off-road scenes. The problem has been mainly formulated in early works as a binary classification one, e.g. associating pixels with road or non-road labels. Whereas understanding scenes with fine-grained labels are needed for off-road robots, as scenes are very diverse, and the various mechanical performance of off-road robots may lead to different definitions of safe regions to traverse. How to define and annotate fine-grained labels to achieve meaningful scene understanding for a robot to traverse off-road is still an open question. This research proposes a contrastive learning based method. With a set of human-annotated anchor patches, a feature representation is learned to discriminate regions with different traversability, a method of fine-grained semantic segmentation and mapping is subsequently developed for off-road scene understanding. Experiments are conducted on a dataset of three driving segments that represent very diverse off-road scenes. An anchor accuracy of 89.8% is achieved by evaluating the matching with human-annotated image patches in cross-scene validation. Examined by associated 3D LiDAR data, the fine-grained segments of visual images are demonstrated to have different levels of toughness and terrain elevation, which represents their semantical meaningfulness. The resultant maps contain both fine-grained labels and confidence values, providing rich information to support a robot traversing complex off-road scenes.

preprint2020arXiv

Cross Scene Prediction via Modeling Dynamic Correlation using Latent Space Shared Auto-Encoders

This work addresses on the following problem: given a set of unsynchronized history observations of two scenes that are correlative on their dynamic changes, the purpose is to learn a cross-scene predictor, so that with the observation of one scene, a robot can onlinely predict the dynamic state of another. A method is proposed to solve the problem via modeling dynamic correlation using latent space shared auto-encoders. Assuming that the inherent correlation of scene dynamics can be represented by shared latent space, where a common latent state is reached if the observations of both scenes are at an approximate time, a learning model is developed by connecting two auto-encoders through the latent space, and a prediction model is built by concatenating the encoder of the input scene with the decoder of the target one. Simulation datasets are generated imitating the dynamic flows at two adjacent gates of a campus, where the dynamic changes are triggered by a common working and teaching schedule. Similar scenarios can also be found at successive intersections on a single road, gates of a subway station, etc. Accuracy of cross-scene prediction is examined at various conditions of scene correlation and pairwise observations. Potentials of the proposed method are demonstrated by comparing with conventional end-to-end methods and linear predictions.

preprint2020arXiv

Driver Identification through Stochastic Multi-State Car-Following Modeling

Intra-driver and inter-driver heterogeneity has been confirmed to exist in human driving behaviors by many studies. In this study, a joint model of the two types of heterogeneity in car-following behavior is proposed as an approach of driver profiling and identification. It is assumed that all drivers share a pool of driver states; under each state a car-following data sequence obeys a specific probability distribution in feature space; each driver has his/her own probability distribution over the states, called driver profile, which characterize the intradriver heterogeneity, while the difference between the driver profile of different drivers depict the inter-driver heterogeneity. Thus, the driver profile can be used to distinguish a driver from others. Based on the assumption, a stochastic car-following model is proposed to take both intra-driver and inter-driver heterogeneity into consideration, and a method is proposed to jointly learn parameters in behavioral feature extractor, driver states and driver profiles. Experiments demonstrate the performance of the proposed method in driver identification on naturalistic car-following data: accuracy of 82.3% is achieved in an 8-driver experiment using 10 car-following sequences of duration 15 seconds for online inference. The potential of fast registration of new drivers are demonstrated and discussed.

preprint2020arXiv

Learning from Naturalistic Driving Data for Human-like Autonomous Highway Driving

Driving in a human-like manner is important for an autonomous vehicle to be a smart and predictable traffic participant. To achieve this goal, parameters of the motion planning module should be carefully tuned, which needs great effort and expert knowledge. In this study, a method of learning cost parameters of a motion planner from naturalistic driving data is proposed. The learning is achieved by encouraging the selected trajectory to approximate the human driving trajectory under the same traffic situation. The employed motion planner follows a widely accepted methodology that first samples candidate trajectories in the trajectory space, then select the one with minimal cost as the planned trajectory. Moreover, in addition to traditional factors such as comfort, efficiency and safety, the cost function is proposed to incorporate incentive of behavior decision like a human driver, so that both lane change decision and motion planning are coupled into one framework. Two types of lane incentive cost -- heuristic and learning based -- are proposed and implemented. To verify the validity of the proposed method, a data set is developed by using the naturalistic trajectory data of human drivers collected on the motorways in Beijing, containing samples of lane changes to the left and right lanes, and car followings. Experiments are conducted with respect to both lane change decision and motion planning, and promising results are achieved.

preprint2020arXiv

Off-road Autonomous Vehicles Traversability Analysis and Trajectory Planning Based on Deep Inverse Reinforcement Learning

Terrain traversability analysis is a fundamental issue to achieve the autonomy of a robot at off-road environments. Geometry-based and appearance-based methods have been studied in decades, while behavior-based methods exploiting learning from demonstration (LfD) are new trends. Behavior-based methods learn cost functions that guide trajectory planning in compliance with experts' demonstrations, which can be more scalable to various scenes and driving behaviors. This research proposes a method of off-road traversability analysis and trajectory planning using Deep Maximum Entropy Inverse Reinforcement Learning. To incorporate vehicle's kinematics while solving the problem of exponential increase of state-space complexity, two convolutional neural networks, i.e., RL ConvNet and Svf ConvNet, are developed to encode kinematics into convolution kernels and achieve efficient forward reinforcement learning. We conduct experiments in off-road environments. Scene maps are generated using 3D LiDAR data, and expert demonstrations are either the vehicle's real driving trajectories at the scene or synthesized ones to represent specific behaviors such as crossing negative obstacles. Different cost functions of traversability analysis are learned and tested at various scenes of capability in guiding the trajectory planning of different behaviors. We also demonstrate the performance and computation efficiency of the proposed method.

preprint2020arXiv

Off-Road Drivable Area Extraction Using 3D LiDAR Data

We propose a method for off-road drivable area extraction using 3D LiDAR data with the goal of autonomous driving application. A specific deep learning framework is designed to deal with the ambiguous area, which is one of the main challenges in the off-road environment. To reduce the considerable demand for human-annotated data for network training, we utilize the information from vast quantities of vehicle paths and auto-generated obstacle labels. Using these autogenerated annotations, the proposed network can be trained using weakly supervised or semi-supervised methods, which can achieve better performance with fewer human annotations. The experiments on our dataset illustrate the reasonability of our framework and the validity of our weakly and semi-supervised methods.

preprint2020arXiv

Scene Context Based Semantic Segmentation for 3D LiDAR Data in Dynamic Scene

We propose a graph neural network(GNN) based method to incorporate scene context for the semantic segmentation of 3D LiDAR data. The problem is defined as building a graph to represent the topology of a center segment with its neighborhoods, then inferring the segment label. The node of graph is generated from the segment on range image, which is suitable for both sparse and dense point cloud. Edge weights that evaluate the correlations of center node and its neighborhoods are automatically encoded by a neural network, therefore the number of neighbor nodes is no longer a sensitive parameter. A system consists of segment generation, graph building, edge weight estimation, node updating, and node prediction is designed. Quantitative evaluation on a dataset of dynamic scene shows that our method has better performance than unary CNN with 8% improvement, as well as normal GNN with 17% improvement.

preprint2020arXiv

Scene-Aware Error Modeling of LiDAR/Visual Odometry for Fusion-based Vehicle Localization

Localization is an essential technique in mobile robotics. In a complex environment, it is necessary to fuse different localization modules to obtain more robust results, in which the error model plays a paramount role. However, exteroceptive sensor-based odometries (ESOs), such as LiDAR/visual odometry, often deliver results with scene-related error, which is difficult to model accurately. To address this problem, this research designs a scene-aware error model for ESO, based on which a multimodal localization fusion framework is developed. In addition, an end-to-end learning method is proposed to train this error model using sparse global poses such as GPS/IMU results. The proposed method is realized for error modeling of LiDAR/visual odometry, and the results are fused with dead reckoning to examine the performance of vehicle localization. Experiments are conducted using both simulation and real-world data of experienced and unexperienced environments, and the experimental results demonstrate that with the learned scene-aware error models, vehicle localization accuracy can be largely improved and shows adaptiveness in unexperienced scenes.

preprint2020arXiv

SemanticPOSS: A Point Cloud Dataset with Large Quantity of Dynamic Instances

3D semantic segmentation is one of the key tasks for autonomous driving system. Recently, deep learning models for 3D semantic segmentation task have been widely researched, but they usually require large amounts of training data. However, the present datasets for 3D semantic segmentation are lack of point-wise annotation, diversiform scenes and dynamic objects. In this paper, we propose the SemanticPOSS dataset, which contains 2988 various and complicated LiDAR scans with large quantity of dynamic instances. The data is collected in Peking University and uses the same data format as SemanticKITTI. In addition, we evaluate several typical 3D semantic segmentation models on our SemanticPOSS dataset. Experimental results show that SemanticPOSS can help to improve the prediction accuracy of dynamic objects as people, car in some degree. SemanticPOSS will be published at \url{www.poss.pku.edu.cn}.