Source author record

Ding Zhao

Ding Zhao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Robotics Artificial Intelligence Computer Vision Computation and Language eess.SP Multiagent Systems Sound eess.AS Applications Cryptography and Security Methodology Multimedia Systems and Control

Catalog footprint

What is connected

26works

14topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

In this paper, we propose a dynamic cascaded encoder Automatic Speech Recognition (ASR) model, which unifies models for different deployment scenarios. Moreover, the model can significantly reduce model size and power consumption without loss of quality. Namely, with the dynamic cascaded encoder model, we explore three techniques to maximally boost the performance of each model size: 1) Use separate decoders for each sub-model while sharing the encoders; 2) Use funnel-pooling to improve the encoder efficiency; 3) Balance the size of causal and non-causal encoders to improve quality and fit deployment constraints. Overall, the proposed large-medium model has 30% smaller size and reduces power consumption by 33%, compared to the baseline cascaded encoder model. The triple-size model that unifies the large, medium, and small models achieves 37% total size reduction with minimal quality loss, while substantially reducing the engineering efforts of having separate models.

preprint2022arXiv

Constrained Variational Policy Optimization for Safe Reinforcement Learning

Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying them to safety-critical applications. Previous primal-dual style approaches suffer from instability issues and lack optimality guarantees. This paper overcomes the issues from the perspective of probabilistic inference. We introduce a novel Expectation-Maximization approach to naturally incorporate constraints during the policy learning: 1) a provable optimal non-parametric variational distribution could be computed in closed form after a convex optimization (E-step); 2) the policy parameter is improved within the trust region based on the optimal variational distribution (M-step). The proposed algorithm decomposes the safe RL problem into a convex optimization phase and a supervised learning phase, which yields a more stable training performance. A wide range of experiments on continuous robotic tasks shows that the proposed method achieves significantly better constraint satisfaction performance and better sample efficiency than baselines. The code is available at https://github.com/liuzuxin/cvpo-safe-rl.

preprint2022arXiv

COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks

As reinforcement learning (RL) has achieved near human-level performance in a variety of tasks, its robustness has raised great attention. While a vast body of research has explored test-time (evasion) attacks in RL and corresponding defenses, its robustness against training-time (poisoning) attacks remains largely unanswered. In this work, we focus on certifying the robustness of offline RL in the presence of poisoning attacks, where a subset of training trajectories could be arbitrarily manipulated. We propose the first certification framework, COPA, to certify the number of poisoning trajectories that can be tolerated regarding different certification criteria. Given the complex structure of RL, we propose two certification criteria: per-state action stability and cumulative reward bound. To further improve the certification, we propose new partition and aggregation protocols to train robust policies. We further prove that some of the proposed certification methods are theoretically tight and some are NP-Complete problems. We leverage COPA to certify three RL environments trained with different algorithms and conclude: (1) The proposed robust aggregation protocols such as temporal aggregation can significantly improve the certifications; (2) Our certification for both per-state action stability and cumulative reward bound are efficient and tight; (3) The certification for different training algorithms and environments are different, implying their intrinsic robustness properties. All experimental results are available at https://copa-leaderboard.github.io.

preprint2022arXiv

CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing

As reinforcement learning (RL) has achieved great success and been even adopted in safety-critical domains such as autonomous vehicles, a range of empirical studies have been conducted to improve its robustness against adversarial attacks. However, how to certify its robustness with theoretical guarantees still remains challenging. In this paper, we present the first unified framework CROP (Certifying Robust Policies for RL) to provide robustness certification on both action and reward levels. In particular, we propose two robustness certification criteria: robustness of per-state actions and lower bound of cumulative rewards. We then develop a local smoothing algorithm for policies derived from Q-functions to guarantee the robustness of actions taken along the trajectory; we also develop a global smoothing algorithm for certifying the lower bound of a finite-horizon cumulative reward, as well as a novel local smoothing algorithm to perform adaptive search in order to obtain tighter reward certification. Empirically, we apply CROP to evaluate several existing empirically robust RL algorithms, including adversarial training and different robust regularization, in four environments (two representative Atari games, Highway, and CartPole). Furthermore, by evaluating these algorithms against adversarial attacks, we demonstrate that our certification are often tight. All experiment results are available at website https://crop-leaderboard.github.io.

preprint2022arXiv

Federated Pruning: Improving Neural Network Efficiency with Federated Learning

Automatic Speech Recognition models require large amount of speech data for training, and the collection of such data often leads to privacy concerns. Federated learning has been widely used and is considered to be an effective decentralized technique by collaboratively learning a shared prediction model while keeping the data local on different clients devices. However, the limited computation and communication resources on clients devices present practical difficulties for large models. To overcome such challenges, we propose Federated Pruning to train a reduced model under the federated setting, while maintaining similar performance compared to the full model. Moreover, the vast amount of clients data can also be leveraged to improve the pruning results compared to centralized training. We explore different pruning schemes and provide empirical evidence of the effectiveness of our methods.

preprint2022arXiv

GeoECG: Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction

There has been an increased interest in applying deep neural networks to automatically interpret and analyze the 12-lead electrocardiogram (ECG). The current paradigms with machine learning methods are often limited by the amount of labeled data. This phenomenon is particularly problematic for clinically-relevant data, where labeling at scale can be time-consuming and costly in terms of the specialized expertise and human effort required. Moreover, deep learning classifiers may be vulnerable to adversarial examples and perturbations, which could have catastrophic consequences, for example, when applied in the context of medical treatment, clinical trials, or insurance claims. In this paper, we propose a physiologically-inspired data augmentation method to improve performance and increase the robustness of heart disease detection based on ECG signals. We obtain augmented samples by perturbing the data distribution towards other classes along the geodesic in Wasserstein space. To better utilize domain-specific knowledge, we design a ground metric that recognizes the difference between ECG signals based on physiologically determined features. Learning from 12-lead ECG signals, our model is able to distinguish five categories of cardiac conditions. Our results demonstrate improvements in accuracy and robustness, reflecting the effectiveness of our data augmentation method.

preprint2022arXiv

Investigating the Impact of Multi-LiDAR Placement on Object Detection for Autonomous Driving

The past few years have witnessed an increasing interest in improving the perception performance of LiDARs on autonomous vehicles. While most of the existing works focus on developing new deep learning algorithms or model architectures, we study the problem from the physical design perspective, i.e., how different placements of multiple LiDARs influence the learning-based perception. To this end, we introduce an easy-to-compute information-theoretic surrogate metric to quantitatively and fast evaluate LiDAR placement for 3D detection of different types of objects. We also present a new data collection, detection model training and evaluation framework in the realistic CARLA simulator to evaluate disparate multi-LiDAR configurations. Using several prevalent placements inspired by the designs of self-driving companies, we show the correlation between our surrogate metric and object detection performance of different representative algorithms on KITTI through extensive experiments, validating the effectiveness of our LiDAR placement evaluation approach. Our results show that sensor placement is non-negligible in 3D point cloud-based object detection, which will contribute up to 10% performance discrepancy in terms of average precision in challenging 3D object detection settings. We believe that this is one of the first studies to quantitatively investigate the influence of LiDAR placement on perception performance. The code is available on https://github.com/HanjiangHu/Multi-LiDAR-Placement-for-3D-Detection.

preprint2022arXiv

Joint transmit and reflective beamforming for IRS-assisted integrated sensing and communication

This paper studies an intelligent reflecting surface (IRS)-assisted integrated sensing and communication (ISAC) system, in which one IRS is deployed to not only assist the wireless communication from a multi-antenna base station (BS) to a single-antenna communication user (CU), but also create virtual line-of-sight (LoS) links for sensing targets at areas with LoS links blocked. We consider that the BS transmits combined information and sensing signals for ISAC. Under this setup, we jointly optimize the transmit information and sensing beamforming at the BS and the reflective beamforming at the IRS, to maximize the IRS's minimum beampattern gain towards the desired sensing angles, subject to the minimum signal-to-noise ratio (SNR) requirement at the CU and the maximum transmit power constraint at the BS. Although the formulated SNR-constrained beampattern gain maximization problem is non-convex and difficult to solve, we present an efficient algorithm to obtain a high-quality solution using alternating optimization and semi-definite relaxation (SDR). Numerical results show that the proposed joint beamforming design achieves improved sensing performance while ensuring the communication requirement as compared to benchmarks without such joint optimization. It is also shown that the use of dedicated sensing beams is beneficial in enhancing the performance for IRS-assisted ISAC.

preprint2022arXiv

MHMS: Multimodal Hierarchical Multimedia Summarization

Multimedia summarization with multimodal output can play an essential role in real-world applications, i.e., automatically generating cover images and titles for news articles or providing introductions to online videos. In this work, we propose a multimodal hierarchical multimedia summarization (MHMS) framework by interacting visual and language domains to generate both video and textual summaries. Our MHMS method contains video and textual segmentation and summarization module, respectively. It formulates a cross-domain alignment objective with optimal transport distance which leverages cross-domain interaction to generate the representative keyframe and textual summary. We evaluated MHMS on three recent multimodal datasets and demonstrated the effectiveness of our method in producing high-quality multimodal summaries.

preprint2022arXiv

PhysioMTL: Personalizing Physiological Patterns using Optimal Transport Multi-Task Regression

Heart rate variability (HRV) is a practical and noninvasive measure of autonomic nervous system activity, which plays an essential role in cardiovascular health. However, using HRV to assess physiology status is challenging. Even in clinical settings, HRV is sensitive to acute stressors such as physical activity, mental stress, hydration, alcohol, and sleep. Wearable devices provide convenient HRV measurements, but the irregularity of measurements and uncaptured stressors can bias conventional analytical methods. To better interpret HRV measurements for downstream healthcare applications, we learn a personalized diurnal rhythm as an accurate physiological indicator for each individual. We develop Physiological Multitask-Learning (PhysioMTL) by harnessing Optimal Transport theory within a Multitask-learning (MTL) framework. The proposed method learns an individual-specific predictive model from heterogeneous observations, and enables estimation of an optimal transport map that yields a push forward operation onto the demographic features for each task. Our model outperforms competing MTL methodologies on unobserved predictive tasks for synthetic and two real-world datasets. Specifically, our method provides remarkable prediction results on unseen held-out subjects given only $20\%$ of the subjects in real-world observational studies. Furthermore, our model enables a counterfactual engine that generates the effect of acute stressors and chronic conditions on HRV rhythms.

preprint2022arXiv

Privacy of Autonomous Vehicles: Risks, Protection Methods, and Future Directions

Recent advances in machine learning have enabled its wide application in different domains, and one of the most exciting applications is autonomous vehicles (AVs), which have encouraged the development of a number of ML algorithms from perception to prediction to planning. However, training AVs usually requires a large amount of training data collected from different driving environments (e.g., cities) as well as different types of personal information (e.g., working hours and routes). Such collected large data, treated as the new oil for ML in the data-centric AI era, usually contains a large amount of privacy-sensitive information which is hard to remove or even audit. Although existing privacy protection approaches have achieved certain theoretical and empirical success, there is still a gap when applying them to real-world applications such as autonomous vehicles. For instance, when training AVs, not only can individually identifiable information reveal privacy-sensitive information, but also population-level information such as road construction within a city, and proprietary-level commercial secrets of AVs. Thus, it is critical to revisit the frontier of privacy risks and corresponding protection approaches in AVs to bridge this gap. Following this goal, in this work, we provide a new taxonomy for privacy risks and protection methods in AVs, and we categorize privacy in AVs into three levels: individual, population, and proprietary. We explicitly list out recent challenges to protect each of these levels of privacy, summarize existing solutions to these challenges, discuss the lessons and conclusions, and provide potential future directions and opportunities for both researchers and practitioners. We believe this work will help to shape the privacy research in AV and guide the privacy protection technology design.

preprint2022arXiv

Prompting Decision Transformer for Few-Shot Policy Generalization

Humans can leverage prior experience and learn novel tasks from a handful of demonstrations. In contrast to offline meta-reinforcement learning, which aims to achieve quick adaptation through better algorithm design, we investigate the effect of architecture inductive bias on the few-shot learning capability. We propose a Prompt-based Decision Transformer (Prompt-DT), which leverages the sequential modeling ability of the Transformer architecture and the prompt framework to achieve few-shot adaptation in offline RL. We design the trajectory prompt, which contains segments of the few-shot demonstrations, and encodes task-specific information to guide policy generation. Our experiments in five MuJoCo control benchmarks show that Prompt-DT is a strong few-shot learner without any extra finetuning on unseen target tasks. Prompt-DT outperforms its variants and strong meta offline RL baselines by a large margin with a trajectory prompt containing only a few timesteps. Prompt-DT is also robust to prompt length changes and can generalize to out-of-distribution (OOD) environments.

preprint2022arXiv

Test Against High-Dimensional Uncertainties: Accelerated Evaluation of Autonomous Vehicles with Deep Importance Sampling

Evaluating the performance of autonomous vehicles (AV) and their complex subsystems to high precision under naturalistic circumstances remains a challenge, especially when failure or dangerous cases are rare. Rarity does not only require an enormous sample size for a naive method to achieve high confidence estimation, but it also causes dangerous underestimation of the true failure rate and it is extremely hard to detect. Meanwhile, the state-of-the-art approach that comes with a correctness guarantee can only compute an upper bound for the failure rate under certain conditions, which could limit its practical uses. In this work, we present Deep Importance Sampling (Deep IS) framework that utilizes a deep neural network to obtain an efficient IS that is on par with the state-of-the-art, capable of reducing the required sample size 43 times smaller than the naive sampling method to achieve 10% relative error and while producing an estimate that is much less conservative. Our high-dimensional experiment estimating the misclassification rate of one of the state-of-the-art traffic sign classifiers further reveals that this efficiency still holds true even when the target is very small, achieving over 600 times efficiency boost. This highlights the potential of Deep IS in providing a precise estimate even against high-dimensional uncertainties.

preprint2021arXiv

Constrained Model-based Reinforcement Learning with Robust Cross-Entropy Method

This paper studies the constrained/safe reinforcement learning (RL) problem with sparse indicator signals for constraint violations. We propose a model-based approach to enable RL agents to effectively explore the environment with unknown system dynamics and environment constraints given a significantly small number of violation budgets. We employ the neural network ensemble model to estimate the prediction uncertainty and use model predictive control as the basic control framework. We propose the robust cross-entropy method to optimize the control sequence considering the model uncertainty and constraints. We evaluate our methods in the Safety Gym environment. The results show that our approach learns to complete the tasks with a much smaller number of constraint violations than state-of-the-art baselines. Additionally, we are able to achieve several orders of magnitude better sample efficiency when compared with constrained model-free RL approaches. The code is available at \url{https://github.com/liuzuxin/safe-mbrl}.

preprint2021arXiv

Context-Aware Safe Reinforcement Learning for Non-Stationary Environments

Safety is a critical concern when deploying reinforcement learning agents for realistic tasks. Recently, safe reinforcement learning algorithms have been developed to optimize the agent's performance while avoiding violations of safety constraints. However, few studies have addressed the non-stationary disturbances in the environments, which may cause catastrophic outcomes. In this paper, we propose the context-aware safe reinforcement learning (CASRL) method, a meta-learning framework to realize safe adaptation in non-stationary environments. We use a probabilistic latent variable model to achieve fast inference of the posterior environment transition distribution given the context data. Safety constraints are then evaluated with uncertainty-aware trajectory sampling. The high cost of safety violations leads to the rareness of unsafe records in the dataset. We address this issue by enabling prioritized sampling during model training and formulating prior safety constraints with domain knowledge during constrained planning. The algorithm is evaluated in realistic safety-critical environments with non-stationary disturbances. Results show that the proposed algorithm significantly outperforms existing baselines in terms of safety and robustness.

preprint2021arXiv

Deep Probabilistic Accelerated Evaluation: A Robust Certifiable Rare-Event Simulation Methodology for Black-Box Safety-Critical Systems

Evaluating the reliability of intelligent physical systems against rare safety-critical events poses a huge testing burden for real-world applications. Simulation provides a useful platform to evaluate the extremal risks of these systems before their deployments. Importance Sampling (IS), while proven to be powerful for rare-event simulation, faces challenges in handling these learning-based systems due to their black-box nature that fundamentally undermines its efficiency guarantee, which can lead to under-estimation without diagnostically detected. We propose a framework called Deep Probabilistic Accelerated Evaluation (Deep-PrAE) to design statistically guaranteed IS, by converting black-box samplers that are versatile but could lack guarantees, into one with what we call a relaxed efficiency certificate that allows accurate estimation of bounds on the safety-critical event probability. We present the theory of Deep-PrAE that combines the dominating point concept with rare-event set learning via deep neural network classifiers, and demonstrate its effectiveness in numerical examples including the safety-testing of an intelligent driving algorithm.

preprint2021arXiv

Dynamic Sparsity Neural Networks for Automatic Speech Recognition

In automatic speech recognition (ASR), model pruning is a widely adopted technique that reduces model size and latency to deploy neural network models on edge devices with resource constraints. However, multiple models with different sparsity levels usually need to be separately trained and deployed to heterogeneous target hardware with different resource specifications and for applications that have various latency requirements. In this paper, we present Dynamic Sparsity Neural Networks (DSNN) that, once trained, can instantly switch to any predefined sparsity configuration at run-time. We demonstrate the effectiveness and flexibility of DSNN using experiments on internal production datasets with Google Voice Search data, and show that the performance of a DSNN model is on par with that of individually trained single sparsity networks. Our trained DSNN model, therefore, can greatly ease the training process and simplify deployment in diverse scenarios with resource constraints.

preprint2020arXiv

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking. In this paper, we develop a first-pass Recurrent Neural Network Transducer (RNN-T) model and a second-pass Listen, Attend, Spell (LAS) rescorer that surpasses a conventional model in both quality and latency. On the quality side, we incorporate a large number of utterances across varied domains to increase acoustic diversity and the vocabulary seen by the model. We also train with accented English speech to make the model more robust to different pronunciations. In addition, given the increased amount of training data, we explore a varied learning rate schedule. On the latency front, we explore using the end-of-sentence decision emitted by the RNN-T model to close the microphone, and also introduce various optimizations to improve the speed of LAS rescoring. Overall, we find that RNN-T+LAS offers a better WER and latency tradeoff compared to a conventional model. For example, for the same latency, RNN-T+LAS obtains a 8% relative improvement in WER, while being more than 400-times smaller in model size.

preprint2020arXiv

Delay-Aware Multi-Agent Reinforcement Learning for Cooperative and Competitive Environments

Action and observation delays exist prevalently in the real-world cyber-physical systems which may pose challenges in reinforcement learning design. It is particularly an arduous task when handling multi-agent systems where the delay of one agent could spread to other agents. To resolve this problem, this paper proposes a novel framework to deal with delays as well as the non-stationary training issue of multi-agent tasks with model-free deep reinforcement learning. We formally define the Delay-Aware Markov Game that incorporates the delays of all agents in the environment. To solve Delay-Aware Markov Games, we apply centralized training and decentralized execution that allows agents to use extra information to ease the non-stationarity issue of the multi-agent systems during training, without the need of a centralized controller during execution. Experiments are conducted in multi-agent particle environments including cooperative communication, cooperative navigation, and competitive experiments. We also test the proposed algorithm in traffic scenarios that require coordination of all autonomous vehicles to show the practical value of delay-awareness. Results show that the proposed delay-aware multi-agent reinforcement learning algorithm greatly alleviates the performance degradation introduced by delay. Codes and demo videos are available at: https://github.com/baimingc/delay-aware-MARL.

preprint2020arXiv

Learning to Collide: An Adaptive Safety-Critical Scenarios Generating Method

Long-tail and rare event problems become crucial when autonomous driving algorithms are applied in the real world. For the purpose of evaluating systems in challenging settings, we propose a generative framework to create safety-critical scenarios for evaluating specific task algorithms. We first represent the traffic scenarios with a series of autoregressive building blocks and generate diverse scenarios by sampling from the joint distribution of these blocks. We then train the generative model as an agent (or a generator) to investigate the risky distribution parameters for a given driving algorithm being evaluated. We regard the task algorithm as an environment (or a discriminator) that returns a reward to the agent when a risky scenario is generated. Through the experiments conducted on several scenarios in the simulation, we demonstrate that the proposed framework generates safety-critical scenarios more efficiently than grid search or human design methods. Another advantage of this method is its adaptiveness to the routes and parameters.

preprint2020arXiv

MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement Learning in Mixed Dynamic Environments

Multi-agent navigation in dynamic environments is of great industrial value when deploying a large scale fleet of robot to real-world applications. This paper proposes a decentralized partially observable multi-agent path planning with evolutionary reinforcement learning (MAPPER) method to learn an effective local planning policy in mixed dynamic environments. Reinforcement learning-based methods usually suffer performance degradation on long-horizon tasks with goal-conditioned sparse rewards, so we decompose the long-range navigation task into many easier sub-tasks under the guidance of a global planner, which increases agents' performance in large environments. Moreover, most existing multi-agent planning approaches assume either perfect information of the surrounding environment or homogeneity of nearby dynamic agents, which may not hold in practice. Our approach models dynamic obstacles' behavior with an image-based representation and trains a policy in mixed dynamic environments without homogeneity assumption. To ensure multi-agent training stability and performance, we propose an evolutionary training approach that can be easily scaled to large and complex environments. Experiments show that MAPPER is able to achieve higher success rates and more stable performance when exposed to a large number of non-cooperative dynamic obstacles compared with traditional reaction-based planner LRA* and the state-of-the-art learning-based method.

preprint2020arXiv

Robust Unsupervised Learning of Temporal Dynamic Interactions

Robust representation learning of temporal dynamic interactions is an important problem in robotic learning in general and automated unsupervised learning in particular. Temporal dynamic interactions can be described by (multiple) geometric trajectories in a suitable space over which unsupervised learning techniques may be applied to extract useful features from raw and high-dimensional data measurements. Taking a geometric approach to robust representation learning for temporal dynamic interactions, it is necessary to develop suitable metrics and a systematic methodology for comparison and for assessing the stability of an unsupervised learning method with respect to its tuning parameters. Such metrics must account for the (geometric) constraints in the physical world as well as the uncertainty associated with the learned patterns. In this paper we introduce a model-free metric based on the Procrustes distance for robust representation learning of interactions, and an optimal transport based distance metric for comparing between distributions of interaction primitives. These distance metrics can serve as an objective for assessing the stability of an interaction learning algorithm. They are also used for comparing the outcomes produced by different algorithms. Moreover, they may also be adopted as an objective function to obtain clusters and representative interaction primitives. These concepts and techniques will be introduced, along with mathematical properties, while their usefulness will be demonstrated in unsupervised learning of vehicle-to-vechicle interactions extracted from the Safety Pilot database, the world's largest database for connected vehicles.

preprint2019arXiv

A General Framework of Learning Multi-Vehicle Interaction Patterns from Videos

Semantic learning and understanding of multi-vehicle interaction patterns in a cluttered driving environment are essential but challenging for autonomous vehicles to make proper decisions. This paper presents a general framework to gain insights into intricate multi-vehicle interaction patterns from bird's-eye view traffic videos. We adopt a Gaussian velocity field to describe the time-varying multi-vehicle interaction behaviors and then use deep autoencoders to learn associated latent representations for each temporal frame. Then, we utilize a hidden semi-Markov model with a hierarchical Dirichlet process as a prior to segment these sequential representations into granular components, also called traffic primitives, corresponding to interaction patterns. Experimental results demonstrate that our proposed framework can extract traffic primitives from videos, thus providing a semantic way to analyze multi-vehicle interaction patterns, even for cluttered driving scenarios that are far messier than human beings can cope with.

preprint2019arXiv

Clustering of Driving Encounter Scenarios Using Connected Vehicle Trajectories

Multi-vehicle interaction behavior classification and analysis offer in-depth knowledge to make an efficient decision for autonomous vehicles. This paper aims to cluster a wide range of driving encounter scenarios based only on multi-vehicle GPS trajectories. Towards this end, we propose a generic unsupervised learning framework comprising two layers: feature representation layer and clustering layer. In the layer of feature representation, we combine the deep autoencoders with a distance-based measure to map the sequential observations of driving encounters into a computationally tractable space that allows quantifying the spatiotemporal interaction characteristics of two vehicles. The clustering algorithm is then applied to the extracted representations to gather homogeneous driving encounters into groups. Our proposed generic framework is then evaluated using 2,568 naturalistic driving encounters. Experimental results demonstrate that our proposed generic framework incorporated with unsupervised learning can cluster multi-trajectory data into distinct groups. These clustering results could benefit decision-making policy analysis and design for autonomous vehicles.

preprint2016arXiv

Accelerated Evaluation of Automated Vehicles Safety in Lane Change Scenarios Based on Importance Sampling Techniques

Automated vehicles (AVs) must be evaluated thoroughly before their release and deployment. A widely-used evaluation approach is the Naturalistic-Field Operational Test (N-FOT), which tests prototype vehicles directly on the public roads. Due to the low exposure to safety-critical scenarios, N-FOTs are time-consuming and expensive to conduct. In this paper, we propose an accelerated evaluation approach for AVs. The results can be used to generate motions of the primary other vehicles to accelerate the verification of AVs in simulations and controlled experiments. Frontal collision due to unsafe cut-ins is the target crash type of this paper. Human-controlled vehicles making unsafe lane changes are modeled as the primary disturbance to AVs based on data collected by the University of Michigan Safety Pilot Model Deployment Program. The cut-in scenarios are generated based on skewed statistics of collected human driver behaviors, which generate risky testing scenarios while preserving the statistical information so that the safety benefits of AVs in non-accelerated cases can be accurately estimated. The Cross Entropy method is used to recursively search for the optimal skewing parameters. The frequencies of occurrence of conflicts, crashes and injuries are estimated for a modeled automated vehicle, and the achieved accelerated rate is around 2,000 to 20,000. In other words, in the accelerated simulations, driving for 1,000 miles will expose the AV with challenging scenarios that will take about 2 to 20 million miles of real-world driving to encounter. This technique thus has the potential to reduce greatly the development and validation time for AVs.

preprint2016arXiv

Enhancement of Low-cost GNSS Localization in Connected Vehicle Networks Using Rao-Blackwellized Particle Filters

An essential function for automated vehicle technologies is accurate localization. It is difficult, however, to achieve lane-level accuracy with low-cost Global Navigation Satellite System (GNSS) receivers due to the biased noisy pseudo-range measurements. Approaches such as Differential GNSS can improve the accuracy, but usually require an enormous amount of investment in base stations. The emerging connected vehicle technologies provide an alternative approach to improving the localization accuracy. It has been shown in this paper that localization accuracy can be enhanced by fusing GNSS information within a group of connected vehicles and matching the configuration of the group to a digital map to eliminate the common bias in localization. A Rao-Blackwellized particle filter (RBPF) was used to jointly estimate the common biases of the pseudo-ranges and the vehicles positions. Multipath biases, which are non-common to vehicles, were mitigated by a multi-hypothesis detection-rejection approach. The temporal correlation was exploited through the prediction-update process. The proposed approach was compared to the existing static and smoothed static methods in the intersection scenario. Simulation results show that the proposed algorithm reduced the estimation error by fifty percent and reduced the estimation variance by two orders of magnitude.

Ding Zhao

What is connected

Connect this record

See the researcher in context

Building this map preview

26 published item(s)

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

Constrained Variational Policy Optimization for Safe Reinforcement Learning

COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks

CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing

Federated Pruning: Improving Neural Network Efficiency with Federated Learning

GeoECG: Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction

Investigating the Impact of Multi-LiDAR Placement on Object Detection for Autonomous Driving

Joint transmit and reflective beamforming for IRS-assisted integrated sensing and communication

MHMS: Multimodal Hierarchical Multimedia Summarization

PhysioMTL: Personalizing Physiological Patterns using Optimal Transport Multi-Task Regression

Privacy of Autonomous Vehicles: Risks, Protection Methods, and Future Directions

Prompting Decision Transformer for Few-Shot Policy Generalization

Test Against High-Dimensional Uncertainties: Accelerated Evaluation of Autonomous Vehicles with Deep Importance Sampling

Constrained Model-based Reinforcement Learning with Robust Cross-Entropy Method

Context-Aware Safe Reinforcement Learning for Non-Stationary Environments

Deep Probabilistic Accelerated Evaluation: A Robust Certifiable Rare-Event Simulation Methodology for Black-Box Safety-Critical Systems

Dynamic Sparsity Neural Networks for Automatic Speech Recognition

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

Delay-Aware Multi-Agent Reinforcement Learning for Cooperative and Competitive Environments

Learning to Collide: An Adaptive Safety-Critical Scenarios Generating Method

MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement Learning in Mixed Dynamic Environments

Robust Unsupervised Learning of Temporal Dynamic Interactions

A General Framework of Learning Multi-Vehicle Interaction Patterns from Videos

Clustering of Driving Encounter Scenarios Using Connected Vehicle Trajectories

Accelerated Evaluation of Automated Vehicles Safety in Lane Change Scenarios Based on Importance Sampling Techniques

Enhancement of Low-cost GNSS Localization in Connected Vehicle Networks Using Rao-Blackwellized Particle Filters