Researcher profile

Bowen Weng

Bowen Weng contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
9 works
0 followers
6 topics
4 close collaborators

Published work

9 published items

preprint | 2022 | arXiv

A Finite-Sampling, Operational Domain Specific, and Provably Unbiased Connected and Automated Vehicle Safety Metric

A connected and automated vehicle safety metric determines the performance of a subject vehicle (SV) by analyzing the data involving the interactions among the SV and other dynamic road users and environmental features. When the data set contains only a finite set of samples collected from the naturalistic mixed-traffic driving environment, a metric is expected to generalize the safety assessment outcome from the observed finite samples to the unobserved cases by specifying in what domain the SV is expected to be safe and how safe the SV is, statistically, in that domain. However, to the best of our knowledge, none of the existing safety metrics are able to justify the above properties with an operational domain specific, guaranteed complete, and provably unbiased safety evaluation outcome. In this paper, we propose a novel safety metric that involves the $α$-shape and the $ε$-almost robustly forward invariant set to characterize the SV's almost safe operable domain and the probability for the SV to remain inside the safe domain indefinitely, respectively. The empirical performance of the proposed method is demonstrated in several different operational design domains through a series of cases covering a variety of fidelity levels (real-world and simulators), driving environments (highway, urban, and intersections), road users (car, truck, and pedestrian), and SV driving behaviors (human driver and self driving algorithms).

preprint | 2022 | arXiv

A Formal Safety Characterization of Advanced Driver Assist Systems in the Car-Following Regime with Scenario-Sampling

The capability to follow a lead-vehicle and avoid rear-end collisions is one of the most important functionalities for human drivers and various Advanced Driver Assist Systems (ADAS). Existing safety performance justification of the car-following systems either relies on simple concrete scenarios with biased surrogate metrics or requires a significantly long driving distance for risk observation and inference. In this paper, we propose a guaranteed unbiased and sampling efficient scenario-based safety evaluation framework inspired by the previous work on $εδ$-almost safe set quantification. The proposal characterizes the complete safety performance of the test subject in the car-following regime. The performance of the proposed method is also demonstrated in challenging cases including some widely adopted car-following decision-making modules and the commercially available Openpilot driving stack by CommaAI.

preprint | 2022 | arXiv

On Safety Testing, Validation, and Characterization with Scenario-Sampling: A Case Study of Legged Robots

The dynamic response of legged robot locomotion is non-Lipschitz and can be stochastic due to environmental uncertainties. To test, validate, and characterize the safety performance of legged robots, existing solutions based on observed and inferred risk can be incomplete and sampling inefficient. Some formal verification methods suffer from limited model precision and other surrogate assumptions. In this paper, we propose a scenario-sampling-based testing framework that characterizes the overall safety performance of a legged robot by specifying (i) where (in terms of a set of states) the robot is potentially safe, and (ii) how safe the robot is within the specified set. The framework can also help certify the commercial deployment of legged robots in real-world environments alongside humans and compare safety performance among legged robots with different mechanical structures and dynamic properties. The proposed framework is further deployed to evaluate a group of state-of-the-art legged robot locomotion controllers from various model-based, deep-neural-network-based, and reinforcement-learning-based methods in the literature. Across a series of intended work domains of the studied legged robots (e.g., tracking speed on a sloped surface, with abrupt changes in demanded velocity, and against adversarial push-over disturbances), we show that the method can adequately capture the overall safety characterization and the subtle performance insights. Many of the observed safety outcomes, to the best of our knowledge, have never been reported by existing work in the legged robot literature.

preprint | 2020 | arXiv

Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent

Existing convergence analyses of Q-learning mostly focus on vanilla stochastic gradient descent (SGD) type updates. Although Adaptive Moment Estimation (Adam) is commonly used in practical Q-learning algorithms, no convergence guarantee has been provided for Q-learning with this type of update. In this paper, we first characterize the convergence rate for Q-AMSGrad, which is the Q-learning algorithm with the AMSGrad update (a commonly adopted alternative to Adam for theoretical analysis). To further improve the performance, we propose to incorporate the momentum restart scheme into Q-AMSGrad, resulting in the so-called Q-AMSGradR algorithm. The convergence rate of Q-AMSGradR is also established. Our experiments on a linear quadratic regulator problem show that the two proposed Q-learning algorithms outperform vanilla Q-learning with SGD updates. The two algorithms also exhibit significantly better performance than the DQN learning method over a batch of Atari 2600 games.

preprint | 2020 | arXiv

History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms

Variance-reduced algorithms, although they achieve great theoretical performance, can run slowly in practice due to the periodic gradient estimation with a large batch of data. Batch-size adaptation thus arises as a promising approach to accelerate such algorithms. However, existing schemes either apply a prescribed batch-size adaptation rule or exploit the information along the optimization path via additional backtracking and condition verification steps. In this paper, we propose a novel scheme, which eliminates backtracking line search but still exploits the information along the optimization path by adapting the batch size via history stochastic gradients. We further theoretically show that such a scheme substantially reduces the overall complexity for the popular variance-reduced algorithms SVRG and SARAH/SPIDER for both conventional nonconvex optimization and reinforcement learning problems. To this end, we develop a new convergence analysis framework to handle the dependence of the batch size on history stochastic gradients. Extensive experiments validate the effectiveness of the proposed batch-size adaptation scheme.

preprint | 2020 | arXiv

Model Predictive Instantaneous Safety Metric for Evaluation of Automated Driving Systems

Vehicles with Automated Driving Systems (ADS) operate in a high-dimensional continuous system with multi-agent interactions. This continuous system features various types of traffic agents (non-homogeneous) governed by continuous-motion ordinary differential equations (differential-drive). Each agent makes decisions independently that may lead to conflicts with the subject vehicle (SV), as well as other participants (non-cooperative). A typical vehicle safety evaluation procedure that uses various safety-critical scenarios and observes resultant collisions (or near collisions) is not sufficient to evaluate the performance of the ADS in terms of operational safety status maintenance. In this paper, we introduce a Model Predictive Instantaneous Safety Metric (MPrISM), which determines the safety status of the SV, considering the worst-case safety scenario for a given traffic snapshot. The method then analyzes the SV's closeness to a potential collision within a certain evaluation time period. The described metric induces theoretical guarantees of safety in terms of the time to collision under standard assumptions. By formulating the solution as a series of minimax quadratic optimization problems of a specific structure, the method is tractable for real-time safety evaluation applications. Its capabilities are demonstrated with synthesized examples and cases derived from real-world tests.

preprint | 2020 | arXiv

Momentum Q-learning with Finite-Sample Convergence Guarantee

Existing studies indicate that momentum ideas in conventional optimization can be used to improve the performance of Q-learning algorithms. However, the finite-sample analysis for momentum-based Q-learning algorithms is only available for the tabular case without function approximations. This paper analyzes a class of momentum-based Q-learning algorithms with finite-sample guarantees. Specifically, we propose the MomentumQ algorithm, which integrates Nesterov's and Polyak's momentum schemes and generalizes the existing momentum-based Q-learning algorithms. For the infinite state-action space case, we establish the convergence guarantee for MomentumQ with linear function approximations and Markovian sampling. In particular, we characterize the finite-sample convergence rate, which is provably faster than that of vanilla Q-learning. This is the first finite-sample analysis for momentum-based Q-learning algorithms with function approximations. For the tabular case under synchronous sampling, we also obtain a finite-sample convergence rate that is slightly better than that of SpeedyQ (Azar et al., 2011) when choosing a special family of step sizes. Finally, we demonstrate through various experiments that the proposed MomentumQ outperforms other momentum-based Q-learning algorithms.

preprint | 2020 | arXiv

Reciprocal Collision Avoidance for General Nonlinear Agents using Reinforcement Learning

Finding feasible and collision-free paths for multiple nonlinear agents is challenging in decentralized scenarios due to limited available information about other agents and complex dynamics constraints. In this paper, we propose a fast multi-agent collision avoidance algorithm for general nonlinear agents with continuous action space, where each agent observes only the positions and velocities of nearby agents. To reduce online computation, we first decompose the multi-agent scenario and solve a two-agent collision avoidance problem using reinforcement learning (RL). When extending the trained policy to a multi-agent problem, safety is ensured by introducing the optimal reciprocal collision avoidance (ORCA) as linear constraints, and the overall collision avoidance action can be found through simple convex optimization. Most existing RL-based multi-agent collision avoidance algorithms rely on the direct control of agent velocities. In sharp contrast, our approach is applicable to general nonlinear agents. Realistic simulations based on nonlinear bicycle agent models are performed with various challenging scenarios, indicating a competitive performance of the proposed method in avoiding collisions, congestion, and deadlock with smooth trajectories.

preprint | 2020 | arXiv

Velocity Regulation of 3D Bipedal Walking Robots with Uncertain Dynamics Through Adaptive Neural Network Controller

This paper presents a neural-network based adaptive feedback control structure to regulate the velocity of 3D bipedal robots under dynamics uncertainties. Existing Hybrid Zero Dynamics (HZD)-based controllers regulate velocity through the implementation of heuristic regulators that do not consider model and environmental uncertainties, which may significantly affect the tracking performance of the controllers. In this paper, we address the uncertainties in the robot dynamics from the perspective of the reduced dimensional representation of virtual constraints and propose the integration of an adaptive neural network-based controller to regulate the robot velocity in the presence of model parameter uncertainties. The proposed approach yields improved tracking performance under dynamics uncertainties. The shallow adaptive neural network used in this paper does not require training a priori and has the potential to be implemented on the real-time robotic controller. A comparative simulation study of a 3D Cassie robot is presented to illustrate the performance of the proposed approach under various scenarios.