Source author record

Shaoshuai Mou

Shaoshuai Mou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Systems and Control; eess.SY; Distributed, Parallel, and Cluster Computing; Machine Learning; Robotics; math.OC; cs.CY; Multiagent Systems; Cryptography and Security; Networking and Internet Architecture

Catalog footprint

What is connected

13 works

10 topics

4 close collaborators

Preprint, 2022, arXiv

Learning from Human Directional Corrections

This paper proposes a novel approach that enables a robot to learn an objective function incrementally from human directional corrections. Existing methods learn from human magnitude corrections; since a human needs to carefully choose the magnitude of each correction, those methods can easily lead to over-corrections and learning inefficiency. The proposed method only requires human directional corrections: corrections that indicate the direction of an input change without indicating its magnitude. We only assume that each correction, regardless of its magnitude, points in a direction that improves the robot's current motion relative to an unknown objective function. The allowable corrections satisfying this assumption account for half of the input space, as opposed to magnitude corrections, which have to lie in a shrinking level set. For each directional correction, the proposed method updates the estimate of the objective function based on a cutting plane method, which has a geometric interpretation. We have established theoretical results to show the convergence of the learning process. The proposed method has been tested in numerical examples, a user study on two human-robot games, and a real-world quadrotor experiment. The results confirm the convergence of the proposed method and further show that the method is significantly more effective (higher success rate), efficient/effortless (fewer human corrections needed), and potentially more accessible (fewer early wasted trials) than state-of-the-art robot learning frameworks.
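The cutting-plane idea can be illustrated with a one-dimensional analogue, in which the "plane" is a single point and each directional correction discards half of the remaining candidates. This is only a sketch: the scalar parameter, the simulated human, and the function name `learn_from_directions` are assumptions for illustration, and the abstract does not spell out the paper's actual higher-dimensional update.

```python
def learn_from_directions(true_theta, lower, upper, rounds=40):
    """1-D cutting-plane sketch: each correction halves the candidate interval."""
    for _ in range(rounds):
        # Current estimate: centre of the remaining candidate interval.
        estimate = 0.5 * (lower + upper)
        # Simulated human correction (assumption): a sign only, never a magnitude.
        direction = 1.0 if true_theta > estimate else -1.0
        if direction > 0:
            lower = estimate  # cut away every candidate below the estimate
        else:
            upper = estimate  # cut away every candidate above the estimate
    return 0.5 * (lower + upper)


# Hypothetical unknown parameter 0.3, searched for inside [0, 1].
learned = learn_from_directions(true_theta=0.3, lower=0.0, upper=1.0)
```

Because only the sign of each correction is used, an overly large or small correction cannot mislead the update, which is the property the abstract contrasts with magnitude corrections.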

Preprint, 2022, arXiv

Learning from Sparse Demonstrations

This paper develops the method of Continuous Pontryagin Differentiable Programming (Continuous PDP), which enables a robot to learn an objective function from a few sparsely demonstrated keyframes. The keyframes, labeled with time stamps, are the desired task-space outputs, which a robot is expected to follow sequentially. The time stamps of the keyframes can differ from the times of the robot's actual execution. The method jointly finds an objective function and a time-warping function such that the robot's resulting trajectory sequentially follows the keyframes with minimal discrepancy loss. Continuous PDP minimizes the discrepancy loss using projected gradient descent, by efficiently computing the gradient of the robot trajectory with respect to the unknown parameters. The method is first evaluated on a simulated robot arm and then applied to a 6-DoF quadrotor to learn an objective function for motion planning in unmodeled environments. The results show the efficiency of the method, its ability to handle time misalignment between keyframes and robot execution, and the generalization of objective learning to unseen motion conditions.

Preprint, 2022, arXiv

Learning Objective Functions Incrementally by Inverse Optimal Control

This paper proposes an inverse optimal control method that enables a robot to incrementally learn a control objective function from a collection of trajectory segments. Here, "incrementally" means that the collection of trajectory segments grows as additional segments are provided over time. The unknown objective function is parameterized as a weighted sum of features with unknown weights. Each trajectory segment is a small snippet of an optimal trajectory. The paper shows that each trajectory segment, if informative, imposes a linear constraint on the unknown weights; the objective function can therefore be learned by incrementally incorporating all informative segments. The effectiveness of the method is shown on a simulated 2-link robot arm and a 6-DoF maneuvering quadrotor system, in each of which only small demonstration segments are available.

Preprint, 2021, arXiv

Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework

This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework to solve a broad class of learning and control tasks. PDP is distinguished from existing methods by two novel techniques: first, we differentiate through Pontryagin's Maximum Principle, which allows us to obtain the analytical derivative of a trajectory with respect to tunable parameters within an optimal control system, enabling end-to-end learning of dynamics, policies, and/or control objective functions; and second, we propose an auxiliary control system in the backward pass of the PDP framework, whose output is the analytical derivative of the original system's trajectory with respect to the parameters and which can be iteratively solved using standard control tools. We investigate three learning modes of PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and a 6-DoF rocket powered landing.

Preprint, 2020, arXiv

Distributed traffic control for a large-scale urban network

Motivated by the fact that intelligent traffic control systems have become an inevitable demand to cope with the risk of traffic congestion in urban areas, this paper develops a distributed control strategy for urban traffic networks. Since these networks contain a large number of roads with different directions, each of them can be described as a multi-agent system. Thus, coordination among traffic flows is required to optimize the operation of the overall network. To determine control decisions, we describe the objective of improving traffic conditions as a constrained optimization problem with respect to downstream traffic flows. By applying the gradient projection method and the minimal polynomial of a matrix pair, we propose algorithms that allow each road cell to determine the control decision corresponding to the optimal solution while using only its local information. The effectiveness of the proposed algorithms is validated by numerical simulations.
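As a generic illustration of the gradient projection method named in this abstract, the sketch below minimizes a toy quadratic cost subject to box bounds by alternating a gradient step with a projection back onto the feasible set. The cost, the bounds, the step size, and all names are assumptions chosen for illustration; this is not the paper's traffic model or its distributed algorithm.

```python
def project_onto_box(x, lower, upper):
    """Clip each coordinate into its [lower, upper] interval."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


def gradient_projection(target, lower, upper, step=0.25, iterations=100):
    """Minimize sum((x_i - target_i)^2) subject to lower <= x <= upper."""
    x = list(lower)  # start from a feasible point
    for _ in range(iterations):
        gradient = [2.0 * (xi - ti) for xi, ti in zip(x, target)]
        stepped = [xi - step * gi for xi, gi in zip(x, gradient)]
        x = project_onto_box(stepped, lower, upper)  # restore feasibility
    return x


# Hypothetical desired flows; two of them fall outside the allowed range [0, 1].
flows = gradient_projection(
    target=[0.5, 1.7, -0.3],
    lower=[0.0, 0.0, 0.0],
    upper=[1.0, 1.0, 1.0],
)
```

In this toy problem the constrained minimizer is simply the target clipped to the box, so the infeasible coordinates settle on the nearest bound.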

Preprint, 2020, arXiv

Grand Challenges in Resilience: Autonomous System Resilience through Design and Runtime Measures

A set of about 80 researchers, practitioners, and federal agency program managers participated in the NSF-sponsored Grand Challenges in Resilience Workshop held on the Purdue campus on March 19-21, 2019. The workshop was divided into three themes: resilience in cyber, cyber-physical, and socio-technical systems. About 30 attendees in all participated in the discussions of cyber resilience. This article brings out the substantive parts of the challenges and solution approaches that were identified in the cyber resilience theme. In this article, we put forward the substantial challenges in cyber resilience in a few representative application domains and outline foundational solutions to address these challenges. These solutions fall into two broad themes: resilience-by-design and resilience-by-reaction. We use examples of autonomous systems as the application drivers motivating cyber resilience. We focus on some autonomous systems in the near horizon (autonomous ground and aerial vehicles) and also a little more distant (autonomous rescue and relief). For resilience-by-design, we focus on design methods in software that are needed for our cyber systems to be resilient. In contrast, for resilience-by-reaction, we discuss how to make systems resilient by responding, reconfiguring, or recovering at runtime when failures happen. We also discuss the notion of adaptive execution to improve resilience, that is, executing transparently and adaptively across available execution platforms (mobile/embedded, edge, and cloud). For each of the two themes, we survey the current state, the desired state, and ways to get there. We conclude the paper by looking at the research challenges we will have to solve in the short and mid term to make the vision of resilient autonomous systems a reality.

Preprint, 2020, arXiv

Neural Certificates for Safe Control Policies

This paper develops an approach to learn a policy for a dynamical system that is provably both safe and goal-reaching. Here, safety means that a policy must not drive the state of the system into any unsafe region, while goal-reaching requires that the trajectory of the controlled system asymptotically converge to a goal region (a generalization of stability). We obtain the safe and goal-reaching policy by jointly learning two additional certificate functions: a barrier function that guarantees safety and a Lyapunov-like function that fulfills the goal-reaching requirement, both of which are represented by neural networks. We show the effectiveness of the method in learning safe and goal-reaching policies on various systems, including pendulums, cart-poles, and UAVs.

Preprint, 2019, arXiv

Resilient Cyberphysical Systems and their Application Drivers: A Technology Roadmap

Cyberphysical systems (CPS) are ubiquitous in our personal and professional lives, and they promise to dramatically improve micro-communities (e.g., urban farms, hospitals), macro-communities (e.g., cities and metropolises), urban structures (e.g., smart homes and cars), and living structures (e.g., human bodies, synthetic genomes). The question that we address in this article pertains to designing these CPS to be resilient-from-the-ground-up, and through progressive learning, resilient-by-reaction. An optimally designed system is resilient to both unique attacks and recurrent attacks, the latter with a lower overhead. Overall, the notion of resilience can be thought of in the light of three main sources of lack of resilience, as follows: exogenous factors, such as natural variations and attack scenarios; mismatch between engineered designs and exogenous factors ranging from DDoS (distributed denial-of-service) attacks or other cybersecurity nightmares, so-called "black swan" events, disabling critical services of the municipal electrical grids and other connected infrastructures, data breaches, and network failures; and the fragility of engineered designs themselves, encompassing bugs, human-computer interactions (HCI), and the overall complexity of real-world systems. In the paper, our focus is on design and deployment innovations that are broadly applicable across a range of CPS application areas.

Preprint, 2016, arXiv

Impacts of Network Topology on the Performance of a Distributed Algorithm Solving Linear Equations

Recently, a distributed algorithm has been proposed for multi-agent networks to solve a system of linear algebraic equations, assuming that each agent knows only part of the system and can communicate with its nearest neighbors to update its local solution. This paper investigates how the network topology impacts the exponential convergence of that algorithm. It is found that networks with higher mean degree, smaller diameter, and homogeneous degree distribution tend to achieve faster convergence. Both analytical and numerical results are provided.

Preprint, 2016, arXiv

Request-Based Gossiping without Deadlocks

The distributed averaging problem is the problem of computing the average value of a set of numbers possessed by the agents in a distributed network using only communication between neighboring agents. Gossiping is a well-known approach to the problem which seeks to iteratively arrive at a solution by allowing each agent to interchange information with at most one neighbor at each iterative step. Crafting a gossiping protocol which accomplishes this is challenging because gossiping is an inherently collaborative process which can lead to deadlocks unless careful precautions are taken to ensure that it does not. Many gossiping protocols are request-based, which means simply that a gossip between two agents will occur whenever one of the two agents accepts a request to gossip placed by the other. In this paper, we present three deterministic request-based protocols. We show by example that the first can deadlock. The second is guaranteed to avoid deadlocks and requires fewer transmissions per iteration than standard broadcast-based distributed averaging protocols by exploiting the idea of local ordering together with the notion of an agent's neighbor queue; the protocol requires the simplest queue updates, which provides an in-depth understanding of how local ordering and queue updates avoid deadlocks. It is shown that a third protocol, which uses a slightly more complicated queue update rule, can lead to significantly faster convergence; a worst-case bound on the convergence rate is provided.
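The basic gossip step described in this abstract can be sketched as follows: two neighboring agents exchange values and both adopt the pair's mean, which preserves the network-wide sum. The sketch uses a hand-written, fixed schedule of pairs on a four-agent path graph, which is an assumption for illustration; it does not implement the request, local-ordering, or neighbor-queue logic that the paper's protocols use to avoid deadlocks.

```python
def gossip(values, schedule, rounds=60):
    """Repeat a fixed sequence of pairwise gossips; each pair adopts its mean."""
    x = list(values)
    for _ in range(rounds):
        for i, j in schedule:
            mean = 0.5 * (x[i] + x[j])
            x[i] = x[j] = mean  # the pair's sum, and so the global sum, is unchanged
    return x


# Hypothetical four-agent path graph 0 - 1 - 2 - 3 with initial readings.
readings = [1.0, 2.0, 3.0, 6.0]
schedule = [(0, 1), (2, 3), (1, 2)]  # every edge is used once per round
result = gossip(readings, schedule)
```

Since each gossip leaves the sum unchanged and the schedule covers a connected graph, every agent's value approaches the average of the initial readings, here 3.0.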

Preprint, 2015, arXiv

A Distributed Algorithm for Solving a Linear Algebraic Equation

A distributed algorithm is described for solving a linear algebraic equation of the form $Ax=b$ assuming the equation has at least one solution. The equation is simultaneously solved by $m$ agents assuming each agent knows only a subset of the rows of the partitioned matrix $(A,b)$, the current estimates of the equation's solution generated by its neighbors, and nothing more. Each agent recursively updates its estimate by utilizing the current estimates generated by each of its neighbors. Neighbor relations are characterized by a time-dependent directed graph $\mathbb{N}(t)$ whose vertices correspond to agents and whose arcs depict neighbor relations. It is shown that for any matrix $A$ for which the equation has a solution and any sequence of "repeatedly jointly strongly connected graphs" $\mathbb{N}(t)$, $t=1,2,\ldots$, the algorithm causes all agents' estimates to converge exponentially fast to the same solution to $Ax=b$. It is also shown that the neighbor graph sequence must actually be repeatedly jointly strongly connected if exponential convergence is to be assured. A worst case convergence rate bound is derived for the case when $Ax=b$ has a unique solution. It is demonstrated that with minor modification, the algorithm can track the solution to $Ax = b$, even if $A$ and $b$ are changing with time, provided the rates of change of $A$ and $b$ are sufficiently small. It is also shown that in the absence of communication delays, exponential convergence to a solution occurs even if the times at which each agent updates its estimates are not synchronized with the update times of its neighbors. A modification of the algorithm is outlined which enables it to obtain a least squares solution to $Ax=b$ in a distributed manner, even if $Ax=b$ does not have a solution.
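The abstract does not state the update rule itself. The sketch below assumes a projection-based consensus update of the form $x_i(t+1) = x_i(t) - P_i\,(x_i(t) - \bar{x}_i(t))$, where $\bar{x}_i(t)$ is the average of the estimates of agent $i$ and its neighbors and $P_i$ is the orthogonal projection onto the kernel of agent $i$'s rows of $A$. For simplicity, each agent is assumed to hold exactly one row and the neighbor graph is fixed; the matrix, the graph, and all function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project_onto_kernel(row, v):
    """Orthogonal projection of v onto {z : row . z = 0}."""
    scale = dot(row, v) / dot(row, row)
    return [vi - scale * ri for vi, ri in zip(v, row)]


def solve_distributed(A, b, neighbors, steps=2000):
    """Agent i knows only row A[i] and b[i]; neighbors[i] includes i itself."""
    m, n = len(A), len(A[0])
    # Each agent starts from a point that satisfies its own equation.
    x = [[b[i] * a / dot(A[i], A[i]) for a in A[i]] for i in range(m)]
    for _ in range(steps):
        updated = []
        for i in range(m):
            group = neighbors[i]
            average = [sum(x[j][k] for j in group) / len(group) for k in range(n)]
            disagreement = [xi - ai for xi, ai in zip(x[i], average)]
            # Moving inside the kernel keeps A[i] . x_i = b[i] satisfied.
            correction = project_onto_kernel(A[i], disagreement)
            updated.append([xi - ci for xi, ci in zip(x[i], correction)])
        x = updated
    return x


# Hypothetical 3x3 system whose unique solution is (1, 1, 1).
A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
b = [3.0, 5.0, 5.0]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}  # path graph with self-loops
estimates = solve_distributed(A, b, neighbors)
```

Each agent only ever reads its own row and its neighbors' current estimates, which matches the information constraint described in the abstract.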

Preprint, 2015, arXiv

Decentralized gradient algorithm for solution of a linear equation

The paper develops a technique for solving a linear equation $Ax=b$ with a square and nonsingular matrix $A$, using a decentralized gradient algorithm. In the language of control theory, there are $n$ agents, each storing at time $t$ an $n$-vector, call it $x_i(t)$, and a graphical structure associating with each agent a vertex of a fixed, undirected and connected but otherwise arbitrary graph $\mathcal G$ with vertex set and edge set $\mathcal V$ and $\mathcal E$ respectively. We provide differential equation update laws for the $x_i$ with the property that each $x_i$ converges to the solution of the linear equation exponentially fast. The equation for $x_i$ includes additive terms weighting those $x_j$ for which vertices in $\mathcal G$ corresponding to the $i$-th and $j$-th agents are adjacent. The results are extended to the case where $A$ is not square but has full row rank, and bounds are given on the convergence rate.

Preprint, 2015, arXiv

Undirected Rigid Formations are Problematic

By an undirected rigid formation of mobile autonomous agents is meant a formation based on graph rigidity in which each pair of "neighboring" agents is responsible for maintaining a prescribed target distance between them. In a recent paper, a systematic method was proposed for devising gradient control laws for asymptotically stabilizing a large class of rigid, undirected formations in two-dimensional space, assuming all agents are described by kinematic point models. The aim of this paper is to explain what happens to such formations if neighboring agents have slightly different understandings of what the desired distance between them is supposed to be or, equivalently, if neighboring agents have differing estimates of what the actual distance between them is. In either case, one would expect a gradual distortion of the formation from its target shape as discrepancies in desired or sensed distances increase. While this is observed for the gradient laws in question, something else quite unexpected happens at the same time. It is shown that for any rigidity-based, undirected formation of this type comprising three or more agents, if some neighboring agents have slightly different understandings of what the desired distances between them are supposed to be, then almost certainly the trajectory of the resulting distorted but rigid formation will converge exponentially fast to a closed circular orbit in two-dimensional space which is traversed periodically at a constant angular speed.
