Source author record

Joel W. Burdick

Joel W. Burdick appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Robotics, math.OC, Systems and Control, eess.SY, Machine Learning, Artificial Intelligence, Computer Vision, Human-Computer Interaction, eess.IV, Formal Languages and Automata Theory, Multiagent Systems

Catalog footprint

What is connected

20 works

11 topics

4 close collaborators

preprint, 2022, arXiv

Distributionally Robust Model Predictive Control with Total Variation Distance

This paper studies the problem of distributionally robust model predictive control (MPC) using total variation distance ambiguity sets. For a discrete-time linear system with additive disturbances, we provide a conditional value-at-risk reformulation of the MPC optimization problem that is distributionally robust in the expected cost and chance constraints. The distributionally robust chance constraint is over-approximated as a simpler, tightened chance constraint that reduces the computational burden. Numerical experiments support our results on probabilistic guarantees and computational efficiency.

preprint, 2022, arXiv

Moving Obstacle Avoidance: a Data-Driven Risk-Aware Approach

This paper proposes a new structured method for a moving agent to predict the paths of dynamically moving obstacles and avoid them using a risk-aware model predictive control (MPC) scheme. Given noisy measurements of the a priori unknown obstacle trajectory, a bootstrapping technique predicts a set of obstacle trajectories. The bootstrapped predictions are incorporated in the MPC optimization using a risk-aware methodology so as to provide probabilistic guarantees on obstacle avoidance. We validate our methods using simulations of a 3-dimensional multi-rotor drone that avoids various moving obstacles, such as a thrown ball and a frisbee with air drag.

preprint, 2022, arXiv

Sample-Based Bounds for Coherent Risk Measures: Applications to Policy Synthesis and Verification

The dramatic increase of autonomous systems subject to variable environments has given rise to the pressing need to consider risk in both the synthesis and verification of policies for these systems. This paper aims to address a few problems regarding risk-aware verification and policy synthesis, by first developing a sample-based method to bound the risk measure evaluation of a random variable whose distribution is unknown. These bounds permit us to generate high-confidence verification statements for a large class of robotic systems. Second, we develop a sample-based method to determine solutions to non-convex optimization problems that outperform a large fraction of the decision space of possible solutions. Both sample-based approaches then permit us to rapidly synthesize risk-aware policies that are guaranteed to achieve a minimum level of system performance. To showcase our approach in simulation, we verify a cooperative multi-agent system and develop a risk-aware controller that outperforms the system's baseline controller. We also mention how our approach can be extended to account for any $g$-entropic risk measure - the subset of coherent risk measures on which we focus.

preprint, 2021, arXiv

Learning Invariant Representation of Tasks for Robust Surgical State Estimation

Surgical state estimators in robot-assisted surgery (RAS) - especially those trained via learning techniques - rely heavily on datasets that capture surgeon actions in laboratory or real-world surgical tasks. Real-world RAS datasets are costly to acquire, are obtained from multiple surgeons who may use different surgical strategies, and are recorded under uncontrolled conditions in highly complex environments. The combination of high diversity and limited data calls for new learning methods that are robust and invariant to operating conditions and surgical techniques. We propose StiseNet, a Surgical Task Invariance State Estimation Network with an invariance induction framework that minimizes the effects of variations in surgical technique and operating environments inherent to RAS datasets. StiseNet's adversarial architecture learns to separate nuisance factors from information needed for surgical state estimation. StiseNet is shown to outperform state-of-the-art state estimation methods on three datasets (including a new real-world RAS dataset: HERNIA-20).

preprint, 2021, arXiv

ROIAL: Region of Interest Active Learning for Characterizing Exoskeleton Gait Preference Landscapes

Characterizing what types of exoskeleton gaits are comfortable for users, and understanding the science of walking more generally, require recovering a user's utility landscape. Learning these landscapes is challenging, as walking trajectories are defined by numerous gait parameters, data collection from human trials is expensive, and user safety and comfort must be ensured. This work proposes the Region of Interest Active Learning (ROIAL) framework, which actively learns each user's underlying utility function over a region of interest that ensures safety and comfort. ROIAL learns from ordinal and preference feedback, which are more reliable feedback mechanisms than absolute numerical scores. The algorithm's performance is evaluated both in simulation and experimentally for three non-disabled subjects walking inside of a lower-body exoskeleton. ROIAL learns Bayesian posteriors that predict each exoskeleton user's utility landscape across four exoskeleton gait parameters. The algorithm discovers both commonalities and discrepancies across users' gait preferences and identifies the gait parameters that most influenced user feedback. These results demonstrate the feasibility of recovering gait utility landscapes from limited human trials.

preprint, 2020, arXiv

Barrier Functions for Multiagent-POMDPs with DTL Specifications

Multi-agent partially observable Markov decision processes (MPOMDPs) provide a framework to represent heterogeneous autonomous agents subject to uncertainty and partial observation. In this paper, given a nominal policy provided by a human operator or a conventional planning method, we propose a technique based on barrier functions to design a minimally interfering safety-shield ensuring satisfaction of high-level specifications in terms of linear distribution temporal logic (LDTL). To this end, we use sufficient and necessary conditions for the invariance of a given set based on discrete-time barrier functions (DTBFs) and formulate sufficient conditions for finite time DTBF to study finite time convergence to a set. We then show that different LDTL mission/safety specifications can be cast as a set of invariance or finite time reachability problems. We demonstrate that the proposed method for safety-shield synthesis can be implemented online by a sequence of one-step greedy algorithms. We demonstrate the efficacy of the proposed method using experiments involving a team of robots.
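The idea of a minimally interfering safety shield built from a one-step barrier check can be sketched as follows. This is a simplified illustration on a scalar state, whereas the paper works with MPOMDPs; it uses one commonly used form of the discrete-time barrier condition, h(x_next) - h(x) >= -alpha * h(x) with 0 < alpha <= 1. The names `dtbf_ok`, `shield`, `step`, and the toy dynamics are all assumptions for this example.

```python
def dtbf_ok(h, x, x_next, alpha):
    """One-step discrete-time barrier condition (assumed form):
    h(x_next) - h(x) >= -alpha * h(x), with 0 < alpha <= 1."""
    return h(x_next) - h(x) >= -alpha * h(x)


def shield(x, u_nominal, actions, step, h, alpha=0.5):
    """Greedy one-step shield (hypothetical helper): among the candidate
    actions that satisfy the barrier condition, return the one closest
    to the nominal action. Returns None if no candidate is safe."""
    safe = [u for u in actions if dtbf_ok(h, x, step(x, u), alpha)]
    if not safe:
        return None
    return min(safe, key=lambda u: abs(u - u_nominal))


if __name__ == "__main__":
    # Toy setup (assumed): x_next = x + u, safe set {x : 1 - x >= 0}.
    step = lambda x, u: x + u
    h = lambda x: 1.0 - x
    actions = [-1.0, 0.0, 0.25, 1.0]
    print(shield(0.5, 1.0, actions, step, h))
```

In this toy setup the nominal action 1.0 would leave the safe set, so the shield falls back to the nearest admissible candidate; when the nominal action already satisfies the condition, it is returned unchanged.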

preprint, 2020, arXiv

Dueling Posterior Sampling for Preference-Based Reinforcement Learning

In preference-based reinforcement learning (RL), an agent interacts with the environment while receiving preferences instead of absolute feedback. While there is increasing research activity in preference-based RL, the design of formal frameworks that admit tractable theoretical analysis remains an open challenge. Building upon ideas from preference-based bandit learning and posterior sampling in RL, we present DUELING POSTERIOR SAMPLING (DPS), which employs preference-based posterior sampling to learn both the system dynamics and the underlying utility function that governs the preference feedback. As preference feedback is provided on trajectories rather than individual state-action pairs, we develop a Bayesian approach for the credit assignment problem, translating preferences to a posterior distribution over state-action reward models. We prove an asymptotic Bayesian no-regret rate for DPS with a Bayesian linear regression credit assignment model. This is the first regret guarantee for preference-based RL to our knowledge. We also discuss possible avenues for extending the proof methodology to other credit assignment models. Finally, we evaluate the approach empirically, showing competitive performance against existing baselines.

preprint, 2020, arXiv

Energy-Efficient Motion Planning for Multi-Modal Hybrid Locomotion

Hybrid locomotion, which combines multiple modalities of locomotion within a single robot, enables robots to carry out complex tasks in diverse environments. This paper presents a novel method for planning multi-modal locomotion trajectories using approximate dynamic programming. We formulate this problem as a shortest-path search through a state-space graph, where the edge cost is assigned as optimal transport cost along each segment. This cost is approximated from batches of offline trajectory optimizations, which allows the complex effects of vehicle under-actuation and dynamic constraints to be approximately captured in a tractable way. Our method is illustrated on a hybrid double-integrator, an amphibious robot, and a flying-driving drone, showing the practicality of the approach.

preprint, 2020, arXiv

Episodic Koopman Learning of Nonlinear Robot Dynamics with Application to Fast Multirotor Landing

This paper presents a novel episodic method to learn a robot's nonlinear dynamics model and an increasingly optimal control sequence for a set of tasks. The method is based on the Koopman operator approach to nonlinear dynamical systems analysis, which models the flow of observables in a function space, rather than a flow in a state space. Practically, this method estimates a nonlinear diffeomorphism that lifts the dynamics to a higher dimensional space where they are linear. Efficient Model Predictive Control methods can then be applied to the lifted model. This approach allows for real time implementation in on-board hardware, with rigorous incorporation of both input and state constraints during learning. We demonstrate the method in a real-time implementation of fast multirotor landing, where the nonlinear ground effect is learned and used to improve landing speed and quality.
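To make the notion of lifting nonlinear dynamics to a space where they are linear concrete, here is a small sketch on a textbook-style discrete-time system. The lifting is chosen by hand here, whereas the paper learns it from data; the system, the constants `MU`, `LAM`, `C`, and all function names are assumptions for this illustration.

```python
# Assumed toy system (not from the paper):
#   x1' = MU * x1
#   x2' = LAM * x2 + C * x1**2
MU, LAM, C = 0.9, 0.5, 1.0


def step_nonlinear(x):
    """One step of the nonlinear dynamics in the original state space."""
    x1, x2 = x
    return (MU * x1, LAM * x2 + C * x1 * x1)


def lift(x):
    """Hand-picked observables z = (x1, x2, x1^2)."""
    x1, x2 = x
    return (x1, x2, x1 * x1)


# In the lifted coordinates the dynamics are exactly linear: z' = K z.
K = (
    (MU, 0.0, 0.0),
    (0.0, LAM, C),
    (0.0, 0.0, MU * MU),
)


def step_lifted(z):
    """One step of the linear lifted model."""
    return tuple(sum(k * zi for k, zi in zip(row, z)) for row in K)


def max_lift_error(x0, steps):
    """Largest mismatch between lift(nonlinear rollout) and the linear rollout."""
    x, z, worst = x0, lift(x0), 0.0
    for _ in range(steps):
        x = step_nonlinear(x)
        z = step_lifted(z)
        worst = max(worst, max(abs(a - b) for a, b in zip(lift(x), z)))
    return worst


if __name__ == "__main__":
    print(max_lift_error((1.0, -2.0), 20))
```

Because the lifted model is linear, standard linear MPC machinery can in principle be applied to `z`; for this particular system the lifting is exact, which is not the case in general.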

preprint, 2020, arXiv

Human Preference-Based Learning for High-dimensional Optimization of Exoskeleton Walking Gaits

Optimizing lower-body exoskeleton walking gaits for user comfort requires understanding users' preferences over a high-dimensional gait parameter space. However, existing preference-based learning methods have only explored low-dimensional domains due to computational limitations. To learn user preferences in high dimensions, this work presents LineCoSpar, a human-in-the-loop preference-based framework that enables optimization over many parameters by iteratively exploring one-dimensional subspaces. Additionally, this work identifies gait attributes that characterize broader preferences across users. In simulations and human trials, we empirically verify that LineCoSpar is a sample-efficient approach for high-dimensional preference optimization. Our analysis of the experimental data reveals a correspondence between human preferences and objective measures of dynamicity, while also highlighting differences in the utility functions underlying individual users' gait preferences. This result has implications for exoskeleton gait synthesis, an active field with applications to clinical use and patient rehabilitation.

preprint, 2020, arXiv

Stagewise Safe Bayesian Optimization with Gaussian Processes

Enforcing safety is a key aspect of many problems pertaining to sequential decision making under uncertainty, which require the decisions made at every step to be both informative of the optimal decision and also safe. For example, we value both efficacy and comfort in medical therapy, and efficiency and safety in robotic control. We consider this problem of optimizing an unknown utility function with absolute feedback or preference feedback subject to unknown safety constraints. We develop an efficient safe Bayesian optimization algorithm, StageOpt, that separates safe region expansion and utility function maximization into two distinct stages. Compared to existing approaches which interleave between expansion and optimization, we show that StageOpt is more efficient and naturally applicable to a broader class of problems. We provide theoretical guarantees for both the satisfaction of safety constraints as well as convergence to the optimal utility value. We evaluate StageOpt on a variety of synthetic experiments as well as in clinical practice. We demonstrate that StageOpt is more effective than existing safe optimization approaches, and is able to safely and effectively optimize spinal cord stimulation therapy in our clinical experiments.

preprint, 2020, arXiv

Stochastic Finite State Control of POMDPs with LTL Specifications

Partially observable Markov decision processes (POMDPs) provide a modeling framework for autonomous decision making under uncertainty and imperfect sensing, e.g. robot manipulation and self-driving cars. However, optimal control of POMDPs is notoriously intractable. This paper considers the quantitative problem of synthesizing sub-optimal stochastic finite state controllers (sFSCs) for POMDPs such that the probability of satisfying a set of high-level specifications in terms of linear temporal logic (LTL) formulae is maximized. We begin by casting the latter problem into an optimization and use relaxations based on the Poisson equation and McCormick envelopes. Then, we propose a stochastic bounded policy iteration algorithm, leading to a controlled growth in sFSC size and an anytime algorithm, where the performance of the controller improves with successive iterations, but can be stopped by the user based on time or memory considerations. We illustrate the proposed method by a robot navigation case study.

preprint, 2020, arXiv

Temporal Segmentation of Surgical Sub-tasks through Deep Learning with Multiple Data Sources

Many tasks in robot-assisted surgeries (RAS) can be represented by finite-state machines (FSMs), where each state represents either an action (such as picking up a needle) or an observation (such as bleeding). A crucial step towards the automation of such surgical tasks is the temporal perception of the current surgical scene, which requires a real-time estimation of the states in the FSMs. The objective of this work is to estimate the current state of the surgical task based on the actions performed or events that have occurred as the task progresses. We propose Fusion-KVE, a unified surgical state estimation model that incorporates multiple data sources including the Kinematics, Vision, and system Events. Additionally, we examine the strengths and weaknesses of different state estimation models in segmenting states with different representative features or levels of granularity. We evaluate our model on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), as well as a more complex dataset involving robotic intra-operative ultrasound (RIOUS) imaging, created using the da Vinci Xi surgical system. Our model achieves a superior frame-wise state estimation accuracy of up to 89.4%, which improves on the state-of-the-art surgical state estimation models on both the JIGSAWS suturing dataset and our RIOUS dataset.

preprint, 2015, arXiv

Suboptimal Stabilizing Controllers for Linearly Solvable System

This paper presents a novel method to synthesize stochastic control Lyapunov functions for a class of nonlinear, stochastic control systems. In this work, the classical nonlinear Hamilton-Jacobi-Bellman partial differential equation is transformed into a linear partial differential equation for a class of systems with a particular constraint on the stochastic disturbance. It is shown that this linear partial differential equation can be relaxed to a linear differential inclusion, allowing for approximating polynomial solutions to be generated using sum of squares programming. It is shown that the resulting solutions are stochastic control Lyapunov functions with a number of compelling properties. In particular, a priori bounds on trajectory suboptimality are shown for these approximate value functions. The result is a technique whereby approximate solutions may be computed with non-increasing error via a hierarchy of semidefinite optimization problems.
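The transformation of the nonlinear Hamilton-Jacobi-Bellman equation into a linear PDE can be sketched as follows for the stationary case, in a form common in the linearly solvable control literature. The abstract does not give notation, so the symbols $f$, $G$, $B$, $q$, $R$, $\Sigma_\epsilon$, $\lambda$, and $\Psi$ are assumptions made here for illustration.

```latex
% Assumed setup: dynamics dx = (f(x) + G(x) u) dt + B(x) d\omega,
% noise covariance \Sigma_\epsilon, running cost q(x) + \tfrac12 u^\top R u.
\begin{align*}
0 &= \min_u \Big[\, q + \tfrac12 u^\top R u + (\nabla V)^\top (f + G u)
      + \tfrac12 \operatorname{Tr}\!\big( (\nabla^2 V)\, \Sigma_t \big) \Big],
   \qquad \Sigma_t = B \Sigma_\epsilon B^\top, \\
u^* &= -R^{-1} G^\top \nabla V, \\
0 &= q + f^\top \nabla V - \tfrac12 (\nabla V)^\top G R^{-1} G^\top \nabla V
      + \tfrac12 \operatorname{Tr}\!\big( (\nabla^2 V)\, \Sigma_t \big).
\end{align*}
% Constraint on the disturbance and the logarithmic change of variables:
\begin{align*}
\lambda\, G R^{-1} G^\top &= \Sigma_t, \qquad V = -\lambda \log \Psi, \\
\nabla V &= -\lambda \frac{\nabla \Psi}{\Psi}, \qquad
\nabla^2 V = -\lambda \frac{\nabla^2 \Psi}{\Psi}
             + \lambda \frac{\nabla \Psi\, \nabla \Psi^\top}{\Psi^2}.
\end{align*}
% The quadratic terms in \nabla\Psi cancel, leaving a PDE linear in \Psi:
\begin{equation*}
0 = -\frac{1}{\lambda}\, q\, \Psi + f^\top \nabla \Psi
    + \tfrac12 \operatorname{Tr}\!\big( (\nabla^2 \Psi)\, \Sigma_t \big).
\end{equation*}
```

The cancellation relies on the constraint $\lambda G R^{-1} G^\top = \Sigma_t$, which is the kind of condition on the stochastic disturbance that the abstract alludes to.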

preprint, 2014, arXiv

Convex Model Predictive Control for Vehicular Systems

In this work, we present a method to perform Model Predictive Control (MPC) over systems whose state is an element of $SO(n)$ for $n=2,3$. This is done without charts or any local linearization, and instead is performed by operating over the orbitope of rotation matrices. This results in a novel MPC scheme without the drawbacks associated with conventional linearization techniques. Instead, second order cone- or semidefinite-constraints on state variables are the only requirement beyond those of a QP-scheme typical for MPC of linear systems. Of particular emphasis is the application to aeronautical and vehicular systems, wherein the method removes many of the transcendental trigonometric terms associated with these systems' state space equations. Furthermore, the method is shown to be compatible with many existing variants of MPC, including obstacle avoidance via Mixed Integer Linear Programming (MILP).

preprint, 2014, arXiv

Convex Relaxations of SE(2) and SE(3) for Visual Pose Estimation

This paper proposes a new method for rigid body pose estimation based on spectrahedral representations of the tautological orbitopes of $SE(2)$ and $SE(3)$. The approach can use dense point cloud data from stereo vision or an RGB-D sensor (such as the Microsoft Kinect), as well as visual appearance data. The method is a convex relaxation of the classical pose estimation problem, and is based on explicit linear matrix inequality (LMI) representations for the convex hulls of $SE(2)$ and $SE(3)$. Given these representations, the relaxed pose estimation problem can be framed as a robust least squares problem with the optimization variable constrained to these convex sets. Although this formulation is a relaxation of the original problem, numerical experiments indicate that it is indeed exact - i.e. its solution is a member of $SE(2)$ or $SE(3)$ - in many interesting settings. We additionally show that this method is guaranteed to be exact for a large class of pose estimation problems.

preprint, 2014, arXiv

Domain Decomposition for Stochastic Optimal Control

This work proposes a method for solving linear stochastic optimal control (SOC) problems using sum of squares and semidefinite programming. Previous work had used polynomial optimization to approximate the value function, requiring a high polynomial degree to capture local phenomena. To improve the scalability of the method to problems of interest, a domain decomposition scheme is presented. By using local approximations, lower degree polynomials become sufficient, and both local and global properties of the value function are captured. The domain of the problem is split into a non-overlapping partition, with added constraints ensuring $C^1$ continuity. The Alternating Direction Method of Multipliers (ADMM) is used to optimize over each domain in parallel and ensure convergence on the boundaries of the partitions. This results in improved conditioning of the problem and allows for much larger and more complex problems to be addressed with improved performance.

preprint, 2014, arXiv

Linear Hamilton Jacobi Bellman Equations in High Dimensions

The Hamilton Jacobi Bellman Equation (HJB) provides the globally optimal solution to large classes of control problems. Unfortunately, this generality comes at a price: the calculation of such solutions is typically intractable for systems with more than moderate state space size due to the curse of dimensionality. This work combines recent results in the structure of the HJB, and its reduction to a linear Partial Differential Equation (PDE), with methods based on low rank tensor representations, known as separated representations, to address the curse of dimensionality. The result is an algorithm to solve optimal control problems which scales linearly with the number of states in a system, and is applicable to systems that are nonlinear with stochastic forcing in finite-horizon, average cost, and first-exit settings. The method is demonstrated on inverted pendulum, VTOL aircraft, and quadcopter models, with system dimension two, six, and twelve respectively.

preprint, 2014, arXiv

Optimal Navigation Functions for Nonlinear Stochastic Systems

This paper presents a new methodology to craft navigation functions for nonlinear systems with stochastic uncertainty. The method relies on the transformation of the Hamilton-Jacobi-Bellman (HJB) equation into a linear partial differential equation. This approach allows for optimality criteria to be incorporated into the navigation function, and generalizes several existing results in navigation functions. It is shown that the HJB and existing navigation functions in the literature sit at opposite ends of a spectrum of optimization problems, upon which tradeoffs may be made in problem complexity. In particular, it is shown that under certain criteria the optimal navigation function is related to Laplace's equation, previously used in the literature, through an exponential transform. Further, analytical solutions to the HJB are available in simplified domains, yielding guidance towards optimality for approximation schemes. Examples are used to illustrate the role that noise and optimality can potentially play in navigation system design.

preprint, 2014, arXiv

Semidefinite Relaxations for Stochastic Optimal Control Policies

Recent results in the study of the Hamilton Jacobi Bellman (HJB) equation have led to the discovery of a formulation of the value function as a linear Partial Differential Equation (PDE) for stochastic nonlinear systems with a mild constraint on their disturbances. This has yielded promising directions for research in the planning and control of nonlinear systems. This work proposes a new method for obtaining approximate solutions to these linear stochastic optimal control (SOC) problems. A candidate polynomial with variable coefficients is proposed as the solution to the SOC problem. A Sum of Squares (SOS) relaxation is then taken to the partial differential constraints, leading to a hierarchy of semidefinite relaxations with improving sub-optimality gap. The resulting approximate solutions are shown to be guaranteed over- and under-approximations for the optimal value function.
