Researcher profile

Todd D. Murphey

Todd D. Murphey contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2022arXiv

Learning from Human Directional Corrections

This paper proposes a novel approach that enables a robot to learn an objective function incrementally from human directional corrections. Existing methods learn from human magnitude corrections; since a human needs to carefully choose the magnitude of each correction, those methods can easily lead to over-corrections and learning inefficiency. The proposed method only requires human directional corrections -- corrections that only indicate the direction of an input change without indicating its magnitude. We only assume that each correction, regardless of its magnitude, points in a direction that improves the robot's current motion relative to an unknown objective function. The allowable corrections satisfying this assumption account for half of the input space, as opposed to the magnitude corrections which have to lie in a shrinking level set. For each directional correction, the proposed method updates the estimate of the objective function based on a cutting plane method, which has a geometric interpretation. We have established theoretical results to show the convergence of the learning process. The proposed method has been tested in numerical examples, a user study on two human-robot games, and a real-world quadrotor experiment. The results confirm the convergence of the proposed method and further show that the method is significantly more effective (higher success rate), efficient/effortless (less human corrections needed), and potentially more accessible (fewer early wasted trials) than the state-of-the-art robot learning frameworks.

preprint2022arXiv

Learning from Sparse Demonstrations

This paper develops the method of Continuous Pontryagin Differentiable Programming (Continuous PDP), which enables a robot to learn an objective function from a few sparsely demonstrated keyframes. The keyframes, labeled with some time stamps, are the desired task-space outputs, which a robot is expected to follow sequentially. The time stamps of the keyframes can be different from the time of the robot's actual execution. The method jointly finds an objective function and a time-warping function such that the robot's resulting trajectory sequentially follows the keyframes with minimal discrepancy loss. The Continuous PDP minimizes the discrepancy loss using projected gradient descent, by efficiently solving the gradient of the robot trajectory with respect to the unknown parameters. The method is first evaluated on a simulated robot arm and then applied to a 6-DoF quadrotor to learn an objective function for motion planning in unmodeled environments. The results show the efficiency of the method, its ability to handle time misalignment between keyframes and robot execution, and the generalization of objective learning into unseen motion conditions.

preprint2022arXiv

Learning Stable Models for Prediction and Control

This paper demonstrates the benefits of imposing stability on data-driven Koopman operators. The data-driven identification of stable Koopman operators (DISKO) is implemented using an algorithm \cite{mamakoukas_stableLDS2020} that computes the nearest \textit{stable} matrix solution to a least-squares reconstruction error. As a first result, we derive a formula that describes the prediction error of Koopman representations for an arbitrary number of time steps, and which shows that stability constraints can improve the predictive accuracy over long horizons. As a second result, we determine formal conditions on basis functions of Koopman operators needed to satisfy the stability properties of an underlying nonlinear system. As a third result, we derive formal conditions for constructing Lyapunov functions for nonlinear systems out of stable data-driven Koopman operators, which we use to verify stabilizing control from data. Lastly, we demonstrate the benefits of DISKO in prediction and control with simulations using a pendulum and a quadrotor and experiments with a pusher-slider system. The paper is complemented with a video: \url{https://sites.google.com/view/learning-stable-koopman}.

preprint2020arXiv

Algorithmic Design for Embodied Intelligence in Synthetic Cells

In nature, biological organisms jointly evolve both their morphology and their neurological capabilities to improve their chances for survival. Consequently, task information is encoded in both their brains and their bodies. In robotics, the development of complex control and planning algorithms often bears sole responsibility for improving task performance. This dependence on centralized control can be problematic for systems with computational limitations, such as mechanical systems and robots on the microscale. In these cases we need to be able to offload complex computation onto the physical morphology of the system. To this end, we introduce a methodology for algorithmically arranging sensing and actuation components into a robot design while maintaining a low level of design complexity (quantified using a measure of graph entropy), and a high level of task embodiment (evaluated by analyzing the Kullback-Leibler divergence between physical executions of the robot and those of an idealized system). This approach computes an idealized, unconstrained control policy which is projected onto a limited selection of sensors and actuators in a given library, resulting in intelligence that is distributed away from a central processor and instead embodied in the physical body of a robot. The method is demonstrated by computationally optimizing a simulated synthetic cell.

preprint2020arXiv

Hybrid Control for Learning Motor Skills

We develop a hybrid control approach for robot learning based on combining learned predictive models with experience-based state-action policy mappings to improve the learning capabilities of robotic systems. Predictive models provide an understanding of the task and the physics (which improves sample-efficiency), while experience-based policy mappings are treated as "muscle memory" that encode favorable actions as experiences that override planned actions. Hybrid control tools are used to create an algorithmic approach for combining learned predictive models with experience-based learning. Hybrid learning is presented as a method for efficiently learning motor skills by systematically combining and improving the performance of predictive models and experience-based policies. A deterministic variation of hybrid learning is derived and extended into a stochastic implementation that relaxes some of the key assumptions in the original derivation. Each variation is tested on experience-based learning methods (where the robot interacts with the environment to gain experience) as well as imitation learning methods (where experience is provided through demonstrations and tested in the environment). The results show that our method is capable of improving the performance and sample-efficiency of learning motor skills in a variety of experimental domains.

preprint2020arXiv

Information Requirements of Collision-Based Micromanipulation

We present a task-centered formal analysis of the relative power of several robot designs, inspired by the unique properties and constraints of micro-scale robotic systems. Our task of interest is object manipulation because it is a fundamental prerequisite for more complex applications such as micro-scale assembly or cell manipulation. Motivated by the difficulty in observing and controlling agents at the micro-scale, we focus on the design of boundary interactions: the robot's motion strategy when it collides with objects or the environment boundary, otherwise known as a bounce rule. We present minimal conditions on the sensing, memory, and actuation requirements of periodic ``bouncing'' robot trajectories that move an object in a desired direction through the incidental forces arising from robot-object collisions. Using an information space framework and a hierarchical controller, we compare several robot designs, emphasizing the information requirements of goal completion under different initial conditions, as well as what is required to recognize irreparable task failure. Finally, we present a physically-motivated model of boundary interactions, and analyze the robustness and dynamical properties of resulting trajectories.

preprint2020arXiv

Model-Based Generalization Under Parameter Uncertainty Using Path Integral Control

This work addresses the problem of robot interaction in complex environments where online control and adaptation is necessary. By expanding the sample space in the free energy formulation of path integral control, we derive a natural extension to the path integral control that embeds uncertainty into action and provides robustness for model-based robot planning. Our algorithm is applied to a diverse set of tasks using different robots and validate our results in simulation and real-world experiments. We further show that our method is capable of running in real-time without loss of performance. Videos of the experiments as well as additional implementation details can be found at https://sites.google.com/view/emppi.