Researcher profile

Michael Burke

Michael Burke contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2025arXiv

Explaining Why Things Go Where They Go: Interpretable Constructs of Human Organizational Preferences

Robotic systems for household object rearrangement often rely on latent preference models inferred from human demonstrations. While effective at prediction, these models offer limited insight into the interpretable factors that guide human decisions. We introduce an explicit formulation of object arrangement preferences along four interpretable constructs: spatial practicality (putting items where they naturally fit best in the space), habitual convenience (making frequently used items easy to reach), semantic coherence (placing items together if they are used for the same task or are contextually related), and commonsense appropriateness (putting things where people would usually expect to find them). To capture these constructs, we designed and validated a self-report questionnaire through a 63-participant online study. Results confirm the psychological distinctiveness of these constructs and their explanatory power across two scenarios (kitchen and living room). We demonstrate the utility of these constructs by integrating them into a Monte Carlo Tree Search (MCTS) planner and show that when guided by participant-derived preferences, our planner can generate reasonable arrangements that closely align with those generated by participants. This work contributes a compact, interpretable formulation of object arrangement preferences and a demonstration of how it can be operationalized for robot planning.

preprint2023arXiv

Predicting the structural colors of films of disordered photonic balls

Photonic balls are spheres tens of micrometers in diameter containing assemblies of nanoparticles or nanopores with a spacing comparable to the wavelength of light. When these nanoscale features are disordered, but still correlated, the photonic balls can show structural color with low angle-dependence. Their colors, combined with the ability to add them to a liquid formulation, make photonic balls a promising new type of pigment particle for paints, coatings, and other applications. However, it is challenging to predict the color of materials made from photonic balls, because the sphere geometry and multiple scattering must be accounted for. To address these challenges, we develop a multiscale modeling approach involving Monte Carlo simulations of multiple scattering at two different scales: we simulate multiple scattering and absorption within a photonic ball and then use the results to simulate multiple scattering and absorption in a film of photonic balls. After validating against experimental spectra, we use the model to show that films of photonic balls scatter light in fundamentally different ways than do homogeneous films of nanopores or nanoparticles, because of their increased surface area and refraction at the interfaces of the balls. Both effects tend to sharply reduce color saturation relative to a homogeneous nanostructured film. We show that saturated colors can be achieved by placing an absorber directly in the photonic balls and mitigating surface roughness. With these design rules, we show that photonic-ball films have an advantage over homogeneous nanostructured films: their colors are even less dependent on the angle.

preprint2022arXiv

Residual Learning from Demonstration: Adapting DMPs for Contact-rich Manipulation

Manipulation skills involving contact and friction are inherent to many robotics tasks. Using the class of motor primitives for peg-in-hole like insertions, we study how robots can learn such skills. Dynamic Movement Primitives (DMP) are a popular way of extracting such policies through behaviour cloning (BC) but can struggle in the context of insertion. Policy adaptation strategies such as residual learning can help improve the overall performance of policies in the context of contact-rich manipulation. However, it is not clear how to best do this with DMPs. As a result, we consider several possible ways for adapting a DMP formulation and propose ``residual Learning from Demonstration`` (rLfD), a framework that combines DMPs with Reinforcement Learning (RL) to learn a residual correction policy. Our evaluations suggest that applying residual learning directly in task space and operating on the full pose of the robot can significantly improve the overall performance of DMPs. We show that rLfD offers a gentle to the joints solution that improves the task success and generalisation of DMPs \rb{and enables transfer to different geometries and frictions through few-shot task adaptation}. The proposed framework is evaluated on a set of tasks. A simulated robot and a physical robot have to successfully insert pegs, gears and plugs into their respective sockets. Other material and videos accompanying this paper are provided at https://sites.google.com/view/rlfd/.

preprint2021arXiv

Action sequencing using visual permutations

Humans can easily reason about the sequence of high level actions needed to complete tasks, but it is particularly difficult to instil this ability in robots trained from relatively few examples. This work considers the task of neural action sequencing conditioned on a single reference visual state. This task is extremely challenging as it is not only subject to the significant combinatorial complexity that arises from large action sets, but also requires a model that can perform some form of symbol grounding, mapping high dimensional input data to actions, while reasoning about action relationships. This paper takes a permutation perspective and argues that action sequencing benefits from the ability to reason about both permutations and ordering concepts. Empirical analysis shows that neural models trained with latent permutations outperform standard neural architectures in constrained action sequencing tasks. Results also show that action sequencing using visual permutations is an effective mechanism to initialise and speed up traditional planning techniques and successfully scales to far greater action set sizes than models considered previously.

preprint2021arXiv

Learning Structured Representations of Spatial and Interactive Dynamics for Trajectory Prediction in Crowded Scenes

Context plays a significant role in the generation of motion for dynamic agents in interactive environments. This work proposes a modular method that utilises a learned model of the environment for motion prediction. This modularity explicitly allows for unsupervised adaptation of trajectory prediction models to unseen environments and new tasks by relying on unlabelled image data only. We model both the spatial and dynamic aspects of a given environment alongside the per agent motions. This results in more informed motion prediction and allows for performance comparable to the state-of-the-art. We highlight the model's prediction capability using a benchmark pedestrian prediction problem and a robot manipulation task and show that we can transfer the predictor across these tasks in a completely unsupervised way. The proposed approach allows for robust and label efficient forward modelling, and relaxes the need for full model re-training in new environments.

preprint2020arXiv

Black-Box Saliency Map Generation Using Bayesian Optimisation

Saliency maps are often used in computer vision to provide intuitive interpretations of what input regions a model has used to produce a specific prediction. A number of approaches to saliency map generation are available, but most require access to model parameters. This work proposes an approach for saliency map generation for black-box models, where no access to model parameters is available, using a Bayesian optimisation sampling method. The approach aims to find the global salient image region responsible for a particular (black-box) model's prediction. This is achieved by a sampling-based approach to model perturbations that seeks to localise salient regions of an image to the black-box model. Results show that the proposed approach to saliency map generation outperforms grid-based perturbation approaches, and performs similarly to gradient-based approaches which require access to model parameters.

preprint2020arXiv

Composing Diverse Policies for Temporally Extended Tasks

Robot control policies for temporally extended and sequenced tasks are often characterized by discontinuous switches between different local dynamics. These change-points are often exploited in hierarchical motion planning to build approximate models and to facilitate the design of local, region-specific controllers. However, it becomes combinatorially challenging to implement such a pipeline for complex temporally extended tasks, especially when the sub-controllers work on different information streams, time scales and action spaces. In this paper, we introduce a method that can compose diverse policies comprising motion planning trajectories, dynamic motion primitives and neural network controllers. We introduce a global goal scoring estimator that uses local, per-motion primitive dynamics models and corresponding activation state-space sets to sequence diverse policies in a locally optimal fashion. We use expert demonstrations to convert what is typically viewed as a gradient-based learning process into a planning process without explicitly specifying pre- and post-conditions. We first illustrate the proposed framework using an MDP benchmark to showcase robustness to action and model dynamics mismatch, and then with a particularly complex physical gear assembly task, solved on a PR2 robot. We show that the proposed approach successfully discovers the optimal sequence of controllers and solves both tasks efficiently.

preprint2020arXiv

Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video

We propose a model that is able to perform unsupervised physical parameter estimation of systems from video, where the differential equations governing the scene dynamics are known, but labeled states or objects are not available. Existing physical scene understanding methods require either object state supervision, or do not integrate with differentiable physics to learn interpretable system parameters and states. We address this problem through a physics-as-inverse-graphics approach that brings together vision-as-inverse-graphics and differentiable physics engines, enabling objects and explicit state and velocity representations to be discovered. This framework allows us to perform long term extrapolative video prediction, as well as vision-based model-predictive control. Our approach significantly outperforms related unsupervised methods in long-term future frame prediction of systems with interacting objects (such as ball-spring or 3-body gravitational systems), due to its ability to build dynamics into the model as an inductive bias. We further show the value of this tight vision-physics integration by demonstrating data-efficient learning of vision-actuated model-based control for a pendulum system. We also show that the controller's interpretability provides unique capabilities in goal-driven control and physical reasoning for zero-data adaptation.

preprint2020arXiv

Rapid Probabilistic Interest Learning from Domain-Specific Pairwise Image Comparisons

A great deal of work aims to discover large general purpose models of image interest or memorability for visual search and information retrieval. This paper argues that image interest is often domain and user specific, and that efficient mechanisms for learning about this domain-specific image interest as quickly as possible, while limiting the amount of data-labelling required, are often more useful to end-users. This work uses pairwise image comparisons to reduce the labelling burden on these users, and introduces an image interest estimation approach that performs similarly to recent data hungry deep learning approaches trained using pairwise ranking losses. Here, we use a Gaussian process model to interpolate image interest inferred using a Bayesian ranking approach over image features extracted using a pre-trained convolutional neural network. Results show that fitting a Gaussian process in high-dimensional image feature space is not only computationally feasible, but also effective across a broad range of domains. The proposed probabilistic interest estimation approach produces image interests paired with uncertainties that can be used to identify images for which additional labelling is required and measure inference convergence, allowing for sample efficient active model training. Importantly, the probabilistic formulation allows for effective visual search and information retrieval when limited labelling data is available.

preprint2020arXiv

Vid2Param: Modelling of Dynamics Parameters from Video

Videos provide a rich source of information, but it is generally hard to extract dynamical parameters of interest. Inferring those parameters from a video stream would be beneficial for physical reasoning. Robots performing tasks in dynamic environments would benefit greatly from understanding the underlying environment motion, in order to make future predictions and to synthesize effective control policies that use this inductive bias. Online physical reasoning is therefore a fundamental requirement for robust autonomous agents. When the dynamics involves multiple modes (due to contacts or interactions between objects) and sensing must proceed directly from a rich sensory stream such as video, then traditional methods for system identification may not be well suited. We propose an approach wherein fast parameter estimation can be achieved directly from video. We integrate a physically based dynamics model with a recurrent variational autoencoder, by introducing an additional loss to enforce desired constraints. The model, which we call Vid2Param, can be trained entirely in simulation, in an end-to-end manner with domain randomization, to perform online system identification, and make probabilistic forward predictions of parameters of interest. This enables the resulting model to encode parameters such as position, velocity, restitution, air drag and other physical properties of the system. We illustrate the utility of this in physical experiments wherein a PR2 robot with a velocity constrained arm must intercept an unknown bouncing ball with partly occluded vision, by estimating the physical parameters of this ball directly from the video trace after the ball is released.