Researcher profile

Roderick Murray-Smith

Roderick Murray-Smith contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
8 topics
4 close collaborators

Published work

7 published items

Preprint, 2022, arXiv

Survey: Leakage and Privacy at Inference Time

Leakage of data from publicly available Machine Learning (ML) models is an area of growing significance as commercial and government applications of ML can draw on multiple sources of data, potentially including users' and clients' sensitive data. We provide a comprehensive survey of contemporary advances on several fronts, covering involuntary data leakage which is natural to ML models, potential malevolent leakage which is caused by privacy attacks, and currently available defence mechanisms. We focus on inference-time leakage, as the most likely scenario for publicly available models. We first discuss what leakage is in the context of different data, tasks, and model architectures. We then propose a taxonomy across involuntary and malevolent leakage, available defences, followed by the currently available assessment metrics and applications. We conclude with outstanding challenges and open questions, outlining some promising directions for future research.

Preprint, 2021, arXiv

Bayesian parameter estimation using conditional variational autoencoders for gravitational-wave astronomy

Gravitational wave (GW) detection is now commonplace and as the sensitivity of the global network of GW detectors improves, we will observe $\mathcal{O}(100)$s of transient GW events per year. The current methods used to estimate their source parameters employ optimally sensitive but computationally costly Bayesian inference approaches where typical analyses have taken between 6 hours and 5 days. For binary neutron star and neutron star black hole systems, prompt counterpart electromagnetic (EM) signatures are expected on timescales of 1 second -- 1 minute, and the current fastest method for alerting EM follow-up observers can provide estimates in $\mathcal{O}(1)$ minute on a limited range of key source parameters. Here we show that a conditional variational autoencoder pre-trained on binary black hole signals can return Bayesian posterior probability estimates. The training procedure need only be performed once for a given prior parameter space and the resulting trained machine can then generate samples describing the posterior distribution $\sim 6$ orders of magnitude faster than existing techniques.

Preprint, 2020, arXiv

Kemeny-based testing for COVID-19

Testing, tracking and tracing abilities have been identified as pivotal in helping countries to safely reopen activities after the first wave of the COVID-19 virus. Contact tracing apps give the unprecedented possibility to reconstruct graphs of daily contacts, so the question is who should be tested? As human contact networks are known to exhibit community structure, in this paper we show that the Kemeny constant of a graph can be used to identify and analyze bridges between communities in a graph. Our "Kemeny indicator" is the change in Kemeny constant when a node or edge is removed from the graph. We show that testing individuals who are associated with large values of the Kemeny indicator can help in efficiently intercepting new virus outbreaks, when they are still in their early stage. Extensive simulations provide promising results in early identification and in blocking possible "super-spreaders" links that transmit disease between different communities.
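The "Kemeny indicator" described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an undirected, unweighted graph given as a symmetric adjacency matrix, a simple random walk on that graph, and the convention that the Kemeny constant is the sum of 1 / (1 - λ) over all eigenvalues λ of the transition matrix except the leading eigenvalue 1 (some texts add 1 to this value). The function names `kemeny_constant` and `kemeny_indicator_edge` are hypothetical, and only edge removal is covered.

```python
import numpy as np


def kemeny_constant(adj):
    """Kemeny constant of a simple random walk on an undirected graph.

    Assumed convention: sum of 1 / (1 - lambda_i) over the non-leading
    eigenvalues of the transition matrix P = D^{-1} A.
    Returns infinity for graphs that are disconnected.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    if np.any(deg == 0):
        # An isolated node makes the graph disconnected.
        return np.inf
    # D^{-1/2} A D^{-1/2} is symmetric and has the same eigenvalues as P.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    sym = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eig = np.sort(np.linalg.eigvalsh(sym))[::-1]  # descending order
    if len(eig) > 1 and eig[1] > 1.0 - 1e-9:
        # A repeated eigenvalue 1 signals more than one connected component.
        return np.inf
    return float(np.sum(1.0 / (1.0 - eig[1:])))


def kemeny_indicator_edge(adj, u, v):
    """Change in the Kemeny constant when edge (u, v) is removed."""
    adj = np.asarray(adj, dtype=float)
    pruned = adj.copy()
    pruned[u, v] = pruned[v, u] = 0.0
    return kemeny_constant(pruned) - kemeny_constant(adj)


if __name__ == "__main__":
    # Illustrative graph (an assumption, not data from the paper):
    # two triangles {0, 1, 2} and {3, 4, 5} joined by edges (2, 3) and (0, 5).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3), (0, 5)]
    a = np.zeros((6, 6))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    for i, j in edges:
        print((i, j), kemeny_indicator_edge(a, i, j))
```

Under this sketch, an edge whose removal disconnects the graph receives an infinite indicator, and edges can be ranked by their finite indicator values.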

Preprint, 2020, arXiv

Learning a low dimensional manifold of real cancer tissue with PathologyGAN

Application of deep learning in digital pathology shows promise for improving disease diagnosis and understanding. We present a deep generative model that learns to simulate high-fidelity cancer tissue images while mapping the real images onto an interpretable low dimensional latent space. The key to the model is an encoder trained by a previously developed generative adversarial network, PathologyGAN. We study the latent space using 249K images from two breast cancer cohorts. We find that the latent space encodes morphological characteristics of tissues (e.g. patterns of cancer, lymphocytes, and stromal cells). In addition, the latent space reveals distinctly enriched clusters of tissue architectures in the high-risk patient group.

Preprint, 2020, arXiv

Spatial images from temporal data

Traditional paradigms for imaging rely on the use of a spatial structure, either in the detector (pixel arrays) or in the illumination (patterned light). Removal of the spatial structure in the detector or illumination, i.e., imaging with just a single-point sensor, would require solving a very strongly ill-posed inverse retrieval problem that to date has not been solved. Here, we demonstrate a data-driven approach in which full 3D information is obtained with just a single-point, single-photon avalanche diode that records the arrival time of photons reflected from a scene that is illuminated with short pulses of light. Imaging with single-point time-of-flight (temporal) data opens new routes in terms of speed, size, and functionality. As an example, we show how the training based on an optical time-of-flight camera enables a compact radio-frequency impulse radio detection and ranging transceiver to provide 3D images.

Preprint, 2020, arXiv

Tomographic Auto-Encoder: Unsupervised Bayesian Recovery of Corrupted Data

We propose a new probabilistic method for unsupervised recovery of corrupted data. Given a large ensemble of degraded samples, our method recovers accurate posteriors of clean values, allowing the exploration of the manifold of possible reconstructed data and hence characterising the underlying uncertainty. In this setting, direct application of classical variational methods often gives rise to collapsed densities that do not adequately explore the solution space. Instead, we derive our novel reduced entropy condition approximate inference method that results in rich posteriors. We test our model in a data recovery task under the common setting of missing values and noise, demonstrating superior performance to existing variational methods for imputation and de-noising with different real data sets. We further show higher classification accuracy after imputation, proving the advantage of propagating uncertainty to downstream tasks with our model.

Preprint, 2020, arXiv

Variational Inference for Computational Imaging Inverse Problems

Machine learning methods for computational imaging require uncertainty estimation to be reliable in real settings. While Bayesian models offer a computationally tractable way of recovering uncertainty, they need large data volumes to be trained, which in imaging applications implies prohibitively expensive data collection with specific imaging instruments. This paper introduces a novel framework to train variational inference for inverse problems that exploits, in combination, a small amount of experimentally collected data, domain expertise and existing image data sets. In such a way, Bayesian machine learning models can solve imaging inverse problems with minimal data collection efforts. Extensive simulated experiments show the advantages of the proposed framework. The approach is then applied to two real experimental optics settings: holographic image reconstruction and imaging through highly scattering media. In both settings, state-of-the-art reconstructions are achieved with little collection of training data.