Source author record

David Barber

David Barber appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning math.OC Methodology Numerical Analysis Systems and Control Artificial Intelligence Computation Computational Complexity Computational Engineering, Finance, and Science Discrete Mathematics eess.IV math.NA

Catalog footprint

What is connected

14works

12topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Improving VAE-based Representation Learning

Latent variable models like the Variational Auto-Encoder (VAE) are commonly used to learn representations of images. However, for downstream tasks like semantic classification, the representations learned by VAE are less competitive than other non-latent variable models. This has led to some speculations that latent variable models may be fundamentally unsuitable for representation learning. In this work, we study what properties are required for good representations and how different VAE structure choices could affect the learned properties. We show that by using a decoder that prefers to learn local features, the remaining global features can be well captured by the latent, which significantly improves performance of a downstream classification task. We further apply the proposed model to semi-supervised learning tasks and demonstrate improvements in data efficiency.

preprint2022arXiv

Integrated Weak Learning

We introduce Integrated Weak Learning, a principled framework that integrates weak supervision into the training process of machine learning models. Our approach jointly trains the end-model and a label model that aggregates multiple sources of weak supervision. We introduce a label model that can learn to aggregate weak supervision sources differently for different datapoints and takes into consideration the performance of the end-model during training. We show that our approach outperforms existing weak learning techniques across a set of 6 benchmark classification datasets. When both a small amount of labeled data and weak supervision are present the increase in performance is both consistent and large, reliably getting a 2-5 point test F1 score gain over non-integrated methods.

preprint2022arXiv

Parallel Neural Local Lossless Compression

The recently proposed Neural Local Lossless Compression (NeLLoC), which is based on a local autoregressive model, has achieved state-of-the-art (SOTA) out-of-distribution (OOD) generalization performance in the image compression task. In addition to the encouragement of OOD generalization, the local model also allows parallel inference in the decoding stage. In this paper, we propose two parallelization schemes for local autoregressive models. We discuss the practicalities of implementing the schemes and provide experimental evidence of significant gains in compression runtime compared to the previous, non-parallel implementation.

preprint2022arXiv

Representation Learning for High-Dimensional Data Collection under Local Differential Privacy

The collection of individuals' data has become commonplace in many industries. Local differential privacy (LDP) offers a rigorous approach to preserving privacy whereby the individual privatises their data locally, allowing only their perturbed datum to leave their possession. LDP thus provides a provable privacy guarantee to the individual against both adversaries and database administrators. Existing LDP mechanisms have successfully been applied to low-dimensional data, but in high dimensions the privacy-inducing noise largely destroys the utility of the data. In this work, our contributions are two-fold: first, by adapting state-of-the-art techniques from representation learning, we introduce a novel approach to learning LDP mechanisms. These mechanisms add noise to powerful representations on the low-dimensional manifold underlying the data, thereby overcoming the prohibitive noise requirements of LDP in high dimensions. Second, we introduce a novel denoising approach for downstream model learning. The training of performant machine learning models using collected LDP data is a common goal for data collectors, and downstream model performance forms a proxy for the LDP data utility. Our approach significantly outperforms current state-of-the-art LDP mechanisms.

preprint2020arXiv

Private Machine Learning via Randomised Response

We introduce a general learning framework for private machine learning based on randomised response. Our assumption is that all actors are potentially adversarial and as such we trust only to release a single noisy version of an individual's datapoint. We discuss a general approach that forms a consistent way to estimate the true underlying machine learning model and demonstrate this in the case of logistic regression.

preprint2016arXiv

Dealing with a large number of classes -- Likelihood, Discrimination or Ranking?

We consider training probabilistic classifiers in the case of a large number of classes. The number of classes is assumed too large to perform exact normalisation over all classes. To account for this we consider a simple approach that directly approximates the likelihood. We show that this simple approach works well on toy problems and is competitive with recently introduced alternative non-likelihood based approximations. Furthermore, we relate this approach to a simple ranking objective. This leads us to suggest a specific setting for the optimal threshold in the ranking objective.

preprint2016arXiv

Nesterov's Accelerated Gradient and Momentum as approximations to Regularised Update Descent

We present a unifying framework for adapting the update direction in gradient-based iterative optimization methods. As natural special cases we re-derive classical momentum and Nesterov's accelerated gradient method, lending a new intuitive interpretation to the latter algorithm. We show that a new algorithm, which we term Regularised Gradient Descent, can converge more quickly than either Nesterov's algorithm or the classical momentum algorithm.

preprint2015arXiv

An Approximate Newton Method for Markov Decision Processes

Gradient-based algorithms are one of the methods of choice for the optimisation of Markov Decision Processes. In this article we will present a novel approximate Newton algorithm for the optimisation of such models. The algorithm has various desirable properties over the naive application of Newton's method. Firstly the approximate Hessian is guaranteed to be negative-semidefinite over the entire parameter space in the case where the controller is log-concave in the control parameters. Additionally the inference required for our approximate Newton method is often the same as that required for first order methods, such as steepest gradient ascent. The approximate Hessian also has many nice sparsity properties that are not present in the Hessian and that make its inversion efficient in many situations of interest. We also provide an analysis that highlights a relationship between our approximate Newton method and both Expectation Maximisation and natural gradient ascent. Empirical results suggest that the algorithm has excellent convergence and robustness properties.

preprint2014arXiv

On solving Ordinary Differential Equations using Gaussian Processes

We describe a set of Gaussian Process based approaches that can be used to solve non-linear Ordinary Differential Equations. We suggest an explicit probabilistic solver and two implicit methods, one analogous to Picard iteration and the other to gradient matching. All methods have greater accuracy than previously suggested Gaussian Process approaches. We also suggest a general approach that can yield error estimates from any standard ODE solver.

preprint2012arXiv

Bayesian Conditional Cointegration

Cointegration is an important topic for time-series, and describes a relationship between two series in which a linear combination is stationary. Classically, the test for cointegration is based on a two stage process in which first the linear relation between the series is estimated by Ordinary Least Squares. Subsequently a unit root test is performed on the residuals. A well-known deficiency of this classical approach is that it can lead to erroneous conclusions about the presence of cointegration. As an alternative, we present a framework for estimating whether cointegration exists using Bayesian inference which is empirically superior to the classical approach. Finally, we apply our technique to model segmented cointegration in which cointegration may exist only for limited time. In contrast to previous approaches our model makes no restriction on the number of possible cointegration segments.

preprint2012arXiv

Clique Matrices for Statistical Graph Decomposition and Parameterising Restricted Positive Definite Matrices

We introduce Clique Matrices as an alternative representation of undirected graphs, being a generalisation of the incidence matrix representation. Here we use clique matrices to decompose a graph into a set of possibly overlapping clusters, de ned as well-connected subsets of vertices. The decomposition is based on a statistical description which encourages clusters to be well connected and few in number. Inference is carried out using a variational approximation. Clique matrices also play a natural role in parameterising positive de nite matrices under zero constraints on elements of the matrix. We show that clique matrices can parameterise all positive de nite matrices restricted according to a decomposable graph and form a structured Factor Analysis approximation in the non-decomposable case.

preprint2012arXiv

Efficient Inference in Markov Control Problems

Markov control algorithms that perform smooth, non-greedy updates of the policy have been shown to be very general and versatile, with policy gradient and Expectation Maximisation algorithms being particularly popular. For these algorithms, marginal inference of the reward weighted trajectory distribution is required to perform policy updates. We discuss a new exact inference algorithm for these marginals in the finite horizon case that is more efficient than the standard approach based on classical forward-backward recursions. We also provide a principled extension to infinite horizon Markov Decision Problems that explicitly accounts for an infinite horizon. This extension provides a novel algorithm for both policy gradients and Expectation Maximisation in infinite horizon problems.

preprint2012arXiv

On the Computational Complexity of Stochastic Controller Optimization in POMDPs

We show that the problem of finding an optimal stochastic 'blind' controller in a Markov decision process is an NP-hard problem. The corresponding decision problem is NP-hard, in PSPACE, and SQRT-SUM-hard, hence placing it in NP would imply breakthroughs in long-standing open problems in computer science. Our result establishes that the more general problem of stochastic controller optimization in POMDPs is also NP-hard. Nonetheless, we outline a special case that is convex and admits efficient global solutions.

preprint2012arXiv

Variational Optimization

We discuss a general technique that can be used to form a differentiable bound on the optima of non-differentiable or discrete objective functions. We form a unified description of these methods and consider under which circumstances the bound is concave. In particular we consider two concrete applications of the method, namely sparse learning and support vector classification.

David Barber

What is connected

Connect this record

See the researcher in context

Building this map preview

14 published item(s)

Improving VAE-based Representation Learning

Integrated Weak Learning

Parallel Neural Local Lossless Compression

Representation Learning for High-Dimensional Data Collection under Local Differential Privacy

Private Machine Learning via Randomised Response

Dealing with a large number of classes -- Likelihood, Discrimination or Ranking?

Nesterov's Accelerated Gradient and Momentum as approximations to Regularised Update Descent

An Approximate Newton Method for Markov Decision Processes

On solving Ordinary Differential Equations using Gaussian Processes

Bayesian Conditional Cointegration

Clique Matrices for Statistical Graph Decomposition and Parameterising Restricted Positive Definite Matrices

Efficient Inference in Markov Control Problems

On the Computational Complexity of Stochastic Controller Optimization in POMDPs

Variational Optimization