Researcher profile

Swapnil Mishra

Swapnil Mishra contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2022arXiv

$π$VAE: a stochastic process prior for Bayesian deep learning with MCMC

Stochastic processes provide a mathematically elegant way model complex data. In theory, they provide flexible priors over function classes that can encode a wide range of interesting assumptions. In practice, however, efficient inference by optimisation or marginalisation is difficult, a problem further exacerbated with big data and high dimensional input spaces. We propose a novel variational autoencoder (VAE) called the prior encoding variational autoencoder ($π$VAE). The $π$VAE is finitely exchangeable and Kolmogorov consistent, and thus is a continuous stochastic process. We use $π$VAE to learn low dimensional embeddings of function classes. We show that our framework can accurately learn expressive function classes such as Gaussian processes, but also properties of functions to enable statistical inference (such as the integral of a log Gaussian process). For popular tasks, such as spatial interpolation, $π$VAE achieves state-of-the-art performance both in terms of accuracy and computational efficiency. Perhaps most usefully, we demonstrate that the low dimensional independently distributed latent space representation learnt provides an elegant and scalable means of performing Bayesian inference for stochastic processes within probabilistic programming languages such as Stan.

preprint2021arXiv

Referenced Thermodynamic Integration for Bayesian Model Selection: Application to COVID-19 Model Selection

Model selection is a fundamental part of the applied Bayesian statistical methodology. Metrics such as the Akaike Information Criterion are commonly used in practice to select models but do not incorporate the uncertainty of the models' parameters and can give misleading choices. One approach that uses the full posterior distribution is to compute the ratio of two models' normalising constants, known as the Bayes factor. Often in realistic problems, this involves the integration of analytically intractable, high-dimensional distributions, and therefore requires the use of stochastic methods such as thermodynamic integration (TI). In this paper we apply a variation of the TI method, referred to as referenced TI, which computes a single model's normalising constant in an efficient way by using a judiciously chosen reference density. The advantages of the approach and theoretical considerations are set out, along with explicit pedagogical 1 and 2D examples. Benchmarking is presented with comparable methods and we find favourable convergence performance. The approach is shown to be useful in practice when applied to a real problem - to perform model selection for a semi-mechanistic hierarchical Bayesian model of COVID-19 transmission in South Korea involving the integration of a 200D density.

preprint2020arXiv

A unified machine learning approach to time series forecasting applied to demand at emergency departments

There were 25.6 million attendances at Emergency Departments (EDs) in England in 2019 corresponding to an increase of 12 million attendances over the past ten years. The steadily rising demand at EDs creates a constant challenge to provide adequate quality of care while maintaining standards and productivity. Managing hospital demand effectively requires an adequate knowledge of the future rate of admission. Using 8 years of electronic admissions data from two major acute care hospitals in London, we develop a novel ensemble methodology that combines the outcomes of the best performing time series and machine learning approaches in order to make highly accurate forecasts of demand, 1, 3 and 7 days in the future. Both hospitals face an average daily demand of 208 and 106 attendances respectively and experience considerable volatility around this mean. However, our approach is able to predict attendances at these emergency departments one day in advance up to a mean absolute error of +/- 14 and +/- 10 patients corresponding to a mean absolute percentage error of 6.8% and 8.6% respectively. Our analysis compares machine learning algorithms to more traditional linear models. We find that linear models often outperform machine learning methods and that the quality of our predictions for any of the forecasting horizons of 1, 3 or 7 days are comparable as measured in MAE. In addition to comparing and combining state-of-the-art forecasting methods to predict hospital demand, we consider two different hyperparameter tuning methods, enabling a faster deployment of our models without compromising performance. We believe our framework can readily be used to forecast a wide range of policy relevant indicators.

preprint2020arXiv

Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in European countries: technical description update

Following the emergence of a novel coronavirus (SARS-CoV-2) and its spread outside of China, Europe has experienced large epidemics. In response, many European countries have implemented unprecedented non-pharmaceutical interventions including case isolation, the closure of schools and universities, banning of mass gatherings and/or public events, and most recently, wide-scale social distancing including local and national lockdowns. In this technical update, we extend a semi-mechanistic Bayesian hierarchical model that infers the impact of these interventions and estimates the number of infections over time. Our methods assume that changes in the reproductive number - a measure of transmission - are an immediate response to these interventions being implemented rather than broader gradual changes in behaviour. Our model estimates these changes by calculating backwards from temporal data on observed to estimate the number of infections and rate of transmission that occurred several weeks prior, allowing for a probabilistic time lag between infection and death. In this update we extend our original model [Flaxman, Mishra, Gandy et al 2020, Report #13, Imperial College London] to include (a) population saturation effects, (b) prior uncertainty on the infection fatality ratio, (c) a more balanced prior on intervention effects and (d) partial pooling of the lockdown intervention covariate. We also (e) included another 3 countries (Greece, the Netherlands and Portugal). The model code is available at https://github.com/ImperialCollegeLondon/covid19model/ We are now reporting the results of our updated model online at https://mrc-ide.github.io/covid19estimates/ We estimated parameters jointly for all M=14 countries in a single hierarchical model. Inference is performed in the probabilistic programming language Stan using an adaptive Hamiltonian Monte Carlo (HMC) sampler.

preprint2020arXiv

Inference of COVID-19 epidemiological distributions from Brazilian hospital data

Knowing COVID-19 epidemiological distributions, such as the time from patient admission to death, is directly relevant to effective primary and secondary care planning, and moreover, the mathematical modelling of the pandemic generally. We determine epidemiological distributions for patients hospitalised with COVID-19 using a large dataset ($N=21{,}000-157{,}000$) from the Brazilian Sistema de Informação de Vigilância Epidemiológica da Gripe database. A joint Bayesian subnational model with partial pooling is used to simultaneously describe the 26 states and one federal district of Brazil, and shows significant variation in the mean of the symptom-onset-to-death time, with ranges between 11.2-17.8 days across the different states, and a mean of 15.2 days for Brazil. We find strong evidence in favour of specific probability density function choices: for example, the gamma distribution gives the best fit for onset-to-death and the generalised log-normal for onset-to-hospital-admission. Our results show that epidemiological distributions have considerable geographical variation, and provide the first estimates of these distributions in a low and middle-income setting. At the subnational level, variation in COVID-19 outcome timings are found to be correlated with poverty, deprivation and segregation levels, and weaker correlation is observed for mean age, wealth and urbanicity.

preprint2020arXiv

On the derivation of the renewal equation from an age-dependent branching process: an epidemic modelling perspective

Renewal processes are a popular approach used in modelling infectious disease outbreaks. In a renewal process, previous infections give rise to future infections. However, while this formulation seems sensible, its application to infectious disease can be difficult to justify from first principles. It has been shown from the seminal work of Bellman and Harris that the renewal equation arises as the expectation of an age-dependent branching process. In this paper we provide a detailed derivation of the original Bellman Harris process. We introduce generalisations, that allow for time-varying reproduction numbers and the accounting of exogenous events, such as importations. We show how inference on the renewal equation is easy to accomplish within a Bayesian hierarchical framework. Using off the shelf MCMC packages, we fit to South Korea COVID-19 case data to estimate reproduction numbers and importations. Our derivation provides the mathematical fundamentals and assumptions underpinning the use of the renewal equation for modelling outbreaks.

preprint2020arXiv

Semi-Mechanistic Bayesian Modeling of COVID-19 with Renewal Processes

We propose a general Bayesian approach to modeling epidemics such as COVID-19. The approach grew out of specific analyses conducted during the pandemic, in particular an analysis concerning the effects of non-pharmaceutical interventions (NPIs) in reducing COVID-19 transmission in 11 European countries. The model parameterizes the time varying reproduction number $R_t$ through a regression framework in which covariates can e.g be governmental interventions or changes in mobility patterns. This allows a joint fit across regions and partial pooling to share strength. This innovation was critical to our timely estimates of the impact of lockdown and other NPIs in the European epidemics, whose validity was borne out by the subsequent course of the epidemic. Our framework provides a fully generative model for latent infections and observations deriving from them, including deaths, cases, hospitalizations, ICU admissions and seroprevalence surveys. One issue surrounding our model's use during the COVID-19 pandemic is the confounded nature of NPIs and mobility. We use our framework to explore this issue. We have open sourced an R package epidemia implementing our approach in Stan. Versions of the model are used by New York State, Tennessee and Scotland to estimate the current situation and make policy decisions.