Researcher profile

Robert Kohn

Robert Kohn contributes to research discovery and scholarly infrastructure.


Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
14 works
0 followers
8 topics
4 close collaborators


Published work

14 published items

Preprint, 2024, arXiv

The Contextual Lasso: Sparse Linear Models via Deep Neural Networks

Sparse linear models are one of several core tools for interpretable machine learning, a field of emerging importance as predictive models permeate decision-making in many domains. Unfortunately, sparse linear models are far less flexible as functions of their input features than black-box models like deep neural networks. With this capability gap in mind, we study a not-uncommon situation where the input features dichotomize into two groups: explanatory features, which are candidates for inclusion as variables in an interpretable model, and contextual features, which select from the candidate variables and determine their effects. This dichotomy leads us to the contextual lasso, a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features. The fitting process learns this function nonparametrically via a deep neural network. To attain sparse coefficients, we train the network with a novel lasso regularizer in the form of a projection layer that maps the network's output onto the space of $\ell_1$-constrained linear models. An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso without sacrificing the predictive power of a standard deep neural network.
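The abstract describes a projection onto the space of $\ell_1$-constrained models. As a rough illustration of that single ingredient, the sketch below uses the standard sort-based Euclidean projection of a coefficient vector onto an $\ell_1$ ball. It is not the paper's projection layer: the function name `project_l1_ball` and the `radius` parameter are hypothetical, and a real layer would sit inside a neural network and be differentiable.

```python
import math


def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {w : sum(|w_i|) <= radius}.

    Illustrative helper (name and signature are assumptions, not from the paper).
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    abs_v = [abs(x) for x in v]
    if sum(abs_v) <= radius:
        return list(v)  # already inside the ball, nothing to do

    # Find the soft-threshold level theta from the sorted magnitudes.
    sorted_abs = sorted(abs_v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(sorted_abs, start=1):
        cumulative += u_k
        candidate = (cumulative - radius) / k
        if u_k - candidate > 0:
            theta = candidate  # keep the threshold from the largest valid k

    # Soft-threshold each coordinate; small coefficients become exactly zero.
    return [math.copysign(max(a - theta, 0.0), x) for a, x in zip(abs_v, v)]


coefficients = project_l1_ball([3.0, -1.0, 0.5], radius=2.0)
```

The projection shrinks every coefficient by the same amount and sets the small ones exactly to zero, which is how an $\ell_1$ constraint produces sparsity.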

Preprint, 2022, arXiv

An energy minimization approach to twinning with variable volume fraction

In materials that undergo martensitic phase transformation, macroscopic loading often leads to the creation and/or rearrangement of elastic domains. This paper considers an example involving a single-crystal slab made from two martensite variants. When the slab is made to bend, the two variants form a characteristic microstructure that we like to call "twinning with variable volume fraction." Two 1996 papers by Chopra et al. explored this example using bars made from InTl, providing considerable detail about the microstructures they observed. Here we offer an energy-minimization-based model that is motivated by their account. It uses geometrically linear elasticity, and treats the phase boundaries as sharp interfaces. For simplicity, rather than model the experimental forces and boundary conditions exactly, we consider certain Dirichlet or Neumann boundary conditions whose effect is to require bending. This leads to certain nonlinear (and nonconvex) variational problems that represent the minimization of elastic plus surface energy (and the work done by the load, in the case of a Neumann boundary condition). Our results identify how the minimum value of each variational problem scales with respect to the surface energy density. The results are established by proving upper and lower bounds that scale the same way. The upper bounds are ansatz-based, providing full details about some (nearly) optimal microstructures. The lower bounds are ansatz-free, so they explain why no other arrangement of the two phases could be significantly better.

Preprint, 2022, arXiv

Robust Particle Density Tempering for State Space Models

Density tempering (also called density annealing) is a sequential Monte Carlo approach to Bayesian inference for general state models; it is an alternative to Markov chain Monte Carlo. When applied to state space models, it moves a collection of parameters and latent states (which are called particles) through a number of stages, with each stage having its own target distribution. The particles are initially generated from a distribution that is easy to sample from, e.g. the prior; the target at the final stage is the posterior distribution. Tempering is usually carried out either in batch mode, involving all the data at each stage, or sequentially with observations added at each stage, which is called data tempering. Our paper proposes efficient Markov moves for generating the parameters and states for each stage of particle based density tempering. This allows the proposed SMC methods to increase (scale up) the number of parameters and states that can be handled. Most of the current literature uses a pseudo-marginal Markov move step with the states integrated out, and the parameters generated by a random walk proposal; although this strategy is general, it is very inefficient when the states or parameters are high dimensional. We also build on the work of Dufays (2016) and make data tempering more robust to outliers and structural changes for models with intractable likelihoods by adding batch tempering at each stage. The performance of the proposed methods is evaluated using univariate stochastic volatility models with outliers and structural breaks and high dimensional factor stochastic volatility models having both many parameters and many latent states.

Preprint, 2020, arXiv

Gaussian variational approximation for high-dimensional state space models

Our article considers a Gaussian variational approximation of the posterior density in a high-dimensional state space model. The variational parameters to be optimized are the mean vector and the covariance matrix of the approximation. The number of parameters in the covariance matrix grows as the square of the number of model parameters, so it is necessary to find simple yet effective parameterizations of the covariance structure when the number of model parameters is large. We approximate the joint posterior distribution over the high-dimensional state vectors by a dynamic factor model, having Markovian time dependence and a factor covariance structure for the states. This gives a reduced description of the dependence structure for the states, as well as a temporal conditional independence structure similar to that in the true posterior. The usefulness of the approach is illustrated for prediction in two high-dimensional applications that are challenging for Markov chain Monte Carlo sampling. The first is a spatio-temporal model for the spread of the Eurasian Collared-Dove across North America; the second is a Wishart-based multivariate stochastic volatility model for financial returns.

Preprint, 2020, arXiv

Identifying relationships between cognitive processes across tasks, contexts, and time

It is commonly assumed that a specific testing occasion (task, design, procedure, etc.) provides insights that generalise beyond that occasion. This assumption is rarely tested carefully against data. We develop a statistically principled method to directly estimate the correlation between latent components of cognitive processing across tasks, contexts, and time. This method simultaneously estimates individual-participant parameters of a cognitive model at each testing occasion, group-level parameters representing across-participant parameter averages and variances, and across-task correlations. The approach provides a natural way to "borrow" strength across testing occasions, which can increase the precision of parameter estimates across all testing occasions. Two example applications demonstrate that the method is practical in standard designs. The examples, and a simulation study, also provide evidence about the reliability and validity of parameter estimates from the linear ballistic accumulator model. We conclude by highlighting the potential of the parameter-correlation method to provide an "assumption-light" tool for estimating the relatedness of cognitive processes across tasks, contexts, and time.

Preprint, 2020, arXiv

New Estimation Approaches for the Hierarchical Linear Ballistic Accumulator Model

The Linear Ballistic Accumulator (Brown & Heathcote, 2008) model is used as a measurement tool to answer questions about applied psychology. The analyses based on this model depend upon the model selected and its estimated parameters. Modern approaches use hierarchical Bayesian models and Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of the parameters. Although there are several approaches available for model selection, they are all based on the posterior samples produced via MCMC, which means that the model selection inference inherits the properties of the MCMC sampler. To improve on current approaches to LBA inference we propose two methods that are based on recent advances in particle MCMC methodology; they are qualitatively different from existing approaches as well as from each other. The first approach is particle Metropolis-within-Gibbs; the second approach is density tempered sequential Monte Carlo. Both new approaches provide very efficient sampling and can be applied to estimate the marginal likelihood, which provides Bayes factors for model selection. The first approach is usually faster. The second approach provides a direct estimate of the marginal likelihood, uses the first approach in its Markov move step and is very efficient to parallelize on high performance computers. The new methods are illustrated by applying them to simulated and real data, and through pseudocode. The code implementing the methods is freely available.

Preprint, 2020, arXiv

Spectral Subsampling MCMC for Stationary Time Series

Bayesian inference using Markov Chain Monte Carlo (MCMC) on large datasets has developed rapidly in recent years. However, the underlying methods are generally limited to relatively simple settings where the data have specific forms of independence. We propose a novel technique for speeding up MCMC for time series data by efficient data subsampling in the frequency domain. For several challenging time series models, we demonstrate a speedup of up to two orders of magnitude while incurring negligible bias compared to MCMC on the full dataset. We also propose alternative control variates for variance reduction based on data grouping and coreset constructions.

Preprint, 2020, arXiv

Subsampling Sequential Monte Carlo for Static Bayesian Models

We show how to speed up Sequential Monte Carlo (SMC) for Bayesian inference in large data problems by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions, beginning with a distribution that is easy to sample from such as the prior and ending with the posterior distribution. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel; this is typically the most computationally expensive part, particularly when the dataset is large. It is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full data SMC, which is an advantage for parallel computation. Second, we use a Metropolis within Gibbs kernel with two conditional updates. A Hamiltonian Monte Carlo update makes distant moves for the model parameters, and a block pseudo-marginal proposal is used for the particles corresponding to the auxiliary variables for the data subsampling. We demonstrate both the usefulness and limitations of the methodology for estimating four generalized linear models and a generalized additive model with large datasets.
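The reweight, resample and move cycle described in this abstract can be sketched on a toy problem. The example below is a simplified illustration under stated assumptions, not the paper's method: it uses the full-data likelihood rather than a subsampled estimator, a random-walk Metropolis move rather than Hamiltonian Monte Carlo, and a made-up model (unit-variance Gaussian data with a N(0, 10^2) prior on the mean). All function names are hypothetical.

```python
import math
import random


def log_likelihood(theta, data):
    # Toy model (assumption): y_i ~ N(theta, 1), constants dropped.
    return -0.5 * sum((y - theta) ** 2 for y in data)


def log_prior(theta, prior_sd=10.0):
    # Toy prior (assumption): theta ~ N(0, prior_sd^2), constants dropped.
    return -0.5 * (theta / prior_sd) ** 2


def tempered_smc(data, n_particles=500, n_stages=20, step=0.5, seed=0):
    rng = random.Random(seed)
    # Start from the prior, which is easy to sample from.
    particles = [rng.gauss(0.0, 10.0) for _ in range(n_particles)]
    temps = [s / n_stages for s in range(n_stages + 1)]
    for prev, cur in zip(temps, temps[1:]):
        # 1. Reweight: incremental weight is likelihood^(cur - prev).
        logw = [(cur - prev) * log_likelihood(th, data) for th in particles]
        top = max(logw)
        weights = [math.exp(lw - top) for lw in logw]
        # 2. Resample: multinomial resampling proportional to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        # 3. Move: one random-walk Metropolis step per particle,
        #    targeting prior * likelihood^cur, to restore diversity.
        moved = []
        for th in particles:
            prop = th + rng.gauss(0.0, step)
            log_ratio = (log_prior(prop) + cur * log_likelihood(prop, data)
                         - log_prior(th) - cur * log_likelihood(th, data))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                th = prop
            moved.append(th)
        particles = moved
    return particles


data = [1.8, 2.1, 2.4, 1.9, 2.3]
posterior_draws = tempered_smc(data)
posterior_mean = sum(posterior_draws) / len(posterior_draws)
```

For this conjugate toy model the exact posterior mean is 10.5 / 5.01, about 2.096, so `posterior_mean` can be checked against a known answer. In the sketch the move step evaluates the full likelihood for every particle, which is the cost that grows with the size of the dataset.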

Preprint, 2020, arXiv

The block-Poisson estimator for optimally tuned exact subsampling MCMC

Speeding up Markov Chain Monte Carlo (MCMC) for datasets with many observations by data subsampling has recently received considerable attention. A pseudo-marginal MCMC method is proposed that estimates the likelihood by data subsampling using a block-Poisson estimator. The estimator is a product of Poisson estimators, allowing us to update a single block of subsample indicators in each MCMC iteration so that a desired correlation is achieved between the logs of successive likelihood estimates. This is important since pseudo-marginal MCMC with positively correlated likelihood estimates can use substantially smaller subsamples without adversely affecting the sampling efficiency. The block-Poisson estimator is unbiased but not necessarily positive, so the algorithm runs the MCMC on the absolute value of the likelihood estimator and uses an importance sampling correction to obtain consistent estimates of the posterior mean of any function of the parameters. Our article derives guidelines to select the optimal tuning parameters for our method and shows that it compares very favourably to regular MCMC without subsampling, and to two other recently proposed exact subsampling approaches in the literature.

Preprint, 2020, arXiv

The Interaction Between Credit Constraints and Uncertainty Shocks

Can uncertainty about credit availability trigger a slowdown in real activity? This question is answered by using a novel method to identify shocks to uncertainty in access to credit. Time-variation in uncertainty about credit availability is estimated using particle Markov Chain Monte Carlo. We extract shocks to time-varying credit uncertainty and decompose it into two parts: the first captures the "pure" effect of a shock to the second moment; the second captures total effects of uncertainty including effects on the first moment. Using state-dependent local projections, we find that the "pure" effect by itself generates a sharp slowdown in real activity and the effects are largely countercyclical. We feed the estimated shocks into a flexible price real business cycle model with a collateral constraint and show that when the collateral constraint binds, an uncertainty shock about credit access is recessionary, leading to a simultaneous decline in consumption, investment, and output.

Preprint, 2019, arXiv

Efficient data augmentation for multivariate probit models with panel data: An application to general practitioner decision-making about contraceptives

This article considers the problem of estimating a multivariate probit model in a panel data setting with emphasis on sampling a high-dimensional correlation matrix and improving the overall efficiency of the data augmentation approach. We reparameterise the correlation matrix in a principled way and then carry out efficient Bayesian inference using Hamiltonian Monte Carlo. We also propose a novel antithetic variable method to generate samples from the posterior distribution of the random effects and regression coefficients, resulting in significant gains in efficiency. We apply the methodology by analysing stated preference data obtained from Australian general practitioners evaluating alternative contraceptive products. Our analysis suggests that the joint probability of discussing combinations of contraceptive products with a patient shows medical practice variation among the general practitioners, which indicates some resistance to even discuss these products, let alone recommend them.

Preprint, 2010, arXiv

A copula based approach to adaptive sampling

Our article is concerned with adaptive sampling schemes for Bayesian inference that update the proposal densities using previous iterates. We introduce a copula based proposal density which is made more efficient by combining it with antithetic variable sampling. We compare the copula based proposal to an adaptive proposal density based on a multivariate mixture of normals and an adaptive random walk Metropolis proposal. We also introduce a refinement of the random walk proposal which performs better for multimodal target distributions. We compare the sampling schemes using challenging but realistic models and priors applied to real data examples. The results show that for the examples studied, the adaptive independent Metropolis-Hastings proposals are much more efficient than the adaptive random walk proposals and that in general the copula based proposal has the best acceptance rates and lowest inefficiencies.

Preprint, 2010, arXiv

Auxiliary Particle filtering within adaptive Metropolis-Hastings Sampling

Our article deals with Bayesian inference for a general state space model with the simulated likelihood computed by the particle filter. We show empirically that the partially or fully adapted particle filters can be much more efficient than the standard particle filter, especially when the signal to noise ratio is high. This is especially important because using the particle filter within MCMC sampling is O(T^2), where T is the sample size. We also show that an adaptive independent proposal for the unknown parameters based on a mixture of normals can be much more efficient than the usual optimal random walk methods because the simulated likelihood is not continuous in the parameters and the cost of constructing a good adaptive proposal is negligible compared to the cost of evaluating the simulated likelihood. Independent Metropolis-Hastings proposals are also attractive because they are easy to run in parallel on multiple processors. The article also shows that the proposed adaptive independent Metropolis-Hastings sampler converges to the posterior distribution. We also show that the marginal likelihood of any state space model can be obtained in an efficient and unbiased manner by using the particle filter, making model comparison straightforward. Obtaining the marginal likelihood is often difficult using other methods. Finally, we prove that the simulated likelihood obtained by the auxiliary particle filter is unbiased. This result is fundamental to using the particle filter for MCMC sampling and is first obtained in a more abstract and difficult setting by Del Moral (2004). However, our proof is direct and will make the result accessible to readers.
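To make the idea of a simulated likelihood concrete, the sketch below estimates the log-likelihood of a toy linear Gaussian state space model with a standard (bootstrap) particle filter. This is an illustration under assumptions, not the auxiliary or adapted filters studied in the article: the model (AR(1) state with initial state x_1 ~ N(0, 1), Gaussian observation noise), the function names and the parameter names are all hypothetical.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def bootstrap_pf_loglik(ys, phi, sd_state, sd_obs, n_particles=1000, seed=0):
    """Log of the particle filter likelihood estimate for the toy model
    x_t = phi * x_{t-1} + N(0, sd_state^2),  y_t = x_t + N(0, sd_obs^2).
    """
    rng = random.Random(seed)
    # Assumption: the initial state is x_1 ~ N(0, 1).
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    weights = None
    loglik = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            # Resample ancestors by the previous weights, then propagate
            # each one through the state transition.
            ancestors = rng.choices(particles, weights=weights, k=n_particles)
            particles = [phi * x + rng.gauss(0.0, sd_state) for x in ancestors]
        # Weight each particle by the observation density.
        logw = [normal_logpdf(y, x, sd_obs) for x in particles]
        top = max(logw)
        weights = [math.exp(lw - top) for lw in logw]
        # The average weight estimates p(y_t | y_1, ..., y_{t-1}).
        loglik += top + math.log(sum(weights) / n_particles)
    return loglik


estimate = bootstrap_pf_loglik([0.5, 0.1, -0.3], phi=0.9, sd_state=0.5, sd_obs=1.0)
```

The product of the average weights is the likelihood estimate whose unbiasedness the abstract refers to. The function returns its logarithm, which is convenient numerically but is not itself unbiased. Each call consumes fresh random numbers, so the estimate jumps as the parameters change, which is the discontinuity the abstract mentions as a problem for random walk proposals.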

Preprint, 2010, arXiv

Computationally Efficient Estimation of Factor Multivariate Stochastic Volatility Models

An MCMC simulation method based on a two stage delayed rejection Metropolis-Hastings algorithm is proposed to estimate a factor multivariate stochastic volatility model. The first stage uses k-step iteration towards the mode, with k small, and the second stage uses an adaptive random walk proposal density. The marginal likelihood approach of Chib (1995) is used to choose the number of factors, with the posterior density ordinates approximated by a Gaussian copula. Simulation and real data applications suggest that the proposed simulation method is computationally much more efficient than the approach of Chib, Nardari and Shephard (2006). This increase in computational efficiency is particularly important in calculating marginal likelihoods because it is necessary to carry out the simulation a number of times to estimate the posterior ordinates for a given marginal likelihood. In addition to the MCMC method, the paper also proposes a fast approximate EM method to estimate the factor multivariate stochastic volatility model. The estimates from the approximate EM method are of interest in their own right, but are especially useful as initial inputs to MCMC methods, making them more efficient computationally. The methodology is illustrated using simulated and real examples.