Researcher profile

David C. Woods

David C. Woods contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
8 works
0 followers
1 topic
4 close collaborators

Published work

8 published item(s)

preprint · 2021 · arXiv

Summary of effect aliasing structure (SEAS): new descriptive statistics for factorial and supersaturated designs

In the assessment and selection of supersaturated designs, the aliasing structure of interaction effects is usually ignored by traditional criteria such as $E(s^2)$-optimality. We introduce the Summary of Effect Aliasing Structure (SEAS) for assessing the aliasing structure of supersaturated designs, and other non-regular fractional factorial designs, that takes account of interaction terms and provides more detail than usual summaries such as (generalized) resolution and wordlength patterns. The new summary consists of three criteria, abbreviated as MAP: (1) Maximum dependency aliasing pattern; (2) Average square aliasing pattern; and (3) Pairwise dependency ratio. These criteria provide insight when traditional criteria fail to differentiate between designs. We theoretically study the relationship between the MAP criteria and traditional quantities, and demonstrate the use of SEAS for comparing some example supersaturated designs, including designs suggested in the literature. We also propose a variant of SEAS to measure the aliasing structure for individual columns of a design, and use it to choose assignments of factors to columns for an $E(s^2)$-optimal design.
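
For readers unfamiliar with the traditional criterion mentioned in this abstract, the following is a minimal sketch of how $E(s^2)$ is usually computed for a two-level design coded as +1/-1: the average of the squared off-diagonal entries of X'X over all pairs of columns. It is not the SEAS method from the preprint. The function name `e_s2` and the toy 4-run design are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def e_s2(design):
    """Return E(s^2) for a two-level design given as a list of runs (+1/-1).

    E(s^2) is the mean of s_ij^2 over all column pairs i < j, where s_ij is
    the inner product of columns i and j (an off-diagonal entry of X'X).
    """
    columns = list(zip(*design))  # transpose the runs into columns
    pairs = list(combinations(range(len(columns)), 2))
    total = 0
    for i, j in pairs:
        s_ij = sum(a * b for a, b in zip(columns[i], columns[j]))
        total += s_ij ** 2
    return total / len(pairs)


# Hypothetical toy design: 4 runs, 4 factors (more factors than n - 1 = 3,
# so it is supersaturated). The fourth column deliberately repeats the first.
toy_design = [
    [1, 1, 1, 1],
    [1, -1, -1, 1],
    [-1, 1, -1, -1],
    [-1, -1, 1, -1],
]

print(e_s2(toy_design))
```

In this toy design only the pair formed by the first and fourth columns is non-orthogonal, so a single fully aliased pair drives the whole value. The criterion looks at main-effect columns only, which is the limitation the abstract says SEAS is meant to address.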

preprint · 2020 · arXiv

Nonmyopic and pseudo-nonmyopic approaches to optimal sequential design in the presence of covariates

In sequential experiments, subjects become available for the study over a period of time, and covariates are often measured at the time of arrival. We consider the setting where the sample size is fixed but covariate values are unknown until subjects enrol. Given a model for the outcome, a sequential optimal design approach can be used to allocate treatments to minimize the variance of the treatment effect. We extend existing optimal design methodology so it can be used within a nonmyopic framework, where treatment allocation for the current subject depends not only on the treatments and covariates of the subjects already enrolled in the study, but also the impact of possible future treatment assignments. The nonmyopic approach is computationally expensive as it requires recursive formulae. We propose a pseudo-nonmyopic approach which has a similar aim to the nonmyopic approach, but does not involve recursion and instead relies on simulations of future possible decisions. Our simulation studies show that the myopic approach is the most efficient for the logistic model case with a single binary covariate and binary treatment.

preprint · 2016 · arXiv

Bayesian design of experiments for generalised linear models and dimensional analysis with industrial and scientific application

The design of an experiment can always be considered at least implicitly Bayesian, with prior knowledge used informally to aid decisions such as the variables to be studied and the choice of a plausible relationship between the explanatory variables and measured responses. Bayesian methods allow uncertainty in these decisions to be incorporated into design selection through prior distributions that encapsulate information available from scientific knowledge or previous experimentation. Further, a design may be explicitly tailored to the aim of the experiment through a decision-theoretic approach using an appropriate loss function. We review the area of decision-theoretic Bayesian design, with particular emphasis on recent advances in computational methods. For many problems arising in industry and science, experiments result in a discrete response that is well described by a member of the class of generalised linear models. We describe how Gaussian process emulation, commonly used in computer experiments, can play an important role in facilitating Bayesian design for realistic problems. A main focus is the combination of Gaussian process regression to approximate the expected loss with cyclic descent (coordinate exchange) optimisation algorithms to allow optimal designs to be found for previously infeasible problems. We also present the first optimal design results for statistical models formed from dimensional analysis, a methodology widely employed in the engineering and physical sciences to produce parsimonious and interpretable models. Using the famous paper helicopter experiment, we show the potential for the combination of Bayesian design, generalised linear models and dimensional analysis to produce small but informative experiments.

preprint · 2016 · arXiv

Emulation of multivariate simulators using thin-plate splines with application to atmospheric dispersion

It is often desirable to build a statistical emulator of a complex computer simulator in order to perform analysis which would otherwise be computationally infeasible. We propose methodology to model multivariate output from a computer simulator taking into account output structure in the responses. The utility of this approach is demonstrated by applying it to a chemical and biological hazard prediction model. Predicting the hazard area that results from an accidental or deliberate chemical or biological release is imperative in civil and military planning and also in emergency response. The hazard area resulting from such a release is highly structured in space and we therefore propose the use of a thin-plate spline to capture the spatial structure and fit a Gaussian process emulator to the coefficients of the resultant basis functions. We compare and contrast four different techniques for emulating multivariate output: dimension-reduction using (i) a fully Bayesian approach with a principal component basis, (ii) a fully Bayesian approach with a thin-plate spline basis, assuming that the basis coefficients are independent, and (iii) a "plug-in" Bayesian approach with a thin-plate spline basis and a separable covariance structure; and (iv) a functional data modeling approach using a tensor-product (separable) Gaussian process. We develop methodology for the two thin-plate spline emulators and demonstrate that these emulators significantly outperform the principal component emulator. Further, the separable thin-plate spline emulator, which accounts for the dependence between basis coefficients, provides substantially more realistic quantification of uncertainty, and is also computationally more tractable, allowing fast emulation. For high resolution output data, it also offers substantial predictive and computational advantages over the tensor-product Gaussian process emulator.

preprint · 2016 · arXiv

Model selection via Bayesian information capacity designs for generalised linear models

The first investigation is made of designs for screening experiments where the response variable is approximated by a generalised linear model. A Bayesian information capacity criterion is defined for the selection of designs that are robust to the form of the linear predictor. For binomial data and logistic regression, the effectiveness of these designs for screening is assessed through simulation studies using all-subsets regression and model selection via maximum penalised likelihood and a generalised information criterion. For Poisson data and log-linear regression, similar assessments are made using maximum likelihood and the Akaike information criterion for minimally-supported designs that are constructed analytically. The results show that effective screening, that is, high power with moderate type I error rate and false discovery rate, can be achieved through suitable choices for the number of design support points and experiment size. Logistic regression is shown to present a more challenging problem than log-linear regression. Some areas for future work are also indicated.

preprint · 2015 · arXiv

Design of Experiments for Screening

The aim of this paper is to review methods of designing screening experiments, ranging from designs originally developed for physical experiments to those especially tailored to experiments on numerical models. The strengths and weaknesses of the various designs for screening variables in numerical models are discussed. First, classes of factorial designs for experiments to estimate main effects and interactions through a linear statistical model are described, specifically regular and nonregular fractional factorial designs, supersaturated designs and systematic fractional replicate designs. Generic issues of aliasing, bias and cancellation of factorial effects are discussed. Second, group screening experiments are considered including factorial group screening and sequential bifurcation. Third, random sampling plans are discussed including Latin hypercube sampling and sampling plans to estimate elementary effects. Fourth, a variety of modelling methods commonly employed with screening designs are briefly described. Finally, a novel study demonstrates six screening methods on two frequently-used exemplars, and their performances are compared.
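
As an illustration of one of the random sampling plans named in this abstract, the sketch below generates a basic Latin hypercube sample on the unit cube: each dimension is split into equal-width strata, and every stratum is used exactly once per dimension. It is a generic textbook construction, not the specific procedure studied in the paper. The function name `latin_hypercube` and its parameters are illustrative assumptions.

```python
import random


def latin_hypercube(n_points, n_dims, seed=None):
    """Return n_points points in [0, 1)^n_dims forming a Latin hypercube.

    For each dimension, the strata 0..n_points-1 are randomly permuted and
    one value is drawn uniformly inside each stratum.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)  # independent random permutation per dimension
        columns.append([(k + rng.random()) / n_points for k in strata])
    # Transpose so that each element of the result is one sample point.
    return [list(point) for point in zip(*columns)]


sample = latin_hypercube(n_points=5, n_dims=3, seed=1)
for point in sample:
    print([round(x, 3) for x in point])
```

Projecting the sample onto any single input gives one point in each of the five strata, which is the property that makes such plans attractive when only a few of many inputs turn out to be active.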

preprint · 2014 · arXiv

Designs for generalized linear models with random block effects via information matrix approximations

The selection of optimal designs for generalized linear mixed models is complicated by the fact that the Fisher information matrix, on which most optimality criteria depend, is computationally expensive to evaluate. Our focus is on the design of experiments for likelihood estimation of parameters in the conditional model. We provide two novel approximations that substantially reduce the computational cost of evaluating the information matrix by complete enumeration of response outcomes, or Monte Carlo approximations thereof: (i) an asymptotic approximation which is accurate when there is strong dependence between observations in the same block; (ii) an approximation via Kriging interpolators. For logistic random intercept models, we show how interpolation can be especially effective for finding pseudo-Bayesian designs that incorporate uncertainty in the values of the model parameters. The new results are used to provide the first evaluation of the efficiency, for estimating conditional models, of optimal designs from closed-form approximations to the information matrix derived from marginal models. It is found that correcting for the marginal attenuation of parameters in binary-response models yields much improved designs, typically with very high efficiencies. However, in some experiments exhibiting strong dependence, designs for marginal models may still be inefficient for conditional modelling. Our asymptotic results provide some theoretical insights into why such inefficiencies occur.