Source author record

Gregory F. Cooper

Gregory F. Cooper appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence · Machine Learning · Methodology · Applications · Computational Engineering, Finance, and Science · Molecular Networks · Neural and Evolutionary Computing

Catalog footprint

What is connected

24 works

7 topics

4 close collaborators


preprint · 2026 · arXiv

Outlier detection for patient monitoring and alerting

We develop and evaluate a data-driven approach for detecting unusual (anomalous) patient-management decisions using past patient cases stored in electronic health records (EHRs). Our hypothesis is that a patient-management decision that is unusual with respect to past patient care may be due to an error and that it is worthwhile to generate an alert if such a decision is encountered. We evaluate this hypothesis using data obtained from EHRs of 4486 post-cardiac surgical patients and a subset of 222 alerts generated from the data. We base the evaluation on the opinions of a panel of experts. The results of the study support our hypothesis that outlier-based alerting can lead to promising true alert rates. We observed true alert rates that ranged from 25% to 66% for a variety of patient-management actions, with 66% corresponding to the strongest outliers.

preprint · 2022 · arXiv

The m-connecting imset and factorization for ADMG models

Directed acyclic graph (DAG) models have become widely studied and applied in statistics and machine learning; indeed, their simplicity facilitates efficient procedures for learning and inference. Unfortunately, these models are not closed under marginalization, making them poorly equipped to handle systems with latent confounding. Acyclic directed mixed graph (ADMG) models characterize margins of DAG models, making them far better suited to handle such systems. However, ADMG models have not seen widespread use due to their complexity and a shortage of statistical tools for their analysis. In this paper, we introduce the m-connecting imset, which provides an alternative representation for the independence models induced by ADMGs. Furthermore, we define the m-connecting factorization criterion for ADMG models, characterized by a single equation, and prove its equivalence to the global Markov property. The m-connecting imset and factorization criterion provide two new statistical tools for learning and inference with ADMG models. We demonstrate the usefulness of these tools by formulating and evaluating a consistent scoring criterion with a closed-form solution.

preprint · 2020 · arXiv

Learning Latent Causal Structures with a Redundant Input Neural Network

Most causal discovery algorithms find causal structure among a set of observed variables. Learning the causal structure among latent variables remains an important open problem, particularly when using high-dimensional data. In this paper, we address a problem for which it is known that inputs cause outputs, and these causal relationships are encoded by a causal network among an unknown number of latent variables. We developed a deep learning model, which we call a redundant input neural network (RINN), with a modified architecture and a regularized objective function to find causal relationships between input, hidden, and output variables. More specifically, our model allows input variables to directly interact with all latent variables in a neural network to influence what information the latent variables should encode in order to generate the output variables accurately. In this setting, the direct connections between input and latent variables make the latent variables partially interpretable; furthermore, the connectivity among the latent variables in the neural network serves to model their potential causal relationships to each other and to the output variables. A series of simulation experiments provides support that the RINN method can successfully recover latent causal structure between input and output variables.

preprint · 2015 · arXiv

Binary Classifier Calibration using an Ensemble of Near Isotonic Regression Models

Learning accurate probabilistic models from data is crucial in many practical tasks in data mining. In this paper we present a new non-parametric calibration method called ensemble of near isotonic regression (ENIR). The method can be considered an extension of BBQ, a recently proposed calibration method, as well as of the commonly used calibration method based on isotonic regression. ENIR is designed to address the key limitation of isotonic regression, which is the monotonicity assumption of the predictions. Similar to BBQ, the method post-processes the output of a binary classifier to obtain calibrated probabilities. Thus it can be combined with many existing classification models. We demonstrate the performance of ENIR on synthetic and real datasets for the commonly used binary classification models. Experimental results show that the method outperforms several common binary classifier calibration methods. In particular, on the real data, ENIR commonly performs statistically significantly better than the other methods, and never worse. It is able to improve the calibration power of classifiers, while retaining their discrimination power. The method is also computationally tractable for large-scale datasets, as it runs in O(N log N) time, where N is the number of samples.

preprint · 2014 · arXiv

Binary Classifier Calibration: Bayesian Non-Parametric Approach

A set of probabilistic predictions is well calibrated if the events that are predicted to occur with probability p do in fact occur about a fraction p of the time. Well-calibrated predictions are particularly important when machine learning models are used in decision analysis. This paper presents two new non-parametric methods for calibrating outputs of binary classification models: a method based on Bayes optimal selection and a method based on Bayesian model averaging. The advantage of these methods is that they are independent of the algorithm used to learn a predictive model, and they can be applied in a post-processing step, after the model is learned. This makes them applicable to a wide variety of machine learning models and methods. These calibration methods, as well as other methods, are tested on a variety of datasets in terms of both discrimination and calibration performance. The results show the methods either outperform or are comparable in performance to the state-of-the-art calibration methods.

preprint · 2014 · arXiv

Binary Classifier Calibration: Non-parametric approach

Accurate calibration of learned probabilistic predictive models is critical for many practical prediction and decision-making tasks. There are two main categories of methods for building calibrated classifiers. One approach is to develop methods for learning probabilistic models that are well calibrated ab initio. The other approach is to use post-processing methods that transform the output of a classifier to be well calibrated, such as histogram binning, Platt scaling, and isotonic regression. One advantage of the post-processing approach is that it can be applied to any existing probabilistic classification model that was constructed using any machine-learning method. In this paper, we first introduce two measures for evaluating how well a classifier is calibrated. We prove three theorems showing that, using a simple histogram binning post-processing method, it is possible to make a classifier well calibrated while retaining its discrimination capability. Also, by casting the histogram binning method as a density-based non-parametric binary classifier, we can extend it using two simple non-parametric density estimation methods. We demonstrate the performance of the proposed calibration methods on synthetic and real datasets. Experimental results show that the proposed methods either outperform or are comparable to existing calibration methods.
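
As a rough illustration of the histogram binning post-processing step named in this abstract, the sketch below maps a raw classifier score to the observed positive rate of the bin it falls into. It is a minimal sketch and not the paper's exact procedure: the equal-width bins, the midpoint fallback for empty bins, and the function names fit_histogram_binning and calibrate are all assumptions made for illustration.

```python
def fit_histogram_binning(scores, labels, n_bins=10):
    """Estimate one calibrated probability per equal-width bin.

    scores: raw classifier outputs in [0, 1] from a held-out set.
    labels: matching 0/1 outcomes.
    Equal-width bins are an assumption of this sketch.
    """
    positives = [0] * n_bins
    totals = [0] * n_bins
    for score, label in zip(scores, labels):
        b = min(int(score * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        totals[b] += 1
        positives[b] += label
    # Empty bins fall back to the bin midpoint (an arbitrary choice for this sketch).
    return [
        positives[b] / totals[b] if totals[b] else (b + 0.5) / n_bins
        for b in range(n_bins)
    ]


def calibrate(score, bin_probs):
    """Replace a raw score by the positive rate observed in its bin."""
    n_bins = len(bin_probs)
    return bin_probs[min(int(score * n_bins), n_bins - 1)]


if __name__ == "__main__":
    held_out_scores = [0.05, 0.08, 0.95, 0.92, 0.91, 0.97]
    held_out_labels = [0, 1, 1, 1, 0, 1]
    bins = fit_histogram_binning(held_out_scores, held_out_labels, n_bins=2)
    print(bins, calibrate(0.9, bins))
```

Because the mapping is fitted only on classifier outputs and outcomes, it can be applied after training to any model that emits scores, which is the advantage of post-processing that the abstract points to.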

preprint · 2014 · arXiv

Counting Markov Blanket Structures

Learning Markov blanket (MB) structures has proven useful in performing feature selection, learning Bayesian networks (BNs), and discovering causal relationships. We present a formula for efficiently determining the number of MB structures given a target variable and a set of other variables. As expected, the number of MB structures grows exponentially. However, we show quantitatively that there are many fewer MB structures that contain the target variable than there are BN structures that contain it. In particular, the ratio of BN structures to MB structures appears to increase exponentially in the number of variables.

preprint · 2013 · arXiv

A Bayesian Method for Causal Modeling and Discovery Under Selection

This paper describes a Bayesian method for learning causal networks using samples that were selected in a non-random manner from a population of interest. Examples of data obtained by non-random sampling include convenience samples and case-control data in which a fixed number of samples with and without some condition is collected; such data are not uncommon. The paper describes a method for combining data under selection with prior beliefs in order to derive a posterior probability for a model of the causal processes that are generating the data in the population of interest. The priors include beliefs about the nature of the non-random sampling procedure. Although exact application of the method would be computationally intractable for most realistic datasets, efficient special-case and approximation methods are discussed. Finally, the paper describes how to combine learning under selection with previous methods for learning from observational and experimental data that are obtained on random samples of the population of interest. The net result is a Bayesian methodology that supports causal modeling and discovery from a rich mixture of different types of data.

preprint · 2013 · arXiv

A Bayesian Method for Constructing Bayesian Belief Networks from Databases

This paper presents a Bayesian method for constructing Bayesian belief networks from a database of cases. Potential applications include computer-assisted hypothesis testing, automated scientific discovery, and automated construction of probabilistic expert systems. Results are presented of a preliminary evaluation of an algorithm for constructing a belief network from a database of cases. We relate the methods in this paper to previous work, and we discuss open problems.

preprint · 2013 · arXiv

A Bayesian Network Classifier that Combines a Finite Mixture Model and a Naive Bayes Model

In this paper we present a new Bayesian network model for classification that combines the naive-Bayes (NB) classifier and the finite-mixture (FM) classifier. The resulting classifier aims at relaxing the strong assumptions on which the two component models are based, in an attempt to improve on their classification performance, both in terms of accuracy and in terms of calibration of the estimated probabilities. The proposed classifier is obtained by superimposing a finite mixture model on the set of feature variables of a naive Bayes model. We present experimental results that compare the predictive performance on real datasets of the new classifier with the predictive performance of the NB classifier and the FM classifier.

preprint · 2013 · arXiv

A Method for Using Belief Networks as Influence Diagrams

This paper demonstrates a method for using belief-network algorithms to solve influence diagram problems. In particular, both exact and approximation belief-network algorithms may be applied to solve influence-diagram problems. More generally, knowing the relationship between belief-network and influence-diagram problems may be useful in the design and development of more efficient influence diagram algorithms.

preprint · 2013 · arXiv

A Multivariate Discretization Method for Learning Bayesian Networks from Mixed Data

In this paper we address the problem of discretization in the context of learning Bayesian networks (BNs) from data containing both continuous and discrete variables. We describe a new technique for multivariate discretization, whereby each continuous variable is discretized while taking into account its interaction with the other variables. The technique is based on the use of a Bayesian scoring metric that scores the discretization policy for a continuous variable given a BN structure and the observed data. Since the metric is relative to the BN structure currently being evaluated, the discretization of a variable needs to be dynamically adjusted as the BN structure changes.

preprint · 2013 · arXiv

A Structurally and Temporally Extended Bayesian Belief Network Model: Definitions, Properties, and Modeling Techniques

We developed the language of Modifiable Temporal Belief Networks (MTBNs) as a structural and temporal extension of Bayesian Belief Networks (BNs) to facilitate normative temporal and causal modeling under uncertainty. In this paper we present definitions of the model, its components, and its fundamental properties. We also discuss how to represent various types of temporal knowledge, with an emphasis on hybrid temporal-explicit time modeling, dynamic structures, avoiding causal temporal inconsistencies, and dealing with models that simultaneously involve actions (decisions) and causal and non-causal associations. We examine the relationships among BNs, Modifiable Belief Networks, and MTBNs with a single temporal granularity, and suggest areas of application suitable to each of them.

preprint · 2013 · arXiv

An Algorithm for Computing Probabilistic Propositions

A method for computing probabilistic propositions is presented. It assumes the availability of a single external routine for computing the probability of one instantiated variable, given a conjunction of other instantiated variables. In particular, the method allows belief network algorithms to calculate general probabilistic propositions over nodes in the network. Although in the worst case the time complexity of the method is exponential in the size of a query, it is polynomial in the size of a number of common types of queries.

preprint · 2013 · arXiv

An Empirical Evaluation of a Randomized Algorithm for Probabilistic Inference

In recent years, researchers in decision analysis and artificial intelligence (AI) have used Bayesian belief networks to build models of expert opinion. Using standard methods drawn from the theory of computational complexity, workers in the field have shown that the problem of probabilistic inference in belief networks is difficult and almost certainly intractable. KNET, a software environment for constructing knowledge-based systems within the axiomatic framework of decision theory, contains a randomized approximation scheme for probabilistic inference. The algorithm can, in many circumstances, perform efficient approximate inference in large and richly interconnected models of medical diagnosis. Unlike previously described stochastic algorithms for probabilistic inference, the randomized approximation scheme computes a priori bounds on running time by analyzing the structure and contents of the belief network. In this article, we describe a randomized algorithm for probabilistic inference and analyze its performance mathematically. Then, we devote the major portion of the paper to a discussion of the algorithm's empirical behavior. The results indicate that the generation of good trials (that is, trials whose distribution closely matches the true distribution), rather than the computation of numerous mediocre trials, dominates the performance of stochastic simulation. Key words: probabilistic inference, belief networks, stochastic simulation, computational complexity theory, randomized algorithms.

preprint · 2013 · arXiv

An Evaluation of an Algorithm for Inductive Learning of Bayesian Belief Networks Usin

Bayesian learning of belief networks (BLN) is a method for automatically constructing belief networks (BNs) from data using search and Bayesian scoring techniques. K2 is a particular instantiation of the method that implements a greedy search strategy. To evaluate the accuracy of K2, we randomly generated a number of BNs and for each of those we simulated data sets. K2 was then used to induce the generating BNs from the simulated data. We examine the performance of the program, and the factors that influence it. We also present a simple BN model, developed from our results, which predicts the accuracy of K2, when given various characteristics of the data set.
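
The abstract does not spell out the Bayesian score that K2's greedy search maximizes. As a hedged illustration, the sketch below computes the per-node K2 metric in the form commonly stated in the literature: for each parent configuration j, the term (r - 1)! / (N_j + r - 1)! multiplied by the product over child states k of N_jk!, evaluated in log space. The data layout (a list of dicts) and the function name log_k2_local_score are assumptions for illustration, not part of the evaluated program.

```python
from collections import Counter
from math import lgamma


def log_k2_local_score(data, child, parents, arity):
    """Log of the K2 metric for one node under a candidate parent set.

    data:    list of dicts mapping variable name -> discrete state (assumed layout).
    child:   name of the node being scored.
    parents: list of candidate parent names (may be empty).
    arity:   number of states r the child can take.
    """
    n_jk = Counter()  # counts per (parent configuration, child state)
    n_j = Counter()   # counts per parent configuration
    for row in data:
        j = tuple(row[p] for p in parents)
        n_jk[(j, row[child])] += 1
        n_j[j] += 1
    score = 0.0
    # log[(r - 1)! / (N_j + r - 1)!]; unobserved configurations contribute 0.
    for count in n_j.values():
        score += lgamma(arity) - lgamma(count + arity)
    # log of the product of N_jk!; zero counts contribute log(0!) = 0.
    for count in n_jk.values():
        score += lgamma(count + 1)
    return score


if __name__ == "__main__":
    # Toy data in which y always copies x.
    rows = [{"x": 0, "y": 0}, {"x": 0, "y": 0}, {"x": 1, "y": 1}, {"x": 1, "y": 1}]
    print(log_k2_local_score(rows, "y", ["x"], arity=2))
    print(log_k2_local_score(rows, "y", [], arity=2))
```

In a greedy search of this kind, a candidate parent would be added to a node only if it raises this local score; on the toy data, adding x as a parent of y does.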

preprint · 2013 · arXiv

Bounded Conditioning: Flexible Inference for Decisions under Scarce Resources

We introduce a graceful approach to probabilistic inference called bounded conditioning. Bounded conditioning monotonically refines the bounds on posterior probabilities in a belief network with computation, and converges on final probabilities of interest with the allocation of a complete resource fraction. The approach allows a reasoner to exchange arbitrary quantities of computational resource for incremental gains in inference quality. As such, bounded conditioning holds promise as a useful inference technique for reasoning under the general conditions of uncertain and varying reasoning resources. The algorithm solves a probabilistic bounding problem in complex belief networks by breaking the problem into a set of mutually exclusive, tractable subproblems and ordering their solution by the expected effect that each subproblem will have on the final answer. We introduce the algorithm, discuss its characterization, and present its performance on several belief networks, including a complex model for reasoning about problems in intensive-care medicine.

preprint · 2013 · arXiv

Causal Discovery from a Mixture of Experimental and Observational Data

This paper describes a Bayesian method for combining an arbitrary mixture of observational and experimental data in order to learn causal Bayesian networks. Observational data are passively observed. Experimental data, such as those produced by randomized controlled trials, result from the experimenter manipulating one or more variables (typically randomly) and observing the states of other variables. The paper presents a Bayesian method for learning the causal structure and parameters of the underlying causal process that is generating the data, given that (1) the data contain a mixture of observational and experimental case records, and (2) the causal process is modeled as a causal Bayesian network. This learning method was applied using as input various mixtures of experimental and observational data that were generated from the ALARM causal Bayesian network. In these experiments, the absolute and relative quantities of experimental and observational data were varied systematically. For each of these training datasets, the learning method was applied to predict the causal structure and to estimate the causal parameters that exist among randomly selected pairs of nodes in ALARM that are not confounded. The paper reports how these structure predictions and parameter estimates compare with the true causal structures and parameters as given by the ALARM network.

preprint · 2013 · arXiv

KNET: Integrating Hypermedia and Bayesian Modeling

KNET is a general-purpose shell for constructing expert systems based on belief networks and decision networks. Such networks serve as graphical representations for decision models, in which the knowledge engineer must define clearly the alternatives, states, preferences, and relationships that constitute a decision basis. KNET contains a knowledge-engineering core written in Object Pascal and an interface that tightly integrates HyperCard, a hypertext authoring tool for the Apple Macintosh computer, into a novel expert-system architecture. Hypertext and hypermedia have become increasingly important in the storage, management, and retrieval of information. In broad terms, hypermedia deliver heterogeneous bits of information in dynamic, extensively cross-referenced packages. The resulting KNET system features a coherent probabilistic scheme for managing uncertainty, an object-oriented graphics editor for drawing and manipulating decision networks, and HyperCard's potential for quickly constructing flexible and friendly user interfaces. We envision KNET as a useful prototyping tool for our ongoing research on a variety of Bayesian reasoning problems, including tractable representation, inference, and explanation.

preprint · 2013 · arXiv

Stochastic Simulation of Bayesian Belief Networks

This paper examines Bayesian belief network inference using simulation as a method for computing the posterior probabilities of network variables. Specifically, it examines the use of a method described by Henrion, called logic sampling, and a method described by Pearl, called stochastic simulation. We first review the conditions under which logic sampling is computationally infeasible. Such cases motivated the development of Pearl's stochastic simulation algorithm. We have found that this stochastic simulation algorithm, when applied to certain networks, leads to much slower than expected convergence to the true posterior probabilities. This behavior is a result of the tendency for local areas in the network to become fixed through many simulation cycles. The time required to obtain significant convergence can be made arbitrarily long by strengthening the probabilistic dependency between nodes. We propose the use of several forms of graph modification, such as graph pruning, arc reversal, and node reduction, in order to convert some networks into formats that are computationally more efficient for simulation.
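
To make the logic sampling idea mentioned in this abstract concrete, the sketch below forward-samples a toy two-node network A -> B, discards trials that disagree with the observed value of B, and estimates P(A | B) from the trials that remain. The network, its probabilities, and the function name logic_sampling are hypothetical and chosen only for illustration; this is not the paper's experimental setup.

```python
import random


def logic_sampling(p_a, p_b_given_a, evidence_b, n_trials=100_000, seed=0):
    """Estimate P(A = True | B = evidence_b) for a toy network A -> B.

    p_a:          prior P(A = True).
    p_b_given_a:  dict mapping the value of A to P(B = True | A).
    Returns (estimate, number of trials kept).
    """
    rng = random.Random(seed)
    kept = 0
    a_true = 0
    for _ in range(n_trials):
        # Sample every variable in topological order from its conditional distribution.
        a = rng.random() < p_a
        b = rng.random() < p_b_given_a[a]
        if b != evidence_b:
            continue  # discard trials that are inconsistent with the evidence
        kept += 1
        a_true += a
    estimate = a_true / kept if kept else float("nan")
    return estimate, kept


if __name__ == "__main__":
    estimate, kept = logic_sampling(0.3, {True: 0.9, False: 0.2}, evidence_b=True)
    exact = (0.3 * 0.9) / (0.3 * 0.9 + 0.7 * 0.2)
    print(f"estimate={estimate:.3f} exact={exact:.3f} kept={kept}")
```

The number of kept trials hints at why the method can become infeasible: when the observed evidence is improbable, almost every trial is discarded.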

preprint · 2013 · arXiv

Updating Probabilities in Multiply-Connected Belief Networks

This paper focuses on probability updates in multiply-connected belief networks. Pearl has designed the method of conditioning, which enables us to apply his algorithm for belief updates in singly-connected networks to multiply-connected belief networks by selecting a loop-cutset for the network and instantiating these loop-cutset nodes. We discuss conditions that need to be satisfied by the selected nodes. We present a heuristic algorithm for finding a loop-cutset that satisfies these conditions.

preprint · 2012 · arXiv

A Bayesian Network Scoring Metric That Is Based On Globally Uniform Parameter Priors

We introduce a new Bayesian network (BN) scoring metric called the Global Uniform (GU) metric. This metric is based on a particular type of default parameter prior. Such priors may be useful when a BN developer is not willing or able to specify domain-specific parameter priors. The GU parameter prior specifies that every prior joint probability distribution P consistent with a BN structure S is considered to be equally likely. Distribution P is consistent with S if P includes just the set of independence relations defined by S. We show that the GU metric addresses some undesirable behavior of the BDeu and K2 Bayesian network scoring metrics, which also use particular forms of default parameter priors. A closed-form formula for computing GU for special classes of BNs is derived. Efficiently computing GU for an arbitrary BN remains an open problem.

preprint · 2012 · arXiv

A theoretical study of Y structures for causal discovery

There are several existing algorithms that, under appropriate assumptions, can reliably identify a subset of the underlying causal relationships from observational data. This paper introduces the first computationally feasible score-based algorithm that can reliably identify causal relationships in the large sample limit for discrete models, while allowing for the possibility that there are unobserved common causes. In doing so, the algorithm never needs to assign scores to causal structures with unobserved common causes. The algorithm is based on the identification of so-called Y substructures within Bayesian network structures that can be learned from observational data. An example of a Y substructure is A -> C, B -> C, C -> D. After providing background on causal discovery, the paper proves the conditions under which the algorithm is reliable in the large sample limit.
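
The Y substructure pattern A -> C, B -> C, C -> D can be searched for in a learned network with a few lines of code. The sketch below is only an illustration of the pattern, not the paper's scoring algorithm. It assumes the input is a DAG given as a list of directed edges, and it further assumes the usual reading that A and B are not adjacent to each other and that D is adjacent to neither A nor B, a condition the abstract itself does not state. The function name find_y_structures is hypothetical.

```python
from itertools import combinations


def find_y_structures(edges):
    """Return tuples (a, b, c, d) with a -> c, b -> c, c -> d in a DAG.

    Assumed extra conditions (not stated in the abstract): a and b are
    non-adjacent, and d is adjacent to neither a nor b.
    """
    edge_set = set(edges)
    parents, children = {}, {}
    for u, v in edge_set:
        parents.setdefault(v, set()).add(u)
        children.setdefault(u, set()).add(v)

    def adjacent(x, y):
        return (x, y) in edge_set or (y, x) in edge_set

    found = []
    for c in sorted(parents):
        for a, b in combinations(sorted(parents[c]), 2):
            if adjacent(a, b):
                continue  # the two parents must not be directly connected
            for d in sorted(children.get(c, ())):
                if d in (a, b):
                    continue  # guard against malformed (cyclic) input
                if not adjacent(a, d) and not adjacent(b, d):
                    found.append((a, b, c, d))
    return found


if __name__ == "__main__":
    print(find_y_structures([("A", "C"), ("B", "C"), ("C", "D")]))
```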

preprint · 2012 · arXiv

Bayesian Biosurveillance of Disease Outbreaks

Early, reliable detection of disease outbreaks is a critical problem today. This paper reports an investigation of the use of causal Bayesian networks to model spatio-temporal patterns of a non-contagious disease (respiratory anthrax infection) in a population of people. The number of parameters in such a network can become enormous, if not carefully managed. Also, inference needs to be performed in real time as population data stream in. We describe techniques we have applied to address both the modeling and inference challenges. A key contribution of this paper is the explication of assumptions and techniques that are sufficient to allow the scaling of Bayesian network modeling and inference to millions of nodes for real-time surveillance applications. The results reported here provide a proof-of-concept that Bayesian networks can serve as the foundation of a system that effectively performs Bayesian biosurveillance of disease outbreaks.
