Source author record

Dan Geiger

Dan Geiger appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Machine Learning Data Structures and Algorithms Applications Computation Computer Vision Genomics Logic in Computer Science math.CO math.PR

Catalog footprint

What is connected

25works

10topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2016arXiv

Accurate Liability Estimation Improves Power in Ascertained Case Control Studies

Linear mixed models (LMMs) have emerged as the method of choice for confounded genome-wide association studies. However, the performance of LMMs in non-randomly ascertained case-control studies deteriorates with increasing sample size. We propose a framework called LEAP (Liability Estimator As a Phenotype, https://github.com/omerwe/LEAP) that tests for association with estimated latent values corresponding to severity of phenotype, and demonstrate that this can lead to a substantial power increase.

preprint2016arXiv

Dependence and Relevance: A probabilistic view

We examine three probabilistic concepts related to the sentence "two variables have no bearing on each other". We explore the relationships between these three concepts and establish their relevance to the process of constructing similarity networks---a tool for acquiring probabilistic knowledge from human experts. We also establish a precise relationship between connectedness in Bayesian networks and relevance in probability.

preprint2015arXiv

Advances in Probabilistic Reasoning

This paper discuses multiple Bayesian networks representation paradigms for encoding asymmetric independence assertions. We offer three contributions: (1) an inference mechanism that makes explicit use of asymmetric independence to speed up computations, (2) a simplified definition of similarity networks and extensions of their theory, and (3) a generalized representation scheme that encodes more types of asymmetric independence assertions than do similarity networks.

preprint2015arXiv

Asymptotic Model Selection for Directed Networks with Hidden Variables

We extend the Bayesian Information Criterion (BIC), an asymptotic approximation for the marginal likelihood, to Bayesian networks with hidden variables. This approximation can be used to select models given large samples of data. The standard BIC as well as our extension punishes the complexity of a model according to the dimension of its parameters. We argue that the dimension of a Bayesian network with hidden variables is the rank of the Jacobian matrix of the transformation between the parameters of the network and the parameters of the observable variables. We compute the dimensions of several networks including the naive Bayes model with a hidden root node.

preprint2015arXiv

Inference Algorithms for Similarity Networks

We examine two types of similarity networks each based on a distinct notion of relevance. For both types of similarity networks we present an efficient inference algorithm that works under the assumption that every event has a nonzero probability of occurrence. Another inference algorithm is developed for type 1 similarity networks that works under no restriction, albeit less efficiently.

preprint2015arXiv

Learning Bayesian Networks: The Combination of Knowledge and Statistical Data

We describe algorithms for learning Bayesian networks from a combination of user knowledge and statistical data. The algorithms have two components: a scoring metric and a search procedure. The scoring metric takes a network structure, statistical data, and a user's prior knowledge, and returns a score proportional to the posterior probability of the network structure given the data. The search procedure generates networks for evaluation by the scoring metric. Our contributions are threefold. First, we identify two important properties of metrics, which we call event equivalence and parameter modularity. These properties have been mostly ignored, but when combined, greatly simplify the encoding of a user's prior knowledge. In particular, a user can express her knowledge-for the most part-as a single prior Bayesian network for the domain. Second, we describe local search and annealing algorithms to be used in conjunction with scoring metrics. In the special case where each node has at most one parent, we show that heuristic search can be replaced with a polynomial algorithm to identify the networks with the highest score. Third, we describe a methodology for evaluating Bayesian-network learning algorithms. We apply this approach to a comparison of metrics and search procedures.

preprint2015arXiv

Separable and transitive graphoids

We examine three probabilistic formulations of the sentence a and b are totally unrelated with respect to a given set of variables U. First, two variables a and b are totally independent if they are independent given any value of any subset of the variables in U. Second, two variables are totally uncoupled if U can be partitioned into two marginally independent sets containing a and b respectively. Third, two variables are totally disconnected if the corresponding nodes are disconnected in every belief network representation. We explore the relationship between these three formulations of unrelatedness and explain their relevance to the process of acquiring probabilistic knowledge from human experts.

preprint2014arXiv

Random Algorithms for the Loop Cutset Problem

We show how to find a minimum loop cutset in a Bayesian network with high probability. Finding such a loop cutset is the first step in Pearl's method of conditioning for inference. Our random algorithm for finding a loop cutset, called "Repeated WGuessI", outputs a minimum loop cutset, after O(c 6^k k n) steps, with probability at least 1-(1 over{6^k})^{c 6^k}), where c>1 is a constant specified by the user, k is the size of a minimum weight loop cutset, and n is the number of vertices. We also show empirically that a variant of this algorithm, called WRA, often finds a loop cutset that is closer to the minimum loop cutset than the ones found by the best deterministic algorithms known.

preprint2013arXiv

A Characterization of the Dirichlet Distribution with Application to Learning Bayesian Networks

We provide a new characterization of the Dirichlet distribution. This characterization implies that under assumptions made by several previous authors for learning belief networks, a Dirichlet prior on the parameters is inevitable.

preprint2013arXiv

A Sufficiently Fast Algorithm for Finding Close to Optimal Junction Trees

An algorithm is developed for finding a close to optimal junction tree of a given graph G. The algorithm has a worst case complexity O(c^k n^a) where a and c are constants, n is the number of vertices, and k is the size of the largest clique in a junction tree of G in which this size is minimized. The algorithm guarantees that the logarithm of the size of the state space of the heaviest clique in the junction tree produced is less than a constant factor off the optimal value. When k = O(log n), our algorithm yields a polynomial inference algorithm for Bayesian networks.

preprint2013arXiv

An Entropy-based Learning Algorithm of Bayesian Conditional Trees

This article offers a modification of Chow and Liu's learning algorithm in the context of handwritten digit recognition. The modified algorithm directs the user to group digits into several classes consisting of digits that are hard to distinguish and then constructing an optimal conditional tree representation for each class of digits instead of for each single digit as done by Chow and Liu (1968). Advantages and extensions of the new method are discussed. Related works of Wong and Wang (1977) and Wong and Poon (1989) which offer a different entropy-based learning algorithm are shown to rest on inappropriate assumptions.

preprint2013arXiv

Approximation Algorithms for the Loop Cutset Problem

We show how to find a small loop curser in a Bayesian network. Finding such a loop cutset is the first step in the method of conditioning for inference. Our algorithm for finding a loop cutset, called MGA, finds a loop cutset which is guaranteed in the worst case to contain less than twice the number of variables contained in a minimum loop cutset. We test MGA on randomly generated graphs and find that the average ratio between the number of instances associated with the algorithms' output and the number of instances associated with a minimum solution is 1.22.

preprint2013arXiv

d-Separation: From Theorems to Algorithms

An efficient algorithm is developed that identifies all independencies implied by the topology of a Bayesian network. Its correctness and maximality stems from the soundness and completeness of d-separation with respect to probability theory. The algorithm runs in time O (l E l) where E is the number of edges in the network.

preprint2013arXiv

Graphical Models and Exponential Families

We provide a classification of graphical models according to their representation as subfamilies of exponential families. Undirected graphical models with no hidden variables are linear exponential families (LEFs), directed acyclic graphical models and chain graphs with no hidden variables, including Bayesian networks with several families of local distributions, are curved exponential families (CEFs) and graphical models with hidden variables are stratified exponential families (SEFs). An SEF is a finite union of CEFs satisfying a frontier condition. In addition, we illustrate how one can automatically generate independence and non-independence constraints on the distributions over the observable variables implied by a Bayesian network with hidden variables. The relevance of these results for model selection is examined.

preprint2013arXiv

Likelihood Computations Using Value Abstractions

In this paper, we use evidence-specific value abstraction for speeding Bayesian networks inference. This is done by grouping variable values and treating the combined values as a single entity. As we show, such abstractions can exploit regularities in conditional probability distributions and also the specific values of observed variables. To formally justify value abstraction, we define the notion of safe value abstraction and devise inference algorithms that use it to reduce the cost of inference. Our procedure is particularly useful for learning complex networks with many hidden variables. In such cases, repeated likelihood computations are required for EM or other parameter optimization techniques. Since these computations are repeated with respect to the same evidence set, our methods can provide significant speedup to the learning procedure. We demonstrate the algorithm on genetic linkage problems where the use of value abstraction sometimes differentiates between a feasible and non-feasible solution.

preprint2013arXiv

On Testing Whether an Embedded Bayesian Network Represents a Probability Model

Testing the validity of probabilistic models containing unmeasured (hidden) variables is shown to be a hard task. We show that the task of testing whether models are structurally incompatible with the data at hand, requires an exponential number of independence evaluations, each of the form: "X is conditionally independent of Y, given Z." In contrast, a linear number of such evaluations is required to test a standard Bayesian network (one per vertex). On the positive side, we show that if a network with hidden variables G has a tree skeleton, checking whether G represents a given probability model P requires the polynomial number of such independence evaluations. Moreover, we provide an algorithm that efficiently constructs a tree-structured Bayesian network (with hidden variables) that represents P if such a network exists, and further recognizes when such a network does not exist.

preprint2013arXiv

On the Logic of Causal Models

This paper explores the role of Directed Acyclic Graphs (DAGs) as a representation of conditional independence relationships. We show that DAGs offer polynomially sound and complete inference mechanisms for inferring conditional independence relationships from a given causal set of such relationships. As a consequence, d-separation, a graphical criterion for identifying independencies in a DAG, is shown to uncover more valid independencies then any other criterion. In addition, we employ the Armstrong property of conditional independence to show that the dependence relationships displayed by a DAG are inherently consistent, i.e. for every DAG D there exists some probability distribution P that embodies all the conditional independencies displayed in D and none other.

preprint2013arXiv

Perfect Tree-Like Markovian Distributions

We show that if a strictly positive joint probability distribution for a set of binary random variables factors according to a tree, then vertex separation represents all and only the independence relations enclosed in the distribution. The same result is shown to hold also for multivariate strictly positive normal distributions. Our proof uses a new property of conditional independence that holds for these two classes of probability distributions.

preprint2013arXiv

Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (1997)

This is the Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, which was held in Providence, RI, August 1-3, 1997

preprint2013arXiv

Quantifier Elimination for Statistical Problems

Recent improvement on Tarski's procedure for quantifier elimination in the first order theory of real numbers makes it feasible to solve small instances of the following problems completely automatically: 1. listing all equality and inequality constraints implied by a graphical model with hidden variables. 2. Comparing graphyical models with hidden variables (i.e., model equivalence, inclusion, and overlap). 3. Answering questions about the identification of a model or portion of a model, and about bounds on quantities derived from a model. 4. Determing whether a given set of independence assertions. We discuss the foundation of quantifier elimination and demonstrate its application to these problems.

preprint2012arXiv

A Distance-Based Branch and Bound Feature Selection Algorithm

There is no known efficient method for selecting k Gaussian features from n which achieve the lowest Bayesian classification error. We show an example of how greedy algorithms faced with this task are led to give results that are not optimal. This motivates us to propose a more robust approach. We present a Branch and Bound algorithm for finding a subset of k independent Gaussian features which minimizes the naive Bayesian classification error. Our algorithm uses additive monotonic distance measures to produce bounds for the Bayesian classification error in order to exclude many feature subsets from evaluation, while still returning an optimal solution. We test our method on synthetic data as well as data obtained from gene expression profiling.

preprint2012arXiv

Asymptotic Model Selection for Naive Bayesian Networks

We develop a closed form asymptotic formula to compute the marginal likelihood of data given a naive Bayesian network model with two hidden states and binary features. This formula deviates from the standard BIC score. Our work provides a concrete example that the BIC score is generally not valid for statistical models that belong to a stratified exponential family. This stands in contrast to linear and curved exponential families, where the BIC score has been proven to provide a correct approximation for the marginal likelihood.

preprint2012arXiv

Automated Analytic Asymptotic Evaluation of the Marginal Likelihood for Latent Models

We present and implement two algorithms for analytic asymptotic evaluation of the marginal likelihood of data given a Bayesian network with hidden nodes. As shown by previous work, this evaluation is particularly hard for latent Bayesian network models, namely networks that include hidden variables, where asymptotic approximation deviates from the standard BIC score. Our algorithms solve two central difficulties in asymptotic evaluation of marginal likelihood integrals, namely, evaluation of regular dimensionality drop for latent Bayesian network models and computation of non-standard approximation formulas for singular statistics for these models. The presented algorithms are implemented in Matlab and Maple and their usage is demonstrated for marginal likelihood approximations for Bayesian networks with hidden variables.

preprint2012arXiv

Factorization of Discrete Probability Distributions

We formulate necessary and sufficient conditions for an arbitrary discrete probability distribution to factor according to an undirected graphical model, or a log-linear model, or other more general exponential models. This result generalizes the well known Hammersley-Clifford Theorem.

preprint2012arXiv

Importance Sampling via Variational Optimization

Computing the exact likelihood of data in large Bayesian networks consisting of thousands of vertices is often a difficult task. When these models contain many deterministic conditional probability tables and when the observed values are extremely unlikely even alternative algorithms such as variational methods and stochastic sampling often perform poorly. We present a new importance sampling algorithm for Bayesian networks which is based on variational techniques. We use the updates of the importance function to predict whether the stochastic sampling converged above or below the true likelihood, and change the proposal distribution accordingly. The validity of the method and its contribution to convergence is demonstrated on hard networks of large genetic linkage analysis tasks.

Dan Geiger

What is connected

Connect this record

See the researcher in context

Building this map preview

25 published item(s)

Accurate Liability Estimation Improves Power in Ascertained Case Control Studies

Dependence and Relevance: A probabilistic view

Advances in Probabilistic Reasoning

Asymptotic Model Selection for Directed Networks with Hidden Variables

Inference Algorithms for Similarity Networks

Learning Bayesian Networks: The Combination of Knowledge and Statistical Data

Separable and transitive graphoids

Random Algorithms for the Loop Cutset Problem

A Characterization of the Dirichlet Distribution with Application to Learning Bayesian Networks

A Sufficiently Fast Algorithm for Finding Close to Optimal Junction Trees

An Entropy-based Learning Algorithm of Bayesian Conditional Trees

Approximation Algorithms for the Loop Cutset Problem

d-Separation: From Theorems to Algorithms

Graphical Models and Exponential Families

Likelihood Computations Using Value Abstractions

On Testing Whether an Embedded Bayesian Network Represents a Probability Model

On the Logic of Causal Models

Perfect Tree-Like Markovian Distributions

Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (1997)

Quantifier Elimination for Statistical Problems

A Distance-Based Branch and Bound Feature Selection Algorithm

Asymptotic Model Selection for Naive Bayesian Networks

Automated Analytic Asymptotic Evaluation of the Marginal Likelihood for Latent Models

Factorization of Discrete Probability Distributions

Importance Sampling via Variational Optimization