We discuss Bayesian methods for learning Bayesian networks when data sets are incomplete. In particular, we examine asymptotic approximations for the marginal likelihood of incomplete data given a Bayesian network. We consider the Laplace approximation and the less accurate but more efficient BIC/MDL approximation. We also consider approximations proposed by Draper (1993) and Cheeseman and Stutz (1995). These approximations are as efficient as BIC/MDL, but their accuracy has not been studied in any depth. We compare the accuracy of these approximations under the assumption that the Laplace approximation is the most accurate. In experiments using synthetic data generated from discrete naive-Bayes models having a hidden root node, we find that the Cheeseman-Stutz (CS) measure is the most accurate.
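As a rough illustration of the BIC/MDL approximation mentioned above, the sketch below scores a model by its maximized log-likelihood minus a complexity penalty, log p(D | M) ≈ log p(D | θ̂, M) − (d/2) log N, where d is the number of free parameters and N the number of cases. The function name `bic_score` and its arguments are illustrative assumptions, not taken from the paper, and the sketch presumes the maximized log-likelihood has already been obtained by some other means (for incomplete data, typically an iterative procedure).

```python
import math


def bic_score(max_log_likelihood: float, num_free_params: int, num_cases: int) -> float:
    """BIC/MDL approximation to the log marginal likelihood of a model.

    max_log_likelihood: log p(D | theta_hat, M), assumed precomputed.
    num_free_params:    number of free parameters d in the model.
    num_cases:          number of cases N in the data set.
    """
    if num_cases <= 0:
        raise ValueError("num_cases must be positive")
    # Penalty grows with model dimension and (logarithmically) with sample size.
    penalty = 0.5 * num_free_params * math.log(num_cases)
    return max_log_likelihood - penalty
```

With a single case the penalty vanishes because log 1 = 0, so the score equals the maximized log-likelihood; with more cases, a model with more free parameters is penalized more heavily.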
Heckerman (1993) defined causal independence in terms of a set of temporal conditional independence statements. These statements formalized certain types of causal interaction where (1) the effect is independent of the order in which causes are introduced and (2) the impact of a single cause on the effect does not depend on what other causes have previously been applied. In this paper, we introduce an equivalent atemporal characterization of causal independence based on a functional representation of the relationship between causes and the effect. In this representation, the interaction between causes and effect can be written as a nested decomposition of functions. Causal independence can be exploited by representing this decomposition in the belief network, resulting in representations that are more efficient for inference than general causal models. We present empirical results showing the benefits of a causal-independence representation for belief-network inference.
In this paper, an empirical evaluation of three inference methods for uncertain reasoning is presented in the context of Pathfinder, a large expert system for the diagnosis of lymph-node pathology. The inference procedures evaluated are (1) Bayes' theorem, assuming evidence is conditionally independent given each hypothesis; (2) odds-likelihood updating, assuming evidence is conditionally independent given each hypothesis and given the negation of each hypothesis; and (3) an inference method related to the Dempster-Shafer theory of belief. Both expert-rating and decision-theoretic metrics are used to compare the diagnostic accuracy of the inference methods.
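The first two inference procedures listed above can be sketched in a few lines. This is a minimal illustration under the stated independence assumptions, not Pathfinder's implementation; the function names, argument layouts, and any numbers used with them are hypothetical.

```python
from math import prod


def simple_bayes(priors, likelihoods):
    """Bayes' theorem with evidence conditionally independent given each hypothesis.

    priors:      {hypothesis: p(h)} over mutually exclusive, exhaustive hypotheses.
    likelihoods: {hypothesis: [p(e_1 | h), p(e_2 | h), ...]} for the observed evidence.
    Returns the normalized posterior {hypothesis: p(h | e_1, e_2, ...)}.
    """
    unnormalized = {h: priors[h] * prod(likelihoods[h]) for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


def odds_likelihood(prior, lik_given_h, lik_given_not_h):
    """Odds-likelihood updating for a single hypothesis h.

    Assumes evidence is conditionally independent given h and given not-h.
    lik_given_h[j] is p(e_j | h); lik_given_not_h[j] is p(e_j | not h).
    Returns the posterior probability of h.
    """
    odds = prior / (1.0 - prior)
    for p_h, p_not_h in zip(lik_given_h, lik_given_not_h):
        odds *= p_h / p_not_h  # multiply by the likelihood ratio of each finding
    return odds / (1.0 + odds)
```

For two equally likely hypotheses and one finding with likelihoods 0.8 and 0.2, both functions give a posterior of 0.8 for the first hypothesis. With more than two hypotheses the two procedures can disagree, because the second adds an independence assumption given the negation of each hypothesis.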
We examine three probabilistic formulations of the sentence "a and b are totally unrelated" with respect to a given set of variables U. First, two variables a and b are totally independent if they are independent given any value of any subset of the variables in U. Second, two variables are totally uncoupled if U can be partitioned into two marginally independent sets containing a and b respectively. Third, two variables are totally disconnected if the corresponding nodes are disconnected in every belief network representation. We explore the relationship between these three formulations of unrelatedness and explain their relevance to the process of acquiring probabilistic knowledge from human experts.
A similarity network is a tool for constructing belief networks for the diagnosis of a single fault. In this paper, we examine modifications to the similarity-network representation that facilitate the construction of belief networks for the diagnosis of multiple coexisting faults.
This paper discusses multiple Bayesian network representation paradigms for encoding asymmetric independence assertions. We offer three contributions: (1) an inference mechanism that makes explicit use of asymmetric independence to speed up computations, (2) a simplified definition of similarity networks and extensions of their theory, and (3) a generalized representation scheme that encodes more types of asymmetric independence assertions than do similarity networks.
Value-of-information analyses provide a straightforward means for selecting the best next observation to make, and for determining whether it is better to gather additional information or to act immediately. Determining the next best test to perform, given a state of uncertainty about the world, requires a consideration of the value of making all possible sequences of observations. In practice, decision analysts and expert-system designers have avoided the intractability of exact computation of the value of information by relying on a myopic approximation. Myopic analyses are based on the assumption that only one additional test will be performed, even when there is an opportunity to make a large number of observations. We present a nonmyopic approximation for value of information that bypasses the traditional myopic analyses by exploiting the statistical properties of large samples.
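For contrast with the nonmyopic approximation, the traditional myopic analysis described above can be sketched as follows: the value of a single test is the expected utility of acting after seeing its outcome minus the expected utility of acting immediately. This is only a sketch of the myopic baseline, not the method the paper proposes; the function names and data layout are assumptions, and the cost of the test is ignored.

```python
def best_expected_utility(belief, utility):
    """Expected utility of the best action under a belief over states.

    belief:  {state: probability}
    utility: {action: {state: utility of taking action in state}}
    """
    return max(
        sum(belief[s] * action_utility[s] for s in belief)
        for action_utility in utility.values()
    )


def myopic_voi(prior, likelihood, utility):
    """Myopic value of information for one test (test cost not included).

    likelihood: {outcome: {state: p(outcome | state)}}
    Assumes exactly one test is performed before acting.
    """
    value_with_test = 0.0
    for outcome_likelihood in likelihood.values():
        p_outcome = sum(prior[s] * outcome_likelihood[s] for s in prior)
        if p_outcome == 0.0:
            continue  # impossible outcome contributes nothing
        posterior = {s: prior[s] * outcome_likelihood[s] / p_outcome for s in prior}
        value_with_test += p_outcome * best_expected_utility(posterior, utility)
    return value_with_test - best_expected_utility(prior, utility)
```

A perfectly discriminating test on two equally likely states, with utility 1 for the matching action and 0 otherwise, has a myopic value of 0.5; a test whose outcomes are equally likely in every state has a value of 0.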
In this paper, we extend the QMR-DT probabilistic model for the domain of internal medicine to include decisions about treatments. In addition, we describe how we can use the comprehensive decision model to construct a simpler decision model for a specific patient. In so doing, we transform the task of problem formulation to that of narrowing a larger problem.
We compare the diagnostic accuracy of three diagnostic inference models: the simple Bayes model, the multimembership Bayes model, which is isomorphic to the parallel combination function in the certainty-factor model, and a model that incorporates the noisy OR-gate interaction. The comparison is done on 20 clinicopathological conference (CPC) cases from the American Journal of Medicine, which are challenging cases describing actual patients, often with multiple disorders. We find that the distributions produced by the noisy OR model agree most closely with the gold-standard diagnoses, although substantial differences exist between the distributions and the diagnoses. In addition, we find that the multimembership Bayes model tends to significantly overestimate the posterior probabilities of diseases, whereas the simple Bayes model tends to significantly underestimate the posterior probabilities. Our results suggest that additional work to refine the noisy OR model for internal medicine will be worthwhile.
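The noisy OR-gate interaction referred to above has a standard closed form: a finding is absent only if every present disease independently fails to cause it, so p(finding | present diseases) = 1 − (1 − leak) ∏ (1 − p_i). The sketch below assumes this textbook form; the function name, the optional leak term, and the example numbers are illustrative assumptions rather than details of the model evaluated in the paper.

```python
def noisy_or(link_probs, present, leak=0.0):
    """Probability that a finding is present under a noisy OR-gate.

    link_probs: {disease: probability that the disease alone causes the finding}
    present:    iterable of diseases that are present
    leak:       probability the finding occurs with no modeled disease present
                (an assumed extension; set to 0.0 for the plain noisy OR)
    """
    prob_absent = 1.0 - leak
    for disease in present:
        # Each present disease independently fails to cause the finding.
        prob_absent *= 1.0 - link_probs[disease]
    return 1.0 - prob_absent
```

For example, with link probabilities 0.8 and 0.5 and both diseases present, the finding is absent with probability 0.2 × 0.5 = 0.1, so it is present with probability 0.9. The model needs only one parameter per disease-finding link rather than a full conditional probability table.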
I introduce a temporal belief-network representation of causal independence that a knowledge engineer can use to elicit probabilistic models. Like the current, atemporal belief-network representation of causal independence, the new representation makes knowledge acquisition tractable. Unlike the atemporal representation, however, the temporal representation can simplify inference, and does not require the use of unobservable variables. The representation is less general than is the atemporal representation, but appears to be useful for many practical applications.