


Simon Tong 

Daphne Koller 



Stanford University 





Scientific understanding ⇒ causal model 

Understand the process of cancer formation &
progression 



How do we choose experiments wisely to gain the
most understanding from them? 

Limited number of experiments on laboratory mice 

Which experiments should we perform? 












Given: 

Set of domain variables 



Experiment: 

Can control some query variables Q by setting Q
:= q 

Obtain a response X=x — a complete assignment to
the other variables 



Goal: 

Learn a causal model — structure of causal
Bayesian network 
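The setup above can be sketched as a tiny interface: an experiment clamps the query variables Q to q and returns a complete assignment x to all variables. A minimal sketch, assuming a hypothetical two-variable ground-truth model (A → B, with made-up parameters):

```python
import random

# A minimal sketch of the experiment interface. The ground-truth model
# here is hypothetical: two binary variables with A -> B,
# P(A=1) = 0.5 and P(B=1 | A) = 0.8 if A = 1 else 0.2.
def experiment(q, rng=None):
    """Clamp the query variables given in q (e.g. {"A": 1}) and sample
    the remaining variables; return a complete assignment X = x."""
    rng = rng or random.Random(0)
    a = q["A"] if "A" in q else int(rng.random() < 0.5)
    p_b = 0.8 if a == 1 else 0.2
    b = q["B"] if "B" in q else int(rng.random() < p_b)
    return {"A": a, "B": b}

sample = experiment({"A": 1})   # intervene on A, observe the response
```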







A model 

Our current hypothesis 



A measure of quality 

How good our model is 





Bayesian Estimation 

Maintain a distribution over graphs and parameters
P(G, θ_G) 










Given a new data instance (Q:=q, x): 

Update P(G, θ_G) to get the posterior P(G, θ_G | Q:=q, x) 

P(G, θ_G | Q:=q, x) is our new model 

A distribution over graphs and parameters 



Given a model P(G, θ_G), in principle we can
obtain P(X→Y) 
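To make the update concrete, here is a heavily simplified sketch over two binary variables with three structural hypotheses (A→B, B→A, no edge). It fixes the parameters of each structure instead of maintaining a full distribution over θ_G, and all numbers are illustrative; the key point is that under an intervention Q:=q the clamped variable's own mechanism is cut, which is what lets interventional data distinguish A→B from B→A:

```python
# Simplified Bayesian structure update under an intervention.
# Hypothetical fixed parameters: roots have P(v=1) = 0.5, and in
# either edge model P(child=1 | parent) = 0.8 if parent = 1 else 0.2.

def likelihood(graph, q, x):
    """P(x | G, do(Q:=q)): the intervened variable's own mechanism is
    cut, so its factor is dropped from the product."""
    def cond(parent_val):
        return 0.8 if parent_val == 1 else 0.2

    prob = 1.0
    for v in ("A", "B"):
        if v in q:                      # intervened: no factor for v
            continue
        if graph == "A->B" and v == "B":
            p1 = cond(x["A"])
        elif graph == "B->A" and v == "A":
            p1 = cond(x["B"])
        else:
            p1 = 0.5                    # root / independent node
        prob *= p1 if x[v] == 1 else 1.0 - p1
    return prob

def update(prior, q, x):
    post = {g: prior[g] * likelihood(g, q, x) for g in prior}
    z = sum(post.values())
    return {g: p / z for g, p in post.items()}

prior = {"A->B": 1 / 3, "B->A": 1 / 3, "none": 1 / 3}
post = update(prior, {"A": 1}, {"A": 1, "B": 1})
```

After intervening on A and seeing B respond, A→B gains posterior mass, while B→A and "no edge" remain indistinguishable from each other by this single experiment.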








Want to be as certain about the structure as
possible 

















Use the loss function: H(A↔B) = −Σ_r P(r) log P(r), where r ranges over the three possible relations between A and B: A→B, B→A, or no edge 




In general: the loss of a model is the sum of edge entropies over all pairs of variables, Loss = Σ_{A,B} H(A↔B) 









Want to be as certain about edges as possible 
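The edge entropy can be computed directly from the posterior probabilities of the three edge relations; a small sketch (base-2 log, so the loss is in bits):

```python
import math

def edge_entropy(p_ab, p_ba, p_none):
    """H(A<->B): entropy of the three-way edge relation between a pair
    of variables (A->B, B->A, or no edge) under the current posterior."""
    return -sum(p * math.log(p, 2) for p in (p_ab, p_ba, p_none) if p > 0)

def structure_loss(edge_posteriors):
    """Total loss: sum of edge entropies over all pairs of variables.
    `edge_posteriors` is one (p_ab, p_ba, p_none) triple per pair."""
    return sum(edge_entropy(*triple) for triple in edge_posteriors)
```

A certain edge contributes 0 bits; a maximally uncertain one contributes log2(3) ≈ 1.585 bits.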






Choose the query that minimizes the expected
posterior loss: ExPLoss(Q:=q) = Σ_x P(x | Q:=q) · Loss(P(G, θ_G | Q:=q, x)) 
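Myopic query selection can then be sketched as follows. The `model` interface here is entirely hypothetical (`responses`, `predictive`, and `posterior_loss` are stand-ins for the quantities defined above), and the toy model exists only to exercise the selection rule:

```python
# Sketch of myopic query selection. Assumed (hypothetical) interface:
#   model.responses(q)        -> iterable of possible responses x
#   model.predictive(x, q)    -> P(x | Q:=q)
#   model.posterior_loss(q, x) -> loss of the model updated on (Q:=q, x)

def expected_posterior_loss(model, q):
    return sum(model.predictive(x, q) * model.posterior_loss(q, x)
               for x in model.responses(q))

def choose_query(model, candidate_queries):
    return min(candidate_queries,
               key=lambda q: expected_posterior_loss(model, q))

class _ToyModel:
    # Two candidate queries: "good" resolves all uncertainty
    # (posterior loss 0), "bad" leaves us maximally uncertain.
    def responses(self, q):
        return [0, 1]
    def predictive(self, x, q):
        return 0.5
    def posterior_loss(self, q, x):
        return 0.0 if q == "good" else 1.0

best = choose_query(_ToyModel(), ["bad", "good"])
```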















Restrict to networks consistent with a fixed
total ordering π of the nodes* 

A→B ⇒ A ≺ B 



Each node X_i has a set W_i
of at most k possible candidate parents** 



Number of allowed structures still exponential 





Choice of parents for node X is now independent of the
choice of parents for other nodes 

Intuitively, because cycles no longer an issue 



Can manipulate P(G | D, π) in closed form 

Can compute H_π(A↔B) efficiently 
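Under a fixed ordering, the posterior over structures factorizes into independent per-node distributions over candidate parent sets. A sketch, assuming a hypothetical `family_score(node, parents)` supplied by the caller (e.g. a BDe-style marginal-likelihood score):

```python
from itertools import combinations

def per_node_posteriors(candidates, family_score, k):
    """For each node, enumerate parent sets of size <= k drawn from its
    candidate parents and normalize the family scores into a posterior.
    Valid because, with a fixed ordering, parent choices are independent
    across nodes (cycles are no longer an issue)."""
    post = {}
    for node, cand in candidates.items():
        sets = [frozenset(s) for r in range(k + 1)
                for s in combinations(cand, r)]
        scores = {s: family_score(node, s) for s in sets}
        z = sum(scores.values())
        post[node] = {s: v / z for s, v in scores.items()}
    return post

# Toy usage: node B with single candidate parent A, illustrative scores.
score = lambda node, parents: 3.0 if parents else 1.0
posts = per_node_posteriors({"B": ["A"]}, score, k=1)
```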




Theorem: 

For a fixed ordering π, the expected posterior loss ExPLoss_π(Q:=q) decomposes into a sum, over complete assignments x, of products of terms ψ and φ, 

where
both ψ and φ have efficiently
computable closed form expressions 




Each term ψ, φ is a potential function over a small subset of
variables 

Desired result is sum over all variables except Q 

Can use standard inference techniques 




We can compute this efficiently in closed form 





Sample orderings* 

Use the Metropolis MCMC algorithm 



Suppose we want to compute the expected posterior loss
of asking query Q:=q 

For each ordering, compute expected quality 

Average over all sampled orderings 




Sample orderings 

Compute ExPLoss for each query 

Choose query that minimizes ExPLoss 
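The ordering-sampling step can be sketched with a simple Metropolis sampler over permutations using adjacent-swap proposals; `log_score(ordering)` is a hypothetical stand-in for log P(D | ordering), and the per-query averaging step follows:

```python
import math
import random

def sample_orderings(nodes, log_score, n_samples, rng=None):
    """Metropolis over total orderings with symmetric adjacent-swap
    proposals; accept if log(u) < log_score(new) - log_score(old)."""
    rng = rng or random.Random(0)
    order = list(nodes)
    samples = []
    for _ in range(n_samples):
        i = rng.randrange(len(order) - 1)
        proposal = order[:]
        proposal[i], proposal[i + 1] = proposal[i + 1], proposal[i]
        if math.log(max(rng.random(), 1e-300)) < log_score(proposal) - log_score(order):
            order = proposal
        samples.append(tuple(order))
    return samples

def avg_exploss(samples, exploss_given_ordering, q):
    """Average the per-ordering expected posterior loss for query q."""
    return sum(exploss_given_ordering(o, q) for o in samples) / len(samples)

# Toy usage: a flat log-score makes the sampler explore orderings uniformly.
samples = sample_orderings(["A", "B", "C"], lambda o: 0.0, 10)
```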





Cancer 

Network 

5 nodes 



Candidate parents of X: 

k ≤ 5 parents for each node X 

the nodes with the highest individual mutual information
with X 



MCMC for sampling orderings 

Maintain 50–75 chains 



Compared Active with Random and Uniform 














Framework for active learning for causal
structure 

Theoretical analysis supported by experimental
results 



Efficient algorithm 

Closed form distribution for fixed ordering 

MCMC samples of orderings 

Use of graphical model inference to perform an
otherwise-exponential computation efficiently 



Future work: missing data, continuous variables 
