


Simon Tong 

Daphne Koller 



Stanford University 





Scientific understanding ⇒ causal model 

Understand the process of cancer formation &
progression 



How do we choose experiments wisely to gain the
most understanding from them? 

Limited number of experiments on laboratory mice 

Which experiments should we perform? 












Given: 

Set of domain variables 



Experiment: 

Can control some query variables Q by setting Q
:= q 

Obtain a response X=x — a complete assignment to
the other variables 



Goal: 

Learn a causal model — structure of causal
Bayesian network 
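The setup above can be sketched as a tiny interface: an experiment clamps the query variables Q to q and returns a complete assignment x to all variables. A minimal sketch, assuming a hypothetical two-variable ground-truth model (A → B, with made-up parameters):

```python
import random

# A minimal sketch of the experiment interface. The ground-truth model
# here is hypothetical: two binary variables with A -> B,
# P(A=1) = 0.5 and P(B=1 | A) = 0.8 if A = 1 else 0.2.
def experiment(q, rng=None):
    """Clamp the query variables given in q (e.g. {"A": 1}) and sample
    the remaining variables; return a complete assignment X = x."""
    rng = rng or random.Random(0)
    a = q["A"] if "A" in q else int(rng.random() < 0.5)
    p_b = 0.8 if a == 1 else 0.2
    b = q["B"] if "B" in q else int(rng.random() < p_b)
    return {"A": a, "B": b}

sample = experiment({"A": 1})   # intervene on A, observe the response
```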







A model 

Our current hypothesis 



A measure of quality 

How good our model is 





Bayesian Estimation 

Maintain a distribution over graphs and parameters
P(G, θ_G) 










Given a new data instance (Q:=q, x): 

Update P(G, θ_G) to get the posterior P(G, θ_G | Q:=q, x) 

P(G, θ_G | Q:=q, x) is our new model 

A distribution over graphs and parameters 



Given a model P(G, θ_G), in principle we can
obtain P(X→Y) 
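To make the update concrete, here is a heavily simplified sketch over two binary variables with three structural hypotheses (A→B, B→A, no edge). It fixes the parameters of each structure instead of maintaining a full distribution over θ_G, and all numbers are illustrative; the key point is that under an intervention Q:=q the clamped variable's own mechanism is cut, which is what lets interventional data distinguish A→B from B→A:

```python
# Simplified Bayesian structure update under an intervention.
# Hypothetical fixed parameters: roots have P(v=1) = 0.5, and in
# either edge model P(child=1 | parent) = 0.8 if parent = 1 else 0.2.

def likelihood(graph, q, x):
    """P(x | G, do(Q:=q)): the intervened variable's own mechanism is
    cut, so its factor is dropped from the product."""
    def cond(parent_val):
        return 0.8 if parent_val == 1 else 0.2

    prob = 1.0
    for v in ("A", "B"):
        if v in q:                      # intervened: no factor for v
            continue
        if graph == "A->B" and v == "B":
            p1 = cond(x["A"])
        elif graph == "B->A" and v == "A":
            p1 = cond(x["B"])
        else:
            p1 = 0.5                    # root / independent node
        prob *= p1 if x[v] == 1 else 1.0 - p1
    return prob

def update(prior, q, x):
    post = {g: prior[g] * likelihood(g, q, x) for g in prior}
    z = sum(post.values())
    return {g: p / z for g, p in post.items()}

prior = {"A->B": 1 / 3, "B->A": 1 / 3, "none": 1 / 3}
post = update(prior, {"A": 1}, {"A": 1, "B": 1})
```

After intervening on A and seeing B respond, A→B gains posterior mass, while B→A and "no edge" remain indistinguishable from each other by this single experiment.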








Want to be as certain about the structure as
possible 

















Use the loss function: H(A↔B) = −Σ_r P(r) log P(r), where r ranges over the three possible relations between A and B: A→B, B→A, or no edge 




In general: the loss of a model is the sum of edge entropies over all pairs of variables, Loss = Σ_{A,B} H(A↔B) 









Want to be as certain about edges as possible 
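The edge entropy can be computed directly from the posterior probabilities of the three edge relations; a small sketch (base-2 log, so the loss is in bits):

```python
import math

def edge_entropy(p_ab, p_ba, p_none):
    """H(A<->B): entropy of the three-way edge relation between a pair
    of variables (A->B, B->A, or no edge) under the current posterior."""
    return -sum(p * math.log(p, 2) for p in (p_ab, p_ba, p_none) if p > 0)

def structure_loss(edge_posteriors):
    """Total loss: sum of edge entropies over all pairs of variables.
    `edge_posteriors` is one (p_ab, p_ba, p_none) triple per pair."""
    return sum(edge_entropy(*triple) for triple in edge_posteriors)
```

A certain edge contributes 0 bits; a maximally uncertain one contributes log2(3) ≈ 1.585 bits.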






Choose the query that minimizes the expected
posterior loss: ExPLoss(Q:=q) = Σ_x P(x | Q:=q) · Loss(P(G, θ_G | Q:=q, x)) 
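Myopic query selection can then be sketched as follows. The `model` interface here is entirely hypothetical (`responses`, `predictive`, and `posterior_loss` are stand-ins for the quantities defined above), and the toy model exists only to exercise the selection rule:

```python
# Sketch of myopic query selection. Assumed (hypothetical) interface:
#   model.responses(q)        -> iterable of possible responses x
#   model.predictive(x, q)    -> P(x | Q:=q)
#   model.posterior_loss(q, x) -> loss of the model updated on (Q:=q, x)

def expected_posterior_loss(model, q):
    return sum(model.predictive(x, q) * model.posterior_loss(q, x)
               for x in model.responses(q))

def choose_query(model, candidate_queries):
    return min(candidate_queries,
               key=lambda q: expected_posterior_loss(model, q))

class _ToyModel:
    # Two candidate queries: "good" resolves all uncertainty
    # (posterior loss 0), "bad" leaves us maximally uncertain.
    def responses(self, q):
        return [0, 1]
    def predictive(self, x, q):
        return 0.5
    def posterior_loss(self, q, x):
        return 0.0 if q == "good" else 1.0

best = choose_query(_ToyModel(), ["bad", "good"])
```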















Restrict to networks consistent with a fixed
total ordering π of the nodes* 

A→B ⇒ A ≺ B 



Each node X_i has a set W_i
of at most k possible candidate parents** 



Number of allowed structures still exponential 





Choice of parents for node X is now independent of the
choice of parents for other nodes 

Intuitively, because cycles no longer an issue 



Can manipulate P(G | D, π) in closed form 

Can compute H_π(A↔B) efficiently 
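Under a fixed ordering, the posterior over structures factorizes into independent per-node distributions over candidate parent sets. A sketch, assuming a hypothetical `family_score(node, parents)` supplied by the caller (e.g. a BDe-style marginal-likelihood score):

```python
from itertools import combinations

def per_node_posteriors(candidates, family_score, k):
    """For each node, enumerate parent sets of size <= k drawn from its
    candidate parents and normalize the family scores into a posterior.
    Valid because, with a fixed ordering, parent choices are independent
    across nodes (cycles are no longer an issue)."""
    post = {}
    for node, cand in candidates.items():
        sets = [frozenset(s) for r in range(k + 1)
                for s in combinations(cand, r)]
        scores = {s: family_score(node, s) for s in sets}
        z = sum(scores.values())
        post[node] = {s: v / z for s, v in scores.items()}
    return post

# Toy usage: node B with single candidate parent A, illustrative scores.
score = lambda node, parents: 3.0 if parents else 1.0
posts = per_node_posteriors({"B": ["A"]}, score, k=1)
```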




Theorem: 

For a fixed ordering π, the expected posterior loss ExPLoss_π(Q:=q) decomposes into a sum, over complete assignments x, of products of terms ψ and φ, 

where
both ψ and φ have efficiently
computable closed form expressions 




Each term ψ, φ is a potential function over a small subset of
variables 

Desired result is sum over all variables except Q 

Can use standard inference techniques 




We can compute this efficiently in closed form 





Sample orderings* 

Use the Metropolis MCMC algorithm 



Suppose we want to compute the expected posterior loss
of asking query Q:=q 

For each ordering, compute expected quality 

Average over all sampled orderings 




Sample orderings 

Compute ExPLoss for each query 

Choose query that minimizes ExPLoss 
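The ordering-sampling step can be sketched with a simple Metropolis sampler over permutations using adjacent-swap proposals; `log_score(ordering)` is a hypothetical stand-in for log P(D | ordering), and the per-query averaging step follows:

```python
import math
import random

def sample_orderings(nodes, log_score, n_samples, rng=None):
    """Metropolis over total orderings with symmetric adjacent-swap
    proposals; accept if log(u) < log_score(new) - log_score(old)."""
    rng = rng or random.Random(0)
    order = list(nodes)
    samples = []
    for _ in range(n_samples):
        i = rng.randrange(len(order) - 1)
        proposal = order[:]
        proposal[i], proposal[i + 1] = proposal[i + 1], proposal[i]
        if math.log(max(rng.random(), 1e-300)) < log_score(proposal) - log_score(order):
            order = proposal
        samples.append(tuple(order))
    return samples

def avg_exploss(samples, exploss_given_ordering, q):
    """Average the per-ordering expected posterior loss for query q."""
    return sum(exploss_given_ordering(o, q) for o in samples) / len(samples)

# Toy usage: a flat log-score makes the sampler explore orderings uniformly.
samples = sample_orderings(["A", "B", "C"], lambda o: 0.0, 10)
```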





Cancer 

Network 

5 nodes 



Candidate parents of X: 

k ≤ 5 parents for each node X 

the nodes with the highest individual mutual information
with X 



MCMC for sampling orderings 

Maintain 50–75 chains 



Compared Active with Random and Uniform 














Framework for active learning for causal
structure 

Theoretical analysis supported by experimental
results 



Efficient algorithm 

Closed form distribution for fixed ordering 

MCMC samples of orderings 

Use of graphical model inference to perform an
otherwise-exponential computation efficiently 



Future work: missing data, continuous variables 
