Density estimation

We do not know P(C0 | x) and P(C1 | x)
- Generative approach to classification typically estimates the joint P(C, x) e.g.,
  - estimate class priors: P(C0), P(C1)
  - estimate class conditionals: P(x | C0), P(x | C1)
- Bayes optimal classifier minimizes the Bayes error relative to the estimated distribution

We do not know P(C0 | x) and P(C1 | x)