Application of an efficient Bayesian discretization method to biomedical data
2011

Efficient Bayesian Discretization Method for Biomedical Data

Sample size: 24 | publication | 10 minutes | Evidence: moderate

Author Information

Author(s): Jonathan L Lustgarten, Shyam Visweswaran, Vanathi Gopalakrishnan, Gregory F Cooper

Primary Institution: University of Pittsburgh

Hypothesis

Can an efficient Bayesian discretization method improve classification performance on high-dimensional biomedical datasets compared to traditional methods?

Conclusion

The efficient Bayesian discretization (EBD) method outperformed the traditional Fayyad and Irani (FI) method in both classification performance and stability across a range of biomedical datasets.

Supporting Evidence

  • EBD showed a statistically significant increase in accuracy over FI on 17 out of 24 datasets.
  • EBD was statistically significantly more stable than FI.
  • EBD produced slightly more complex discretizations than FI.

Takeaway

This study shows that a new method for turning continuous data into categories can help computers make better guesses about health conditions using medical data.

Methodology

The study compared EBD against FI on 24 biomedical datasets. Each dataset was discretized with both methods, and the discretized data were then used to train C4.5 and naïve Bayes classifiers, which were evaluated with 10-fold cross-validation.
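
To make the idea of discretization concrete, the following is a minimal sketch of the FI baseline for a single continuous feature, assuming the commonly cited entropy-based formulation with a minimum description length (MDL) stopping rule. It is not the EBD method, whose Bayesian score is not described in this summary, and the function names (entropy, best_cut, mdlp_cuts, discretize) are hypothetical choices made for illustration.

```python
import bisect
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(vals, labs):
    """Return (gain, index) of the boundary with the highest information gain.

    `vals` must be sorted; returns None when all values are identical.
    """
    n = len(vals)
    base = entropy(labs)
    best = None
    for i in range(1, n):
        if vals[i] == vals[i - 1]:
            continue  # no cut point exists between identical values
        left, right = labs[:i], labs[i:]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if best is None or gain > best[0]:
            best = (gain, i)
    return best


def _split(vals, labs, cuts):
    """Recursively add cut points that pass the MDL acceptance test."""
    n = len(vals)
    if n < 2:
        return
    found = best_cut(vals, labs)
    if found is None:
        return
    gain, i = found
    left, right = labs[:i], labs[i:]
    k, k1, k2 = len(set(labs)), len(set(left)), len(set(right))
    # MDL criterion (assumed formulation): accept the cut only if the gain
    # exceeds the cost of encoding the extra partition.
    delta = math.log2(3 ** k - 2) - (
        k * entropy(labs) - k1 * entropy(left) - k2 * entropy(right)
    )
    threshold = (math.log2(n - 1) + delta) / n
    if gain <= threshold:
        return
    cuts.append((vals[i - 1] + vals[i]) / 2)  # midpoint between neighbours
    _split(vals[:i], left, cuts)
    _split(vals[i:], right, cuts)


def mdlp_cuts(values, labels):
    """Return the sorted list of accepted cut points for one feature."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    vals = [v for v, _ in pairs]
    labs = [l for _, l in pairs]
    cuts = []
    _split(vals, labs, cuts)
    return sorted(cuts)


def discretize(values, cuts):
    """Map each continuous value to the index of the interval it falls in."""
    return [bisect.bisect_right(cuts, v) for v in values]


# Toy data (made up for illustration): values 1..10 belong to one class,
# values 11..20 to the other.
toy_values = list(range(1, 21))
toy_labels = ["healthy"] * 10 + ["disease"] * 10
toy_cuts = mdlp_cuts(toy_values, toy_labels)
toy_bins = discretize(toy_values, toy_cuts)
```

In a comparison like the one described above, cut points would be learned on the training folds only and then applied to the held-out fold with a function such as discretize, so that the test data never influence the discretization.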

Limitations

EBD was less robust than FI, though not statistically significantly so, and it took longer to compute.

Statistical Information

P-Value

0.026

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1471-2105-12-309
