Efficient Bayesian Discretization Method for Biomedical Data
Author Information
Author(s): Jonathan L Lustgarten, Shyam Visweswaran, Vanathi Gopalakrishnan, Gregory F Cooper
Primary Institution: University of Pittsburgh
Hypothesis
Can an efficient Bayesian discretization method improve classification performance on high-dimensional biomedical datasets compared to traditional methods?
Conclusion
The efficient Bayesian discretization (EBD) method outperformed the traditional Fayyad and Irani (FI) discretization method in classification performance and stability on various biomedical datasets.
Supporting Evidence
- EBD showed a statistically significant increase in accuracy over FI on 17 out of 24 datasets.
- EBD was statistically significantly more stable than FI.
- EBD produced slightly more complex discretizations than FI.
Takeaway
This study shows that a new method for turning continuous data into categories can help computers make better guesses about health conditions using medical data.
Methodology
The study compared EBD against FI on 24 biomedical datasets. Each method discretized the continuous variables, and the resulting data were used to train classifiers such as C4.5 and naïve Bayes, which were evaluated with 10-fold cross-validation.
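To make the baseline concrete, the following is a minimal sketch of entropy-based discretization with the minimum description length stopping rule usually attributed to Fayyad and Irani. It is not the study's code and it does not implement EBD. The function names (`fi_cut_points`, `discretize`, and the helpers) are hypothetical, and the sketch handles one continuous variable at a time.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(pairs):
    """Return (gain, index) of the cut with the highest information gain, or None."""
    n = len(pairs)
    labels = [y for _, y in pairs]
    base = entropy(labels)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut is possible between identical values
        left, right = labels[:i], labels[i:]
        gain = base - (i / n) * entropy(left) - ((n - i) / n) * entropy(right)
        if best is None or gain > best[0]:
            best = (gain, i)
    return best


def mdlp_accepts(pairs, gain, i):
    """MDL stopping criterion: accept the split only if the gain pays for its cost."""
    n = len(pairs)
    labels = [y for _, y in pairs]
    left, right = labels[:i], labels[i:]
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (
        k * entropy(labels) - k1 * entropy(left) - k2 * entropy(right)
    )
    return gain > (math.log2(n - 1) + delta) / n


def fi_cut_points(values, labels):
    """Recursively split one continuous variable; return the sorted cut points."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    cuts = []

    def recurse(part):
        if len({y for _, y in part}) < 2:
            return  # a pure (or empty) interval needs no further cuts
        found = best_split(part)
        if found is None:
            return
        gain, i = found
        if not mdlp_accepts(part, gain, i):
            return
        cuts.append((part[i - 1][0] + part[i][0]) / 2)  # midpoint between neighbors
        recurse(part[:i])
        recurse(part[i:])

    recurse(pairs)
    return sorted(cuts)


def discretize(value, cuts):
    """Map a continuous value to the index of the interval it falls in."""
    return sum(value > c for c in cuts)
```

For example, `fi_cut_points(range(20), ["a"] * 10 + ["b"] * 10)` yields a single cut between the two classes, while a variable whose labels simply alternate yields no cuts at all. In a cross-validation setting, a reasonable assumption is that cut points are learned from the training folds only and then applied to the held-out fold with `discretize`.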
Limitations
EBD was less robust than FI, though not statistically significantly so, and it took longer to compute.
Statistical Information
P-Value
0.026
Statistical Significance
p<0.05