Effectiveness of Bagging in Classifying Small-Sample Genomic and Proteomic Data
Author Information
Author(s): T. T. Vu, U. M. Braga-Neto
Primary Institution: Texas A&M University
Hypothesis
Does bagging improve the performance of unstable classifiers enough to surpass stable classifiers in small-sample genomic and proteomic data?
Conclusion
Bagging improves the performance of unstable classifiers, but the bagged classifiers still do not outperform stable classifiers in small-sample settings.
Supporting Evidence
- Bagging improved unstable classifiers like CART and NNET but not stable classifiers like DLDA and LDA.
- For feature selection, the t-test filter outperformed RELIEF in small-sample settings.
- Increasing the ensemble size yielded diminishing returns in performance improvement.
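To make the t-test feature selection mentioned above concrete, here is a minimal sketch that ranks features by the absolute two-sample t-statistic and keeps the top k. It is an illustration only, not the authors' code: the Welch (unequal-variance) form of the statistic, the function names `welch_t` and `rank_features_by_t`, and the synthetic data are all assumptions.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t-statistic for two lists of floats."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_features_by_t(X, y, k):
    """Return the indices of the k features with the largest |t| between classes 0 and 1."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(welch_t(a, b)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Synthetic data (an assumption for illustration): 20 samples, 50 features,
# where only feature 0 has a class-dependent mean shift.
rng = random.Random(0)
y = [0] * 10 + [1] * 10
X = [[rng.gauss(5.0 * label if j == 0 else 0.0, 1.0) for j in range(50)]
     for label in y]
top = rank_features_by_t(X, y, k=5)
print(top)
```

With so few samples per class, some noise features will also score highly by chance, which is one reason feature selection is delicate in small-sample settings.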
Takeaway
Bagging helps unstable classifiers perform better on small samples of gene and protein data, but it does not make them better than simpler, stable classifiers.
Methodology
The study applied several classification rules, including CART, NNET, DLDA, and LDA, to publicly available genomic and proteomic datasets and compared their performance with and without bagging.
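The bagging procedure itself can be sketched as follows: train each ensemble member on a bootstrap sample of the training set and classify by majority vote. This is a hedged illustration, not the study's implementation. The depth-1 decision stump is a simplified stand-in for an unstable tree classifier such as CART, and the helper names (`fit_stump`, `fit_bagged`, `make_data`, `error_rate`), the synthetic data, and the ensemble sizes are assumptions.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a depth-1 tree: choose the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum((left if row[j] <= t else right) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    _, j, t, left, right = best
    return lambda row: left if row[j] <= t else right


def fit_bagged(X, y, n_estimators, rng):
    """Train n_estimators stumps on bootstrap samples; predict by majority vote."""
    n = len(X)
    members = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # draw n points with replacement
        members.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))

    def predict(row):
        votes = Counter(member(row) for member in members)
        return votes.most_common(1)[0][0]

    return predict


def make_data(n_per_class, n_features, shift, rng):
    """Synthetic two-class Gaussian data; only feature 0 separates the classes."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift * label if j == 0 else 0.0, 1.0)
                      for j in range(n_features)])
            y.append(label)
    return X, y


def error_rate(predict, X, y):
    """Fraction of misclassified rows."""
    return sum(predict(row) != label for row, label in zip(X, y)) / len(y)


rng = random.Random(0)
X_train, y_train = make_data(10, 10, 3.0, rng)   # small training sample
X_test, y_test = make_data(250, 10, 3.0, rng)    # large held-out sample
# Odd ensemble sizes avoid tied votes in the two-class case.
errors = {m: error_rate(fit_bagged(X_train, y_train, m, rng), X_test, y_test)
          for m in (1, 11, 51)}
print(errors)
```

Comparing the error across ensemble sizes in `errors` is a simple way to check whether larger ensembles keep helping on a given dataset. Results on this synthetic data say nothing about the datasets used in the study.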
Potential Biases
The results may be biased by the choice of feature selection methods and by the reliance on specific datasets.
Limitations
The study primarily focused on small sample sizes, which may limit the generalizability of the findings.
Participant Demographics
Data from breast cancer, lung cancer, and prostate cancer studies were analyzed.
Statistical Information
P-Value
p<0.01
Statistical Significance
p<0.01