Is Bagging Effective in the Classification of Small-Sample Genomic and Proteomic Data?
2009

Effectiveness of Bagging in Classifying Small-Sample Genomic and Proteomic Data

Sample size: 24; publication; 10 minutes; Evidence: moderate

Author Information

Author(s): T. T. Vu, U. M. Braga-Neto

Primary Institution: Texas A&M University

Hypothesis

Does bagging improve the performance of unstable classifiers enough to surpass stable classifiers in small-sample genomic and proteomic data?

Conclusion

Bagging improves the performance of unstable classifiers but does not outperform stable classifiers in small-sample settings.

Supporting Evidence

  • Bagging improved unstable classifiers like CART and NNET but not stable classifiers like DLDA and LDA.
  • Feature selection methods like t-test outperformed RELIEF in small-sample settings.
  • Ensemble methods showed diminishing returns in performance improvement with larger sizes.
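As a rough illustration of the t-test feature selection mentioned in the list above, the sketch below ranks features by the absolute value of a two-sample t-statistic and keeps the top k. It is a minimal sketch, not the study's procedure: the Welch (unequal-variance) form of the statistic, the function names `t_statistic` and `rank_features_by_t`, and the two-class, list-of-rows data layout are all assumptions made for this example.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Welch two-sample t-statistic (assumed variant; needs >= 2 values per group)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    # A zero standard error means both groups are constant; treat the feature as uninformative.
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_features_by_t(X, y, k):
    """Return the indices of the k features with the largest |t| (two classes assumed)."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == classes[0]]
        b = [row[j] for row, label in zip(X, y) if label == classes[1]]
        scores.append((abs(t_statistic(a, b)), j))
    # Sort by descending |t|, breaking ties by feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]
```

In a small-sample setting the ranking should be computed on the training data only, since selecting features on the full dataset before estimating error would bias the estimate.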

Takeaway

Using bagging to classify small samples of gene and protein data helps some methods work better, but it doesn't make them better than simpler methods.

Methodology

The study applied several classification rules, with and without bagging, to publicly available genomic and proteomic datasets and compared their performance.
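For readers unfamiliar with bagging, the idea can be sketched as follows: train the same base classifier on many bootstrap resamples of the training set, then combine the members by majority vote. This is a minimal sketch under stated assumptions, not the study's implementation. The base learner here is a one-feature threshold rule (a decision stump) standing in for the classifiers named in the summary, and the names `train_stump` and `bagging`, the default of 25 members, and the binary-label assumption are all hypothetical choices for this example.

```python
import random
from collections import Counter


def train_stump(X, y):
    """Fit a one-feature threshold rule by exhaustive search (stand-in base learner)."""
    labels = sorted(set(y))
    best = None  # (training errors, feature index, threshold, label below, label above)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for lo, hi in ((labels[0], labels[-1]), (labels[-1], labels[0])):
                errors = sum((lo if row[j] <= t else hi) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, lo, hi)
    _, j, t, lo, hi = best
    return lambda row: lo if row[j] <= t else hi


def bagging(X, y, train, n_estimators=25, seed=0):
    """Train n_estimators classifiers on bootstrap resamples; predict by majority vote."""
    rng = random.Random(seed)
    n = len(X)
    members = []
    for _ in range(n_estimators):
        # Bootstrap: draw n training points with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(train([X[i] for i in idx], [y[i] for i in idx]))

    def predict(row):
        votes = Counter(member(row) for member in members)
        return votes.most_common(1)[0][0]

    return predict
```

An odd number of members avoids tied votes in the two-class case. Note that each member requires a full training run, so the computational cost grows linearly with the ensemble size.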

Potential Biases

The results may be biased by the choice of feature selection methods and by the reliance on a specific set of datasets.

Limitations

The study primarily focused on small sample sizes, which may limit the generalizability of the findings.

Participant Demographics

Data from breast cancer, lung cancer, and prostate cancer studies were analyzed.

Statistical Information

P-Value

p<0.01

Statistical Significance

p<0.01

Digital Object Identifier (DOI)

10.1155/2009/158368
