Using Decision Tree Ensembles to Classify Premalignant Pancreatic Cancer Data
Author Information
Author(s): Ge Guangtao, Wong G William
Primary Institution: Department of Computer Science, Tufts University; Department of Physiology and the Center for Metabolism and Obesity Research, Johns Hopkins University School of Medicine
Hypothesis
Can decision tree ensembles improve the classification of proteomics data from premalignant pancreatic cancer?
Conclusion
Classifier ensembles generally outperform single decision trees in classifying proteomics data from premalignant pancreatic cancer.
Supporting Evidence
- Classifier ensembles achieved higher accuracy and lower prediction error than single decision trees.
- Three feature selection methods were used to identify significant biomarkers.
- Ensemble methods like Bagging and Random Forest outperformed individual classifiers.
Takeaway
This study shows that combining many decision trees into an ensemble can identify premalignant pancreatic cancer from blood serum samples more accurately than a single decision tree.
Methodology
The study used a 10-fold cross-validation framework with various decision tree ensemble methods and feature selection techniques on proteomics data.
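As an illustration of the evaluation scheme described above, the following minimal sketch runs stratified 10-fold cross-validation on a single decision stump (a depth-1 decision tree) and on a bagged ensemble of such stumps. It is not the study's implementation. The data are synthetic and only mirror the 33/39 class sizes. The feature count, the three informative features, the ensemble size, and every function name are assumptions made for this example. Feature selection is omitted.

```python
import random
from collections import Counter


def make_synthetic_data(n_case=33, n_control=39, n_features=10, seed=0):
    """Synthetic stand-in for proteomic intensities; NOT the study's data."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((1, n_case), (0, n_control)):
        for _ in range(n):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            # Assumption: only the first three features carry class signal.
            for j in range(3):
                row[j] += 1.0 * label
            X.append(row)
            y.append(label)
    return X, y


def stratified_folds(y, k=10, seed=0):
    """Deal sample indices into k folds, class by class, to balance classes."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    pos = 0
    for label in sorted(set(y)):
        idx = [i for i, lab in enumerate(y) if lab == label]
        rng.shuffle(idx)
        for i in idx:
            folds[pos % k].append(i)
            pos += 1
    return folds


def fit_stump(X, y):
    """Depth-1 tree: choose the feature/threshold with fewest training errors."""
    best = None  # (errors, feature, threshold, label predicted above threshold)
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            # Errors made when predicting class 1 above the threshold.
            err = sum((row[j] > t) != (lab == 1) for row, lab in zip(X, y))
            for above, e in ((1, err), (0, n - err)):
                if best is None or e < best[0]:
                    best = (e, j, t, above)
    return best[1:]


def predict_stump(stump, row):
    j, t, above = stump
    return above if row[j] > t else 1 - above


def fit_bagging(X, y, n_estimators=15, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(y)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Majority vote over the ensemble (odd size, so no ties for two classes)."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


def cross_validate(X, y, fit, predict, k=10):
    """Return accuracy pooled over the k held-out folds."""
    correct = 0
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(y)


X, y = make_synthetic_data()
acc_single = cross_validate(X, y, fit_stump, predict_stump)
acc_bagged = cross_validate(X, y, fit_bagging, predict_bagging)
print(f"single stump: {acc_single:.3f}  bagged stumps: {acc_bagged:.3f}")
```

Both models are scored on the same folds, so the two accuracies are directly comparable. The numbers come from synthetic data and say nothing about the study's results.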
Limitations
Classifier performance was lower than expected, possibly because the dataset captures an early (premalignant) stage of cancer and because the classifiers were run with default parameters.
Participant Demographics
The dataset included serum samples from 33 mice with premalignant pancreatic intraepithelial neoplasias and 39 age-matched control mice.
Digital Object Identifier (DOI)