Classification of premalignant pancreatic cancer mass-spectrometry data using decision tree ensembles
2008

Using Decision Tree Ensembles to Classify Premalignant Pancreatic Cancer Data

Sample size: 181
Evidence: moderate

Author Information

Author(s): Guangtao Ge, G. William Wong

Primary Institution: Department of Computer Science, Tufts University; Department of Physiology and the Center for Metabolism and Obesity Research, Johns Hopkins University School of Medicine

Hypothesis

Can decision tree ensembles improve the classification of proteomics data from premalignant pancreatic cancer?

Conclusion

Classifier ensembles generally outperform single decision trees in classifying proteomics data from premalignant pancreatic cancer.

Supporting Evidence

  • Classifier ensembles achieved higher accuracy and lower prediction error than single decision trees.
  • Three feature selection methods were used to identify significant biomarkers.
  • Ensemble methods such as Bagging and Random Forest outperformed individual classifiers.
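
To make the idea of a Bagging ensemble concrete, here is a minimal sketch in plain Python. It is not the authors' code: one-level decision trees (stumps) stand in for full decision trees, and the function names, the toy data, and the choice of 25 trees are all assumptions made for illustration. Each stump is trained on a bootstrap resample of the training data, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-level decision tree (stump) by maximizing training accuracy.

    Returns (feature index, threshold, label if <= threshold, label if > threshold).
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            if not left or not right:
                continue  # this threshold does not split the sample
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            correct = left.count(l_lab) + right.count(r_lab)
            if best is None or correct > best[0]:
                best = (correct, j, t, l_lab, r_lab)
    if best is None:
        # No usable split (all feature values identical): predict the majority class.
        majority = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, row):
    j, t, l_lab, r_lab = stump
    return l_lab if row[j] <= t else r_lab


def fit_bagging(X, y, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): a single feature that separates the two classes.
X_toy = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
y_toy = [0, 0, 0, 1, 1, 1]
ensemble = fit_bagging(X_toy, y_toy)
```

Calling `predict_bagging(ensemble, [0.1])` returns the vote of all 25 stumps for one sample. Because each stump sees a different resample, an unlucky split in one stump tends to be outvoted by the others, which is the usual intuition for why such ensembles can be more stable than a single tree.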

Takeaway

This study shows that combining many decision trees can identify premalignant pancreatic cancer from mouse blood serum samples better than a single decision tree can.

Methodology

The study used a 10-fold cross-validation framework with various decision tree ensemble methods and feature selection techniques on proteomics data.
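
The cross-validation step can be sketched as follows. This is an illustrative outline in plain Python, not the study's actual pipeline: the stratified fold assignment, the function names, and the majority-class baseline used as a stand-in classifier are assumptions. Any pair of `fit` and `predict` functions, for example a decision tree ensemble, could be passed in instead.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign every sample index to one of k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    pos = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for i in indices:
            folds[pos % k].append(i)  # deal indices round-robin across folds
            pos += 1
    return folds


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return the mean held-out accuracy over k folds."""
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Stand-in classifier (assumption): always predict the most common training label.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, row):
    return model


# Hypothetical labels with the class sizes of a 39-control / 33-case design.
y_demo = [0] * 39 + [1] * 33
X_demo = [[float(i)] for i in range(len(y_demo))]
baseline_accuracy = cross_validate(X_demo, y_demo, fit_majority, predict_majority)
```

Each sample is held out exactly once, so the averaged score estimates performance on unseen data. If feature selection is part of the pipeline, it would normally be run inside `fit` on the training folds only, so that the held-out fold does not influence which features are chosen.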

Limitations

Classifier performance was lower than expected, possibly because the dataset represents an early stage of cancer and because the classifiers were run with default parameters.

Participant Demographics

The dataset included serum samples from 33 mice with premalignant pancreatic intraepithelial neoplasias and 39 age-matched control mice.

Digital Object Identifier (DOI)

10.1186/1471-2105-9-275