Choosing the Best Cancer Prediction Models
Author Information
Author(s): Crawford Jake, Chikina Maria, Greene Casey S.
Primary Institution: University of Pennsylvania
Hypothesis
Do smaller gene signatures generalize better than larger ones in cancer transcriptomics?
Conclusion
The study found that the models that perform best on held-out data also generalize well across different biological contexts, regardless of model size.
Supporting Evidence
- Smaller models do not consistently generalize better than larger models.
- Cross-validation performance is a reliable indicator of model generalization.
- Results were consistent across different cancer types and datasets.
Takeaway
This study shows that when building cancer prediction models, it is better to select the model that performs best on held-out data than to simply prefer smaller models.
Methodology
The study used LASSO logistic regression and neural networks to evaluate model generalization across datasets and cancer types.
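The comparison between picking the best held-out model and picking a smaller one can be sketched in a few lines. The following is a minimal illustration, not the study's pipeline: it fits L1-penalized (LASSO) logistic regression with a plain proximal gradient loop on synthetic data, then applies two selection rules. The data generator, the penalty grid, the 0.05 tolerance, and all function names are assumptions made for this example; in practice a library implementation and real expression data would replace them.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam, lr=0.5, n_iter=200):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        for j in range(p):
            # Gradient step, then soft-thresholding (the L1 proximal operator).
            v = w[j] - lr * grad_w[j]
            w[j] = math.copysign(max(abs(v) - lr * lam, 0.0), v)
    return w, b


def accuracy(w, b, X, y):
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else 0 for xi in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


def make_data(n, p, n_informative, rng):
    # Synthetic stand-in for expression data (assumption): only the first
    # n_informative features carry signal about the label.
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [1 if sum(row[:n_informative]) + rng.gauss(0, 0.5) > 0 else 0 for row in X]
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(120, 10, 3, rng)
X_hold, y_hold = make_data(60, 10, 3, rng)

# Fit one model per penalty strength; a larger penalty gives a sparser model.
results = []
for lam in [0.0, 0.01, 0.05, 0.1, 0.15]:  # hypothetical penalty grid
    w, b = fit_l1_logistic(X_train, y_train, lam)
    results.append({
        "lam": lam,
        "n_nonzero": sum(1 for wj in w if wj != 0.0),
        "holdout_acc": accuracy(w, b, X_hold, y_hold),
    })

# Rule 1: pick the model with the best held-out performance.
best_holdout = max(results, key=lambda r: r["holdout_acc"])

# Rule 2: pick the sparsest model within a tolerance of the best
# (the 0.05 tolerance is an arbitrary choice for this sketch).
tolerance = 0.05
candidates = [r for r in results
              if r["holdout_acc"] >= best_holdout["holdout_acc"] - tolerance]
smallest_good = min(candidates, key=lambda r: r["n_nonzero"])

for r in results:
    print(r)
print("best held-out:", best_holdout["lam"], "smallest good:", smallest_good["lam"])
```

To mirror the study's question, each selected model would then be scored on data from a different context (another dataset or cancer type) to see which rule generalizes better; that step is omitted here.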
Limitations
The findings may not extend to all cancer types or datasets, because the study focused on specific public datasets.
Participant Demographics
The study analyzed data from human tumor samples and cancer cell lines.