Best holdout assessment is sufficient for cancer transcriptomic model selection
2024

Choosing the Best Cancer Prediction Models

Sample size: 1402; publication; 10 minutes; Evidence: moderate

Author Information

Author(s): Jake Crawford, Maria Chikina, Casey S. Greene

Primary Institution: University of Pennsylvania

Hypothesis

Do smaller gene signatures generalize better than larger ones in cancer transcriptomics?

Conclusion

The study found that the best-performing models on held-out data generalize well across different biological contexts, regardless of model size.

Supporting Evidence

  • Smaller models do not consistently generalize better than larger models.
  • Cross-validation performance is a reliable indicator of model generalization.
  • Results were consistent across different cancer types and datasets.
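
The comparison behind these points can be sketched in a few lines. The example contrasts two selection rules over a set of candidate models: taking the best held-out score, versus taking the smallest model whose score falls within a tolerance of the best. The `Candidate` type, the function names, the tolerance, and every number in the table are hypothetical illustrations, not code or results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One fitted model, summarized by size and held-out score (hypothetical)."""
    n_features: int       # number of genes with nonzero coefficients
    holdout_score: float  # e.g. mean cross-validation score; higher is better


def select_best(candidates):
    """Pick the candidate with the highest held-out score."""
    return max(candidates, key=lambda c: c.holdout_score)


def select_smallest_good(candidates, tolerance):
    """Pick the smallest candidate scoring within `tolerance` of the best."""
    best_score = select_best(candidates).holdout_score
    good = [c for c in candidates if c.holdout_score >= best_score - tolerance]
    return min(good, key=lambda c: c.n_features)


# Made-up scores for illustration only; these are not results from the study.
candidates = [
    Candidate(n_features=12, holdout_score=0.71),
    Candidate(n_features=85, holdout_score=0.78),
    Candidate(n_features=410, holdout_score=0.80),
    Candidate(n_features=2300, holdout_score=0.76),
]

best = select_best(candidates)
smallest_good = select_smallest_good(candidates, tolerance=0.03)
print(best.n_features, smallest_good.n_features)
```

The two rules can disagree, as they do on this toy table. The study's question is whether the smaller choice transfers better to new biological contexts, and its answer, per the points above, is that it does not do so consistently.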

Takeaway

This study shows that when building cancer prediction models from gene expression data, it is better to pick the model that performs best on held-out data than to default to the smallest model.

Methodology

The study used LASSO logistic regression and neural networks to evaluate model generalization across datasets and cancer types.
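
As a rough illustration of the LASSO part of this setup, the sketch below fits L1-penalized logistic regression at several penalty strengths, records how many features each fit keeps, and selects on held-out accuracy alone. Everything here is an assumption made for illustration: the synthetic data, the pure-Python proximal gradient solver, the penalty grid, and the single train/holdout split. The study's actual implementation and evaluation scheme are not described in this summary, and the neural network models are not covered.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, penalty, lr=0.1, n_iter=200):
    """Proximal gradient descent for L1-penalized logistic regression."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * penalty) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def accuracy(w, b, X, y):
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]
    return sum(pred == true for pred, true in zip(preds, y)) / len(y)


# Synthetic stand-in for expression data: only the first 3 of 10 features matter.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(150)]
y = [1 if xi[0] + xi[1] - xi[2] + rng.gauss(0, 0.5) > 0 else 0 for xi in X]
X_train, y_train, X_hold, y_hold = X[:100], y[:100], X[100:], y[100:]

results = []
for penalty in [0.3, 0.1, 0.03, 0.01]:  # stronger penalty -> sparser model
    w, b = fit_l1_logistic(X_train, y_train, penalty)
    n_nonzero = sum(wj != 0.0 for wj in w)
    results.append((penalty, n_nonzero, accuracy(w, b, X_hold, y_hold)))

# Select on held-out accuracy alone, ignoring model size.
best_penalty, best_size, best_acc = max(results, key=lambda r: r[2])
print(results)
print("selected:", best_penalty, best_size, best_acc)
```

Each entry in `results` pairs a penalty strength with the resulting model size and held-out accuracy, so the size of the selected model is an outcome of the selection rather than a criterion for it.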

Limitations

The findings may not extend to all cancer types or datasets, because the analysis was limited to specific public datasets.

Participant Demographics

The study analyzed data from human tumor samples and cancer cell lines.

Digital Object Identifier (DOI)

10.1016/j.patter.2024.101115
