The C1C2 Framework for Model Selection and Assessment
Author Information
Author(s): Eklund Martin, Spjuth Ola, Wikberg Jarl ES
Primary Institution: Uppsala University
Hypothesis
Can the C1C2 framework improve model selection and assessment in predictive modeling?
Conclusion
The C1C2 framework effectively identifies the true model and accurately assesses generalization error, even in complex datasets.
Supporting Evidence
- The C1C2 framework was shown to perform well in identifying the correct variable subset.
- It provided accurate estimates of generalization error even with highly correlated independent variables.
- Using prior knowledge about relevant variables improved model choice but made the generalization error estimate less accurate.
- The C1C2 framework outperformed repeated K-fold cross-validation in assessing generalization error.
Takeaway
The C1C2 framework helps scientists choose the best model for their data and check how well it works, even when the data are difficult to model, for example when the variables are highly correlated.
Methodology
The C1C2 framework separates model selection from assessment using data partitioning and employs genetic algorithms and brute-force methods for model choice.
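The general idea of keeping model selection and model assessment on separate data partitions can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' C1C2 procedure: the function names (`select_and_assess`, `cv_error`), the ordinary least-squares model, the 70/30 split, and the 5-fold inner cross-validation are all hypothetical choices made for the example. Only brute-force subset search is shown; a genetic algorithm would replace the exhaustive loop when the number of variables is too large to enumerate.

```python
import itertools
import numpy as np

def fit_ols(X, y):
    # Least-squares fit with an intercept column prepended.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mse(coef, X, y):
    # Mean squared prediction error of a fitted model.
    A = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((A @ coef - y) ** 2))

def cv_error(X, y, folds):
    # K-fold cross-validated error for one candidate variable subset.
    errs = []
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        errs.append(mse(fit_ols(X[train], y[train]), X[test], y[test]))
    return float(np.mean(errs))

def select_and_assess(X, y, max_vars=2, k=5, holdout=0.3, seed=0):
    # Assumption: a single random split; the assessment part is never
    # touched while candidate subsets are compared.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_hold = int(round(holdout * len(y)))
    assess, select = idx[:n_hold], idx[n_hold:]
    X_sel, y_sel = X[select], y[select]

    # The same folds are reused for every subset so the comparison is fair.
    folds = np.array_split(rng.permutation(len(y_sel)), k)

    # Brute-force search over all subsets of up to max_vars variables.
    best_subset, best_err = None, np.inf
    for size in range(1, max_vars + 1):
        for subset in itertools.combinations(range(X.shape[1]), size):
            err = cv_error(X_sel[:, list(subset)], y_sel, folds)
            if err < best_err:
                best_subset, best_err = subset, err

    # Refit the chosen subset on the selection data, then estimate the
    # generalization error on the untouched assessment data.
    cols = list(best_subset)
    coef = fit_ols(X_sel[:, cols], y_sel)
    gen_err = mse(coef, X[assess][:, cols], y[assess])
    return best_subset, gen_err
```

Calling `select_and_assess(X, y)` on a NumPy feature matrix and response vector returns the chosen variable indices and an error estimate computed only from data that played no part in the choice. Reporting the inner cross-validation error of the winning subset instead would tend to be optimistic, because that error was itself minimized during selection.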
Potential Biases
Possible overfitting due to model complexity, and reliance on assumptions about which variables are relevant.
Limitations
The framework's performance may vary with different datasets and assumptions about the number of relevant variables.
Statistical Information
P-Value
p<2.2 × 10^-16
Statistical Significance
p<0.05