Machine Learning in Genetic Studies
Author Information
Author(s): Bareng AS Nonyane, Andrea S Foulkes
Primary Institution: University of Massachusetts Amherst
Hypothesis
How do machine learning algorithms perform in genetic association studies when considering covariates?
Conclusion
The study concludes that both Random Forests and multivariate adaptive regression splines (MARS) are effective for selecting significant predictors, but that gene-covariate-trait relationships must be handled carefully.
Supporting Evidence
- Machine learning algorithms can effectively identify significant predictors in genetic studies.
- Random Forests and MARS showed different strengths in handling covariates.
- Stratifying by race/ethnicity revealed different SNP rankings in the analysis.
Takeaway
This study looks at how to use computer programs to find connections between genes and traits while considering other factors like race.
Methodology
The authors ran simulation studies and applied the Random Forests and MARS algorithms to genetic data.
Potential Biases
Potential biases due to confounding and effect mediation were noted.
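To make the confounding concern concrete, here is a minimal simulation sketch. It is not the authors' simulation design. The group labels, allele frequencies, and trait means are hypothetical values chosen for illustration. The SNP has no causal effect on the trait, yet a pooled analysis shows an association because group membership shifts both allele frequency and trait mean. Within each stratum the association largely disappears.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_per_group=2000, seed=0):
    """Simulate (group, genotype, trait) rows with a confounded, non-causal SNP.

    Assumption: two hypothetical groups that differ in minor allele
    frequency and in mean trait value. The genotype never enters the
    trait model, so any pooled association is due to confounding alone.
    """
    rng = random.Random(seed)
    # group -> (minor allele frequency, trait mean); illustrative values only
    groups = {"A": (0.1, 0.0), "B": (0.5, 1.0)}
    rows = []
    for group, (freq, trait_mean) in groups.items():
        for _ in range(n_per_group):
            # Genotype = number of minor alleles (0, 1, or 2)
            genotype = int(rng.random() < freq) + int(rng.random() < freq)
            # Trait depends on group only, not on genotype
            trait = rng.gauss(trait_mean, 1.0)
            rows.append((group, genotype, trait))
    return rows


def association(rows):
    """Correlation between genotype and trait for the given rows."""
    return pearson([r[1] for r in rows], [r[2] for r in rows])


rows = simulate()
pooled = association(rows)
stratified = {
    g: association([r for r in rows if r[0] == g]) for g in ("A", "B")
}
print(f"pooled correlation: {pooled:.3f}")
for g, r in stratified.items():
    print(f"group {g} correlation: {r:.3f}")
```

Under these assumed parameters, the pooled correlation is clearly positive while the within-group correlations stay close to zero, which is one reason a stratified analysis can rank SNPs differently from a pooled one.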
Limitations
The study does not consider the effects of genotyping errors, missing data, or variable penetrance.
Participant Demographics
The study included 512 individuals: 317 White/non-Hispanic, 92 Black/non-Hispanic, and 103 Hispanic participants.