Application of two machine learning algorithms to genetic association studies in the presence of covariates
2008

Machine Learning in Genetic Studies

Sample size: 512

Evidence: moderate

Author Information

Author(s): Bareng AS Nonyane, Andrea S Foulkes

Primary Institution: University of Massachusetts Amherst

Hypothesis

How do machine learning algorithms perform in genetic association studies when considering covariates?

Conclusion

Both Random Forests and MARS are effective for selecting significant predictors, but the relationships among genes, covariates, and the trait must be handled carefully.

Supporting Evidence

  • Machine learning algorithms can effectively identify significant predictors in genetic studies.
  • Random Forests and MARS showed different strengths in handling covariates.
  • Stratifying by race/ethnicity revealed different SNP rankings in the analysis.

Takeaway

This study looks at how to use computer programs to find connections between genes and traits while considering other factors like race.

Methodology

The authors ran simulation studies and applied the Random Forests and MARS algorithms to genetic data.
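As a rough illustration of the Random Forests half of this approach, the sketch below ranks simulated SNPs and one covariate by importance. It is not the authors' code: the data, the effect sizes, the variable names, and the choice of scikit-learn's RandomForestRegressor are all assumptions made for the example, and MARS is left out.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n, n_snps = 500, 10

# Hypothetical data: genotypes coded 0/1/2, plus one binary covariate.
snps = rng.integers(0, 3, size=(n, n_snps)).astype(float)
covariate = rng.integers(0, 2, size=n).astype(float)

# Assumed trait model: only snp_0 and the covariate have an effect.
trait = 1.5 * snps[:, 0] + 1.0 * covariate + rng.normal(0.0, 1.0, size=n)

# Fit the forest on SNPs and covariate together.
X = np.column_stack([snps, covariate])
names = [f"snp_{j}" for j in range(n_snps)] + ["covariate"]
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X, trait)

# Rank predictors by impurity-based importance, largest first.
ranking = sorted(
    zip(names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name:10s} {importance:.3f}")
```

Here the covariate enters the model as just another predictor. That is one simple way to account for it, and not necessarily the way the study did.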

Potential Biases

Potential biases due to confounding and effect mediation were noted.
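The following sketch shows one way confounding can distort a SNP ranking and how stratifying can change it. Everything in it is hypothetical: the two groups, the carrier frequencies, the effect sizes, and the ranking rule (absolute difference in mean trait between carriers and non-carriers) are assumptions for illustration, not the study's data or method.

```python
import random
from statistics import mean

random.seed(0)

def simulate(n_per_group=2000):
    """snp_a differs in frequency by group but has no effect; snp_b has a real effect."""
    rows = []
    for group, freq_a, baseline in (("g1", 0.1, 0.0), ("g2", 0.6, 2.0)):
        for _ in range(n_per_group):
            snp_a = int(random.random() < freq_a)  # carrier frequency depends on group
            snp_b = int(random.random() < 0.3)     # same frequency in both groups
            trait = baseline + 0.5 * snp_b + random.gauss(0, 1)
            rows.append({"group": group, "snp_a": snp_a, "snp_b": snp_b, "trait": trait})
    return rows

def effect(rows, snp):
    """Absolute difference in mean trait between carriers and non-carriers."""
    carriers = [r["trait"] for r in rows if r[snp] == 1]
    others = [r["trait"] for r in rows if r[snp] == 0]
    return abs(mean(carriers) - mean(others))

def rank(rows, snps=("snp_a", "snp_b")):
    """Order SNPs by apparent effect, largest first."""
    return sorted(snps, key=lambda s: effect(rows, s), reverse=True)

rows = simulate()
pooled_rank = rank(rows)
stratified_rank = {
    g: rank([r for r in rows if r["group"] == g]) for g in ("g1", "g2")
}

print("pooled:", pooled_rank)
print("stratified:", stratified_rank)
```

In the pooled data, snp_a looks strongest only because it tracks group membership and the groups differ in baseline trait level. Within each group, its apparent effect shrinks toward zero and snp_b ranks first.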

Limitations

The study does not consider the impacts of genotyping errors, missing data, and variable penetrance.

Participant Demographics

The study included 512 individuals, with 317 Whites/non-Hispanic, 92 Blacks/non-Hispanic, and 103 Hispanics.

Digital Object Identifier (DOI)

10.1186/1471-2156-9-71
