Clustering DNA Methylation Data Using a New Algorithm
Author Information
Author(s): E. Houseman, B. C. Christensen, R. F. Yeh, C. J. Marsit, M. R. Karagas, M. Wrensch, H. H. Nelson, J. Wiemels, S. Zheng, J. K. Wiencke, K. T. Kelsey
Primary Institution: Harvard School of Public Health
Hypothesis
Can a recursive-partitioning method built on a beta mixture model cluster high-dimensional DNA methylation array data effectively and efficiently?
Conclusion
The proposed method is an effective and computationally efficient way to cluster DNA methylation data.
Supporting Evidence
- The proposed method outperformed nonparametric clustering approaches in simulations.
- The method was more computationally efficient than conventional mixture model methods.
- Clusters identified were associated with tissue type and age.
Takeaway
The researchers created a new way to group DNA methylation data that helps scientists understand how genes are turned on or off in different tissues.
Methodology
The study used a recursive-partitioning algorithm based on a beta mixture model to cluster DNA methylation data from normal tissue samples.
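The idea of recursively partitioning samples with a beta mixture model can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each node fits a two-class beta mixture with an EM-style loop, the split is kept only if it lowers BIC, and samples are then hard-assigned to the two children. The function names (`rpmm`, `two_class_em`, `fit_class`, `simulate_two_groups`), the method-of-moments update used in place of a full maximum-likelihood step, the mean-based initialization, and the `min_size` threshold are all assumptions made for the example.

```python
import math
import random

EPS = 1e-6  # keeps values and moments strictly inside (0, 1)

def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x, with x clamped away from 0 and 1."""
    x = min(max(x, EPS), 1.0 - EPS)
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x) - log_norm

def fit_class(data, weights):
    """Weighted method-of-moments beta fit per locus (an assumed shortcut)."""
    total = sum(weights)
    params = []
    for j in range(len(data[0])):
        m = sum(w * row[j] for w, row in zip(weights, data)) / total
        v = sum(w * (row[j] - m) ** 2 for w, row in zip(weights, data)) / total
        m = min(max(m, EPS), 1.0 - EPS)
        v = min(max(v, EPS), 0.99 * m * (1.0 - m))  # guarantees a, b > 0
        common = m * (1.0 - m) / v - 1.0
        params.append((m * common, (1.0 - m) * common))
    return params

def sample_loglik(row, params):
    """Log-likelihood of one sample, treating loci as independent."""
    return sum(beta_logpdf(x, a, b) for x, (a, b) in zip(row, params))

def two_class_em(data, n_iter=25):
    """Fit a two-class beta mixture; returns (responsibilities, loglik) or None."""
    n = len(data)
    means = [sum(row) / len(row) for row in data]
    cut = sorted(means)[n // 2]
    resp = [1.0 if m >= cut else 0.0 for m in means]  # deterministic start
    loglik = float("-inf")
    for _ in range(n_iter):
        w1 = sum(resp)
        if min(w1, n - w1) < 2.0:  # one class has collapsed: give up on the split
            return None
        p1 = fit_class(data, resp)
        p0 = fit_class(data, [1.0 - r for r in resp])
        pi1 = w1 / n
        loglik, new_resp = 0.0, []
        for row in data:
            l1 = math.log(pi1) + sample_loglik(row, p1)
            l0 = math.log(1.0 - pi1) + sample_loglik(row, p0)
            top = max(l0, l1)
            norm = top + math.log(math.exp(l0 - top) + math.exp(l1 - top))
            loglik += norm
            new_resp.append(math.exp(l1 - norm))
        resp = new_resp
    return resp, loglik

def rpmm(data, ids=None, path="", min_size=5):
    """Recursively split samples; returns {sample index: binary path label}."""
    if ids is None:
        ids = list(range(len(data)))
    labels = {i: path for i in ids}
    n = len(ids)
    if n < 2 * min_size:
        return labels
    subset = [data[i] for i in ids]
    n_loci = len(subset[0])
    one = fit_class(subset, [1.0] * n)
    ll1 = sum(sample_loglik(row, one) for row in subset)
    bic1 = -2.0 * ll1 + 2 * n_loci * math.log(n)
    fit = two_class_em(subset)
    if fit is None:
        return labels
    resp, ll2 = fit
    bic2 = -2.0 * ll2 + (4 * n_loci + 1) * math.log(n)
    left = [i for i, r in zip(ids, resp) if r < 0.5]
    right = [i for i, r in zip(ids, resp) if r >= 0.5]
    if bic2 >= bic1 or min(len(left), len(right)) < min_size:
        return labels  # the split is not supported: this node is a cluster
    labels = rpmm(data, left, path + "0", min_size)
    labels.update(rpmm(data, right, path + "1", min_size))
    return labels

def simulate_two_groups(seed=0, n_per_group=20, n_loci=10):
    """Synthetic data: one mostly unmethylated and one mostly methylated group."""
    rng = random.Random(seed)
    low = [[rng.betavariate(2, 8) for _ in range(n_loci)] for _ in range(n_per_group)]
    high = [[rng.betavariate(8, 2) for _ in range(n_loci)] for _ in range(n_per_group)]
    return low + high
```

Calling `rpmm(simulate_two_groups())` returns one label per sample, where each character of the label records which branch was taken at a split. Hard assignment at each split is a simplification chosen here for brevity; carrying the class membership weights down the tree would be an alternative design.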
Potential Biases
Plate-to-plate variability was noted as a potential source of bias but was not fully addressed.
Limitations
The study did not normalize different plates used in laboratory analysis, which may introduce variability.
Participant Demographics
The study included 217 normal tissue samples of various tissue types, including blood, brain, and placenta, drawn from both adults and newborns.
Statistical Information
P-Value
<0.0001
Statistical Significance
p<0.0001
Digital Object Identifier (DOI)