New Algorithm for Clustering Gene Expression Data with Clinical Information
Author Information
Author(s): Bushel Pierre R, Wolfinger Russell D, Gibson Greg
Primary Institution: National Center for Toxicogenomics, National Institute of Environmental Health Sciences
Hypothesis
Can the modk-prototypes algorithm effectively cluster gene expression data with clinical chemistry and histopathological evaluations?
Conclusion
The modk-prototypes algorithm clustered the combined data successfully, grouping the heart disease samples with 79% accuracy.
Supporting Evidence
- The modk-prototypes algorithm achieved an accuracy of 79% in clustering heart disease samples.
- The algorithm effectively distinguished between different levels of necrosis in rat liver samples.
- Clustering results were validated using the adjusted Rand index, showing good agreement with histopathological evaluations.
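The adjusted Rand index mentioned above measures how well two partitions of the same samples agree, corrected for chance: 1.0 means identical groupings (up to relabeling) and values near 0 mean chance-level agreement. The following is a minimal sketch of the standard Hubert-Arabie formula in plain Python. The function name and the example labels are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two partitions of the same samples."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same samples")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: the partitions agree trivially

    # Pairs of samples placed together in both partitions (contingency cells),
    # in the reference partition (rows), and in the clustering (columns).
    cells = Counter(zip(labels_true, labels_pred))
    together_both = sum(comb(c, 2) for c in cells.values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every sample in a single group
    return (together_both - expected) / (maximum - expected)


# Hypothetical example: pathology-assigned necrosis levels vs. cluster IDs.
pathology = ["none", "none", "mild", "severe"]
clusters = [0, 0, 1, 1]
ari = adjusted_rand_index(pathology, clusters)
```

Because the index depends only on which samples are grouped together, the cluster IDs do not need to match the pathology label names.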
Takeaway
Researchers created a new way to group data about genes and health together to better understand diseases, and it sorted heart disease samples correctly 79% of the time.
Methodology
The study used the modk-prototypes algorithm to cluster gene expression data alongside clinical and histopathological data.
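To illustrate how numeric measurements and categorical evaluations can be clustered together, the sketch below uses the dissimilarity from the standard k-prototypes algorithm: squared Euclidean distance on the numeric features plus a weighted count of categorical mismatches. This is an assumption made for illustration. The summary does not describe how modk-prototypes modifies this measure, and the function names, the weight `gamma`, and the sample values are all hypothetical.

```python
def mixed_dissimilarity(numeric_a, numeric_b, categorical_a, categorical_b, gamma):
    """Standard k-prototypes dissimilarity (not the authors' modified version)."""
    # Numeric domain (e.g. expression values): squared Euclidean distance.
    numeric_part = sum((x - y) ** 2 for x, y in zip(numeric_a, numeric_b))
    # Categorical domain (e.g. histopathology calls): number of mismatches.
    categorical_part = sum(1 for x, y in zip(categorical_a, categorical_b) if x != y)
    # gamma sets how much a categorical mismatch counts relative to numeric distance.
    return numeric_part + gamma * categorical_part


def nearest_prototype(sample, prototypes, gamma):
    """Return the index of the prototype closest to the sample."""
    numeric, categorical = sample
    distances = [
        mixed_dissimilarity(numeric, proto_numeric, categorical, proto_categorical, gamma)
        for proto_numeric, proto_categorical in prototypes
    ]
    return distances.index(min(distances))


# Hypothetical sample and prototypes: (numeric features, categorical features).
sample = ([1.0, 0.0], ["severe"])
prototypes = [
    ([0.0, 0.0], ["severe"]),  # matches the category, differs numerically
    ([1.0, 0.0], ["none"]),    # matches numerically, differs in category
]

low_weight_choice = nearest_prototype(sample, prototypes, gamma=0.5)
high_weight_choice = nearest_prototype(sample, prototypes, gamma=2.0)
```

With the small weight the sample joins the numerically identical prototype, and with the large weight it joins the prototype that shares its category, so the choice of weight alone can change the cluster assignment.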
Potential Biases
The weights assigned to the different data domains could bias the clustering results.
Limitations
The study may not generalize to all types of data or diseases.
Participant Demographics
The study involved 303 patients from the Cleveland Clinic heart disease database.
Statistical Information
P-Value
0.05
Statistical Significance
p<0.05