Comparative Study of Clustering Methods for Cancer Gene Expression Data

Sample size: 35 | Publication | 10 minutes | Evidence: high

Author Information

Author(s): de Souto Marcilio CP, Costa Ivan G, Araujo Daniel SA, Ludermir Teresa B, Schliep Alexander

Primary Institution: Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany

Which clustering methods perform best for analyzing cancer gene expression data?

The finite mixture of Gaussians and k-means methods showed the best performance in recovering the true structure of cancer gene expression data sets.

The finite mixture of Gaussians and k-means methods exhibited the best performance in recovering the true structure of the data sets.
Hierarchical methods showed poorer recovery performance compared to other methods evaluated.
A common group of benchmark data sets was provided for future comparisons of clustering methods.
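Recovery of the true structure, as described in the findings above, is usually scored by comparing a method's cluster assignments against known class labels. The summary does not say which agreement index was used, so the sketch below assumes the adjusted (corrected) Rand index, a common choice for this purpose. The function name and the toy labels are illustrative and are not taken from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, cluster_labels):
    """Chance-corrected agreement between two partitions of the same samples.

    Assumption: the adjusted Rand index stands in for the study's
    recovery measure, which the summary does not name.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    if n < 2:
        return 1.0  # no sample pairs to disagree on

    # Contingency counts: how many samples share each (class, cluster) pair.
    cell_counts = Counter(zip(true_labels, cluster_labels))
    class_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)

    # Number of sample pairs placed together under each view.
    pairs_both = sum(comb(c, 2) for c in cell_counts.values())
    pairs_class = sum(comb(c, 2) for c in class_counts.values())
    pairs_cluster = sum(comb(c, 2) for c in cluster_counts.values())

    expected = pairs_class * pairs_cluster / comb(n, 2)
    maximum = (pairs_class + pairs_cluster) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (pairs_both - expected) / (maximum - expected)


# Hypothetical toy labels: six samples, two known classes, three clusters.
known_classes = [0, 0, 0, 1, 1, 1]
found_clusters = [0, 0, 1, 1, 2, 2]
score = adjusted_rand_index(known_classes, found_clusters)
```

A score of 1 means the clustering reproduces the known classes exactly, regardless of how the clusters are numbered, while a score near 0 means the agreement is no better than chance.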

This study compared different ways of grouping cancer samples by their gene expression and found that some methods are better than others at recovering the known groupings in the data.

The study compared seven clustering methods and four proximity measures using 35 cancer gene expression data sets.
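The choice of proximity measure can change which samples a clustering method treats as similar. The summary does not name the four measures compared, so the sketch below uses Euclidean distance and a Pearson correlation distance purely as assumed examples. The function names and expression values are hypothetical.

```python
from math import sqrt


def euclidean_distance(x, y):
    """Straight-line distance between two expression profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """One minus the Pearson correlation of two expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - covariation / sqrt(spread_x * spread_y)


# Hypothetical profiles with the same shape but different magnitude.
profile_a = [1.0, 2.0, 3.0, 4.0]
profile_b = [10.0, 20.0, 30.0, 40.0]

far_apart = euclidean_distance(profile_a, profile_b)   # large: magnitudes differ
same_shape = pearson_distance(profile_a, profile_b)    # near zero: same trend
```

The two profiles are far apart in Euclidean terms yet nearly identical under the correlation-based measure, which illustrates why a comparison of clustering methods would also vary the proximity measure.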

The reliance on specific clustering methods may introduce bias in interpreting the results.

The study primarily focused on clustering methods and did not explore other potential factors affecting clustering performance.

The data sets included various cancer types from different tissues, but specific demographic details were not provided.

p<0.05
