A robust approach based on Weibull distribution for clustering gene expression data
2011

A New Method for Clustering Gene Expression Data

Sample size: 402 | publication | Evidence: high

Author Information

Author(s): Wang Huakun, Wang Zhenzhen, Li Xia, Gong Binsheng, Feng Lixin, Zhou Ying

Primary Institution: Harbin Medical University

Hypothesis

The Weibull Distribution-based Clustering Method (WDCM) can effectively cluster gene expression data by treating each gene's expression values as a random variable that follows its own Weibull distribution.

Conclusion

The WDCM produces clusters with more consistent functional annotations than traditional methods like k-means and SOM.

Supporting Evidence

  • The WDCM showed higher functional annotation ratios than k-means and SOM.
  • The WDCM can cluster incomplete gene expression data without imputing missing values.
  • The Adjusted Rand Index indicated that WDCM clusters agree more closely with external criteria than clusters from other methods.
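
For readers unfamiliar with the Adjusted Rand Index cited in the last point, the following is a minimal sketch of the standard pair-counting formula. It is not taken from the study; the function name and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items.

    Illustrative helper (name is an assumption, not from the study).
    1.0 means identical partitions; values near 0 mean chance-level agreement.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: the partitions trivially agree

    # Pairs of items placed together in both labelings, in A only, in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every item in a single cluster
    return (together_both - expected) / (maximum - expected)


# Toy example: cluster names differ, but the grouping is the same.
predicted = [0, 0, 1, 1]
reference = ["b", "b", "a", "a"]
score = adjusted_rand_index(predicted, reference)
```

Because the index depends only on which items share a cluster, relabeling the clusters does not change the score.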

Takeaway

This study introduces a new way to group genes based on their expression patterns, which helps scientists understand how genes work together in diseases like cancer.

Methodology

The WDCM treats each gene's expression values as a random variable following a Weibull distribution and groups genes with a dynamic clustering algorithm based on hub nodes.
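
As a rough illustration of the first step only, the sketch below fits a two-parameter Weibull distribution to one gene's expression values by maximum likelihood. This summary does not say how the authors estimate the parameters or how the hub-node clustering works, so the estimator, the function name `fit_weibull`, the bisection bounds, and the choice to skip missing (`None`) or non-positive values are all assumptions.

```python
import math


def fit_weibull(values, lo=1e-3, hi=100.0, iterations=200):
    """Maximum-likelihood (shape, scale) of a two-parameter Weibull.

    Hypothetical helper, not the authors' code. Missing (None) and
    non-positive values are skipped rather than imputed (an assumption).
    """
    logs = [math.log(v) for v in values if v is not None and v > 0]
    if len(logs) < 2 or max(logs) == min(logs):
        raise ValueError("need at least two distinct positive values")
    mean_log = sum(logs) / len(logs)
    max_log = max(logs)

    def score(k):
        # Derivative condition of the log-likelihood with respect to shape k.
        # Powers are rescaled by the largest value to avoid overflow.
        powers = [math.exp(k * (l - max_log)) for l in logs]
        weighted = sum(p * l for p, l in zip(powers, logs))
        return weighted / sum(powers) - 1.0 / k - mean_log

    # score(k) increases with k, so bisection finds its single root.
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = (lo + hi) / 2

    mean_power = sum(math.exp(shape * (l - max_log)) for l in logs) / len(logs)
    scale = math.exp(max_log) * mean_power ** (1.0 / shape)
    return shape, scale


# Toy expression profile with one missing measurement.
shape, scale = fit_weibull([1.2, 0.8, None, 2.5, 1.9, 1.1])
```

Each gene would then be described by its own (shape, scale) pair; how those pairs feed into the clustering step is not covered here.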

Limitations

The method may disregard genes whose expression values do not fit a Weibull distribution.
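
One way to see which genes fit poorly is to measure the gap between a gene's empirical distribution and a candidate Weibull curve. The sketch below computes the Kolmogorov-Smirnov distance for that purpose. The summary does not state which goodness-of-fit criterion the authors use, so this choice, the function names, and the example parameters are assumptions; any cutoff for "does not fit" would have to be chosen separately.

```python
import math


def weibull_cdf(x, shape, scale):
    """CDF of a two-parameter Weibull distribution (0 for x <= 0)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def ks_distance(values, shape, scale):
    """Largest gap between the empirical CDF and the Weibull CDF.

    Hypothetical helper, not the authors' criterion. Missing (None) and
    non-positive values are skipped. Returns a number in [0, 1];
    larger values indicate a poorer fit.
    """
    xs = sorted(v for v in values if v is not None and v > 0)
    n = len(xs)
    if n == 0:
        raise ValueError("no usable values")
    distance = 0.0
    for i, x in enumerate(xs):
        f = weibull_cdf(x, shape, scale)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        distance = max(distance, (i + 1) / n - f, f - i / n)
    return distance


# Toy profile compared against an assumed Weibull(shape=2, scale=1.5).
gap = ks_distance([1.2, 0.8, None, 2.5, 1.9, 1.1], shape=2.0, scale=1.5)
```

Note that when the shape and scale are estimated from the same values being tested, the usual Kolmogorov-Smirnov critical values no longer apply directly, so the distance is best read as a relative ranking of genes rather than a formal test.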

Participant Demographics

The study involved gene expression data from lung cancer, B-cell follicular lymphoma, and bladder carcinoma.

Statistical Information

P-Value

p<0.05

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1748-7188-6-14
