Improving the prediction accuracy in classification using the combined data sets by ranks of gene expressions
2008

Improving Prediction Accuracy in Gene Classification

Sample size: 154
Evidence: moderate

Author Information

Author(s): Kim Ki-Yeol, Ki Dong Hyuk, Jeung Hei-Cheul, Chung Hyun Cheol, Rha Sun Young

Primary Institution: Yonsei University

Hypothesis

Can combining gene expression data sets improve prediction accuracy in classification tasks?

Conclusion

The proposed method improves prediction accuracy by combining gene expression data sets adjusted for systematic biases.

Supporting Evidence

  • The combined data set predicted test data sets more accurately than the separated data sets.
  • Biologically significant genes were detected from the combined data set that were missed in the separated data sets.
  • The method is robust against systematic biases among different gene expression platforms.
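One way to see why ranks can absorb some platform-level bias: ranks within a sample do not change under any strictly increasing distortion of the values, such as a gain, an offset, or a log transform. The sketch below illustrates this with made-up intensities. The values, the helper `ranks`, and the specific distortions are assumptions for illustration only, not data or code from the study, and biases that reorder genes within a sample are not removed this way.

```python
import math

def ranks(values):
    """Rank of each value within the list (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

# Hypothetical intensities for four genes in one sample.
raw = [120.0, 950.0, 430.0, 60.0]
# The same sample as it might appear with a different gain and offset.
scaled = [2.5 * v + 100.0 for v in raw]
# The same sample reported on a log2 scale.
logged = [math.log2(v) for v in raw]

# All three versions share one rank vector.
same_ranks = ranks(raw) == ranks(scaled) == ranks(logged)
```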

Takeaway

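This study shows that combining gene expression data sets from different platforms, after converting expression values to ranks, gives more accurate classification of test samples and reveals biologically significant genes that the separate data sets miss.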

Methodology

The method transforms each gene expression data set into ranks, combines the rank-transformed data sets, and then applies a nonparametric statistical method to select genes.
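The pipeline described here can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: ranking is done within each sample, the Mann-Whitney U statistic stands in for the unspecified nonparametric method, and the two small data sets, their labels, and all function names are hypothetical.

```python
def rank_transform(sample):
    """Replace each value by its rank within the sample (1 = lowest).
    Tied values share the average of their ranks."""
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0.0] * len(sample)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and sample[order[j + 1]] == sample[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """U statistic: number of (x, y) pairs with x > y; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def gene_scores(ranked_samples, labels):
    """Per-gene separation score in [0, 0.5]; 0 means no separation.
    Assumption: Mann-Whitney U is a stand-in for the study's method."""
    n_genes = len(ranked_samples[0])
    scores = []
    for g in range(n_genes):
        tumor = [s[g] for s, lab in zip(ranked_samples, labels) if lab == "tumor"]
        normal = [s[g] for s, lab in zip(ranked_samples, labels) if lab == "normal"]
        u = mann_whitney_u(tumor, normal)
        scores.append(abs(u / (len(tumor) * len(normal)) - 0.5))
    return scores

# Hypothetical data: rows are samples, columns are the same three genes.
platform_a = [[2.1, 5.0, 3.3], [2.4, 5.2, 3.0], [4.9, 2.2, 3.1], [5.1, 2.0, 3.4]]
labels_a = ["normal", "normal", "tumor", "tumor"]
platform_b = [[150, 900, 400], [980, 120, 450]]  # different intensity scale
labels_b = ["normal", "tumor"]

# Rank within each sample, then pool the rank-transformed data sets.
combined = [rank_transform(s) for s in platform_a + platform_b]
labels = labels_a + labels_b

scores = gene_scores(combined, labels)
# Keep the two genes with the largest separation scores.
selected = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:2]
```

In this toy example the two data sets sit on very different scales, yet after ranking they can be pooled directly, and the first two genes come out as the ones that separate tumor from normal samples.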

Potential Biases

Potential biases from different experimental conditions and RNA sources may still affect results.

Limitations

The study may not account for all types of biases present in gene expression data.

Participant Demographics

The study analyzed 154 colorectal tissue samples: 72 normal and 82 tumor.

Digital Object Identifier (DOI)

10.1186/1471-2105-9-283
