Improving Prediction Accuracy in Gene Classification
Author Information
Author(s): Kim Ki-Yeol, Ki Dong Hyuk, Jeung Hei-Cheul, Chung Hyun Cheol, Rha Sun Young
Primary Institution: Yonsei University
Hypothesis
Can combining gene expression data sets improve prediction accuracy in classification tasks?
Conclusion
The proposed method improves prediction accuracy by combining gene expression data sets adjusted for systematic biases.
Supporting Evidence
- The combined data set predicted test data sets more accurately than the individual, uncombined data sets did.
- Biologically significant genes that were missed in the individual data sets were detected in the combined data set.
- The method is robust against systematic biases among different gene expression platforms.
Takeaway
This study shows that combining gene expression data sets from different sources, after adjusting for systematic biases, improves classification accuracy and helps detect cancer-related genes that the individual data sets miss.
Methodology
The method transforms each gene expression data set into ranks, combines the rank-transformed data sets, and then applies a nonparametric statistical method to select genes.
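The rank-then-combine idea can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say how ranks are computed or which nonparametric test is used, so the sketch assumes ranks are taken across genes within each sample and uses the Mann-Whitney U statistic as a stand-in for gene selection. The toy values, the labels, and the function names (`rank_transform`, `rank_within_samples`, `mann_whitney_u`, `select_genes`) are all hypothetical.

```python
from typing import List, Sequence


def rank_transform(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_within_samples(dataset: Sequence[Sequence[float]]) -> List[List[float]]:
    """Assumption: each sample's genes are ranked against each other."""
    return [rank_transform(sample) for sample in dataset]


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for group x versus group y, from pooled average ranks."""
    pooled = rank_transform(list(x) + list(y))
    rank_sum_x = sum(pooled[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


def select_genes(samples, labels, top_k: int) -> List[int]:
    """Return indices of the top_k genes whose U is farthest from its null mean."""
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        tumor = [s[g] for s, lab in zip(samples, labels) if lab == "tumor"]
        normal = [s[g] for s, lab in zip(samples, labels) if lab == "normal"]
        u = mann_whitney_u(tumor, normal)
        scores.append(abs(u - len(tumor) * len(normal) / 2))
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return ranked[:top_k]


# Hypothetical toy data: two platforms measuring the same 3 genes on different scales.
platform_a = [[5.0, 1.0, 3.0], [6.0, 2.0, 3.5], [1.0, 5.0, 3.0], [2.0, 6.0, 3.2]]
labels_a = ["tumor", "tumor", "normal", "normal"]
platform_b = [[900.0, 100.0, 500.0], [120.0, 950.0, 480.0]]
labels_b = ["tumor", "normal"]

# Rank each data set separately, then pool the rank-transformed samples.
combined = rank_within_samples(platform_a) + rank_within_samples(platform_b)
labels = labels_a + labels_b

selected = select_genes(combined, labels, top_k=2)
print(selected)
```

Because ranks depend only on the ordering of values, the two toy platforms become directly comparable after the transform even though their raw scales differ by two orders of magnitude. In this toy data, genes 0 and 1 separate the two classes and gene 2 does not.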
Potential Biases
Biases arising from different experimental conditions and RNA sources may still affect the results.
Limitations
The study may not account for all types of biases present in gene expression data.
Participant Demographics
The study used 154 colorectal tissue samples: 72 normal and 82 tumor.