Integrating Multiple Data Sources for Gene Prioritization
Author Information
Author(s): Chen Yixuan, Wang Wenhui, Zhou Yingyao, Shields Robert, Chanda Sumit K., Elston Robert C., Li Jing
Primary Institution: Case Western Reserve University
Hypothesis
Can integrating multiple heterogeneous data sources improve the prioritization of candidate disease genes?
Conclusion
The proposed framework for gene prioritization consistently outperforms existing methods by integrating multiple data sources.
Supporting Evidence
- The proposed method was validated using a large-scale cross-validation analysis on 110 disease families.
- Results showed that the integrated approach outperformed existing state-of-the-art programs.
- A case study on Parkinson disease identified four candidate genes involved in the disease pathway.
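The cross-validation mentioned above can be illustrated with a small sketch. The summary does not state the exact design, so this assumes a leave-one-out scheme: each known disease gene is held out in turn, the remaining genes of its family act as seeds, and the held-out gene is ranked against a set of decoy genes. The gene names, the similarity table, and the scoring rule are all hypothetical and are not taken from the study.

```python
# Leave-one-out evaluation sketch (assumed design, toy data).

def similarity(sim, a, b):
    """Look up a symmetric gene-gene similarity; missing pairs count as 0."""
    return sim.get((a, b), sim.get((b, a), 0.0))

def score(gene, seeds, sim):
    """Hypothetical scoring rule: best similarity to any seed gene."""
    return max((similarity(sim, gene, s) for s in seeds), default=0.0)

def leave_one_out_ranks(families, decoys, sim):
    """Return the rank of every held-out gene among the decoys (1 = best)."""
    ranks = []
    for known in families.values():
        for held_out in known:
            seeds = [g for g in known if g != held_out]
            target = score(held_out, seeds, sim)
            # Rank = 1 + number of decoys that score strictly higher.
            better = sum(score(d, seeds, sim) > target for d in decoys)
            ranks.append(1 + better)
    return ranks

# Toy inputs: two disease families and two decoy genes (all hypothetical).
families = {"disease_X": ["G1", "G2"], "disease_Y": ["G3", "G4"]}
decoys = ["D1", "D2"]
sim = {
    ("G1", "G2"): 0.8,
    ("G3", "G4"): 0.3,
    ("D1", "G4"): 0.5,
    ("D2", "G1"): 0.1,
}

ranks = leave_one_out_ranks(families, decoys, sim)
top1_fraction = sum(r == 1 for r in ranks) / len(ranks)
print(ranks, top1_fraction)
```

Summaries such as the fraction of held-out genes ranked first make it possible to compare prioritization methods on the same families.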
Takeaway
This study shows that combining different types of data helps scientists find disease-related genes more reliably than relying on a single type of data.
Methodology
The study used a framework that integrates gene-gene and gene-disease relationships from multiple data sources to rank candidate genes.
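As a rough illustration of the idea, the sketch below ranks candidate genes by averaging evidence across several sources. It is not the authors' algorithm: the equal-weight average, the function names, the gene identifiers, and the example scores are all assumptions made for illustration. Gene-gene sources are modeled as pairwise similarity tables, and gene-disease sources as per-gene evidence scores.

```python
# Minimal sketch of multi-source candidate ranking (assumed scheme, toy data).

def pair_score(source, a, b):
    """Symmetric lookup in a gene-gene source; missing pairs count as 0."""
    return source.get((a, b), source.get((b, a), 0.0))

def rank_candidates(candidates, seed_genes, gene_gene_sources, gene_disease_sources):
    """Score each candidate by the mean of its per-source evidence, then sort."""
    scores = {}
    for gene in candidates:
        parts = []
        # Gene-gene evidence: strongest link to any known disease (seed) gene.
        for src in gene_gene_sources:
            parts.append(max((pair_score(src, gene, s) for s in seed_genes), default=0.0))
        # Gene-disease evidence: direct score for this gene, 0 if absent.
        for src in gene_disease_sources:
            parts.append(src.get(gene, 0.0))
        scores[gene] = sum(parts) / len(parts) if parts else 0.0
    ranking = sorted(candidates, key=lambda g: scores[g], reverse=True)
    return ranking, scores

# Hypothetical sources, all scaled to [0, 1].
interaction = {("GENE_A", "SEED_1"): 0.9, ("GENE_B", "SEED_1"): 0.2}
coexpression = {("SEED_1", "GENE_A"): 0.6, ("GENE_C", "SEED_1"): 0.4}
literature = {"GENE_B": 0.5, "GENE_C": 0.1}

ranking, scores = rank_candidates(
    candidates=["GENE_A", "GENE_B", "GENE_C"],
    seed_genes=["SEED_1"],
    gene_gene_sources=[interaction, coexpression],
    gene_disease_sources=[literature],
)
print(ranking)
```

Averaging across sources means a gap in one source lowers a gene's score without eliminating it, which is one simple way to soften the effect of incomplete data.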
Potential Biases
Incomplete or noisy data in individual sources may bias the resulting gene rankings.
Limitations
The results may be limited by the quality and completeness of the data sources used.
Participant Demographics
The study analyzed data from 110 disease families.
Statistical Information
P-Value
p<0.05
Statistical Significance
p<0.05