Improving Protein Function Prediction with Literature Data
Author Information
Author(s): Gabow Aaron P, Leach Sonia M, Baumgartner William A, Hunter Lawrence E, Goldberg Debra S
Primary Institution: Department of Pharmacology, University of Colorado at Denver and Health Sciences Center
Hypothesis
Can literature co-occurrence data improve the accuracy of protein function prediction algorithms?
Conclusion
Co-occurrence data significantly enhances the performance of graph-theoretic function prediction algorithms.
Supporting Evidence
- Co-occurrence data improves function prediction across multiple organisms.
- Supplementing protein-protein interactions with co-occurrence data outperforms supplementing them with genetic interaction data.
- Co-occurrence data provides critical links to well-studied regions in interaction networks.
Takeaway
This study shows that using information from scientific papers can help scientists figure out what unknown proteins do, making predictions more accurate.
Methodology
The study used graph-theoretic approaches to predict protein functions by integrating literature co-occurrence data with existing protein-protein interaction networks.
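As a rough illustration of this kind of integration, the sketch below merges co-occurrence edges into a protein-protein interaction graph and predicts a function by majority vote among annotated direct neighbors. It is a minimal sketch under stated assumptions, not the study's algorithm: the majority-vote rule stands in for the graph-theoretic methods the authors used, and all protein names, annotations, abstracts, and function names (cooccurrence_edges, build_graph, predict_function) are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_edges(abstracts, min_shared=1):
    """Return protein pairs mentioned together in at least min_shared abstracts."""
    counts = Counter()
    for proteins in abstracts:
        # Sort so each unordered pair is always counted under the same key.
        for pair in combinations(sorted(set(proteins)), 2):
            counts[pair] += 1
    return {pair for pair, n in counts.items() if n >= min_shared}


def build_graph(*edge_sets):
    """Merge any number of undirected edge sets into one adjacency map."""
    graph = defaultdict(set)
    for edges in edge_sets:
        for a, b in edges:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def predict_function(graph, annotations, protein):
    """Majority vote over the annotations of the protein's direct neighbors.

    This simple neighbor-voting rule is an assumption chosen for brevity.
    """
    votes = Counter()
    for neighbor in graph.get(protein, ()):
        votes.update(annotations.get(neighbor, ()))
    return votes.most_common(1)[0][0] if votes else None


# Toy, made-up data: "U" is the uncharacterized protein.
ppi_edges = {("U", "A")}  # A has no known annotation
abstracts = [["U", "K1"], ["U", "K1", "K2"], ["A", "B"]]
annotations = {
    "K1": {"kinase activity"},
    "K2": {"kinase activity"},
    "B": {"transport"},
}

ppi_only = build_graph(ppi_edges)
combined = build_graph(ppi_edges, cooccurrence_edges(abstracts))

print(predict_function(ppi_only, annotations, "U"))
print(predict_function(combined, annotations, "U"))
```

In this toy network the interaction data alone links U only to an unannotated protein, so no prediction is possible, while the added co-occurrence edges connect U to annotated neighbors.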
Potential Biases
There is a risk of false positives when asserting co-occurrence based on literature mentions.
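One way to picture how such false positives might be reduced is to keep a protein pair only when the abstracts mentioning both proteins make up a sufficient share of the abstracts mentioning either. The sketch below uses the Jaccard index for this; that score, the threshold value, the function name jaccard_filtered_pairs, and the toy abstracts are all assumptions for illustration and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def jaccard_filtered_pairs(abstracts, min_jaccard):
    """Score each co-mentioned pair by Jaccard index and keep those at or above the threshold."""
    mentions = Counter()  # abstracts mentioning each protein
    shared = Counter()    # abstracts mentioning both proteins of a pair
    for proteins in abstracts:
        unique = sorted(set(proteins))
        mentions.update(unique)
        shared.update(combinations(unique, 2))

    kept = {}
    for (a, b), both in shared.items():
        either = mentions[a] + mentions[b] - both  # size of the union
        score = both / either
        if score >= min_jaccard:
            kept[(a, b)] = score
    return kept


# Toy, made-up abstracts: A and B usually appear together,
# while A and C share a single abstract among many.
abstracts = [["A", "B"], ["A", "B"], ["A", "C"], ["C"], ["C"], ["C"]]

print(jaccard_filtered_pairs(abstracts, 0.0))  # every co-mentioned pair
print(jaccard_filtered_pairs(abstracts, 0.5))  # only well-supported pairs
```

With a threshold of zero, a single shared abstract is enough to assert a link; raising the threshold trades coverage for fewer spurious edges.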
Limitations
The study may be biased towards well-studied proteins due to the reliance on literature data.