Evaluating GO Annotation Retrieval for BioCreAtIvE
Author Information
Author(s): Camon Evelyn B, Barrell Daniel G, Dimmer Emily C, Lee Vivian, Magrane Michele, Maslen John, Binns David, Apweiler Rolf
Primary Institution: European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI)
Hypothesis
Can automatically derived classifications, produced via information retrieval and extraction, assist expert biologists in assigning GO terms to proteins?
Conclusion
Improvements in the performance and accuracy of text mining for GO terms should be expected in the next BioCreAtIvE challenge.
Supporting Evidence
- The GOA database extracts GO annotation from literature with 91 to 100% precision.
- Manual curation provides more reliable and detailed GO annotation but is slower and more labor-intensive.
- BioCreAtIvE task 2 showed that text-mining systems correctly predicted GO terms in only 10 to 20% of cases.
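The precision figures above come from comparing predicted GO terms against what expert curators accepted. A minimal sketch of that comparison, using hypothetical GO IDs and a simple set-based precision measure (not the study's actual evaluation pipeline):

```python
# Hypothetical sketch: precision of text-mined GO term predictions,
# measured against manually curated annotations for the same protein.
# The GO IDs below are illustrative placeholders, not from the study.

def precision(predicted, curated):
    """Fraction of predicted GO terms confirmed by manual curation."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(curated)) / len(predicted)

curated_terms = {"GO:0005634", "GO:0003677", "GO:0006355"}
mined_terms = {"GO:0005634", "GO:0016020"}  # one confirmed, one spurious

print(precision(mined_terms, curated_terms))  # 0.5
```

In practice, evaluation is stricter than exact set overlap: curators also judged whether a predicted term was too general, too specific, or unsupported by the cited article.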
Takeaway
This study assessed how well automated text-mining tools can help scientists annotate proteins with information from research papers, and found that while the tools are improving, human curators still produce more accurate and detailed annotations.
Methodology
The study involved manual evaluation of GO annotations from literature and comparison with automatic text mining techniques.
Potential Biases
Potential bias in the selection of articles and GO terms due to the manual curation process.
Limitations
Automatic GO term prediction suffered from low precision, and annotations were often drawn from a single article per protein.