An evaluation of GO annotation retrieval for BioCreAtIvE and GOA (2005)

Sample size: 286 publications
Evidence level: high

Author Information

Author(s): Camon Evelyn B, Barrell Daniel G, Dimmer Emily C, Lee Vivian, Magrane Michele, Maslen John, Binns David, Apweiler Rolf

Primary Institution: European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI)

Hypothesis

Can automatically derived classifications, produced by information retrieval and extraction methods, assist expert biologists in assigning GO terms to proteins?

Conclusion

Improvements in the performance and accuracy of text mining for GO terms are expected in the next BioCreAtIvE challenge.

Supporting Evidence

  • The GOA database's manual extraction of GO annotation from the literature achieves 91–100% precision.
  • Manual curation provides more reliable and detailed GO annotation, but it is slower and more labor-intensive.
  • In BioCreAtIvE task 2, text-mining systems correctly predicted GO terms only 10–20% of the time.

Takeaway

This study examined how well automated text-mining tools can help scientists annotate proteins with information from research papers, and found that while the tools are improving, human curators still produce more accurate annotations.

Methodology

The study involved manual evaluation of GO annotations from literature and comparison with automatic text mining techniques.

Potential Biases

Potential bias in the selection of articles and GO terms due to the manual curation process.

Limitations

The study faced challenges from the low precision of automatic GO term predictions and from its reliance on a single article per annotation.

Digital Object Identifier (DOI)

10.1186/1471-2105-6-S1-S17
