Overview of the BioCreative II Protein-Protein Interaction Annotation Extraction Task
Author Information
Author(s): Martin Krallinger, Florian Leitner, Carlos Rodriguez-Penagos, Alfonso Valencia
Primary Institution: Spanish National Cancer Research Centre (CNIO)
Hypothesis
How do different text-mining techniques compare in extracting protein-protein interactions from literature?
Conclusion
The BioCreative II PPI task compared the performance of text-mining tools for protein interaction extraction. Detecting relevant articles proved comparatively tractable, while extracting interaction pairs and their supporting evidence sentences remained difficult.
Supporting Evidence
- The top-scoring team achieved an F-score of 0.78 in detecting relevant articles.
- Precision of 0.37 and recall of 0.33 were obtained for interaction pair extraction.
- 19% of submissions returned curator-selected sentences as evidence for interactions.
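
The figures above mix an F-score (for article detection) with separate precision and recall values (for interaction pair extraction), which makes the two subtasks hard to compare at a glance. As a rough illustration only, the sketch below combines the reported pair-extraction precision and recall into a balanced F-score. It assumes the standard F-beta definition with beta = 1; the function name `f_score` is hypothetical, and the resulting value is derived arithmetic, not a number reported in this summary.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both inputs are zero: define the score as 0 instead of dividing by zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Precision and recall as listed for interaction pair extraction.
# The balanced F-score is an assumption for illustration, not a reported result.
pair_f1 = f_score(0.37, 0.33)
print(f"{pair_f1:.3f}")  # prints 0.349
```

Under this assumption, pair extraction lands well below the 0.78 reported for article detection, which is consistent with the conclusion that the later subtasks were the harder ones.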
Takeaway
This study looked at how well computers can find information about proteins interacting with each other in scientific papers, and it found that there are still many challenges to overcome.
Methodology
The study involved a community challenge with four subtasks focusing on detecting relevant articles, extracting interaction pairs, identifying detection methods, and retrieving evidence passages.
Potential Biases
Potential biases in training data and the selection of journals for article curation could affect results.
Limitations
Limitations included problems converting full-text formats, incomplete reference databases, and difficulty linking proteins mentioned across sentences.