A Simple Approach for Protein Name Identification
Author Information
Author(s): Katrin Fundel, Daniel Güttler, Ralf Zimmer, Joannis Apostolakis
Primary Institution: Institut für Informatik, Ludwig-Maximilians-Universität München
Hypothesis
Can a simple and efficient approach identify gene and protein names in texts and return database identifiers for matches?
Conclusion
The approach showed high recall and precision, with results close to the best submissions in the BioCreAtIvE evaluation.
Supporting Evidence
- The method achieved F-measures of 0.897 for yeast and 0.764/0.773 for mouse.
- For fly, a post-evaluation run reached an F-measure of 0.768.
- High recall and precision were noted in the BioCreAtIvE assessment.
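The F-measures above combine precision and recall into a single score. As a minimal sketch, assuming the balanced F-measure (the harmonic mean of precision and recall) is the variant being reported, the computation looks like this. The function name `f_measure` is a hypothetical helper, and the inputs in the usage line are illustrative values, not figures from the study.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall.

    Assumption: the summary does not state which F-measure variant was
    used; the balanced form (F1) is assumed here.
    """
    if precision + recall == 0:
        # No correct predictions at all: define the score as 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative inputs only (not values reported in the study).
print(round(f_measure(1.0, 0.5), 3))
```

Because the harmonic mean is dominated by the smaller of the two values, a method needs both high precision and high recall to score well.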
Takeaway
The study presents a method for finding gene and protein names in scientific texts and linking them to database identifiers, which makes it easier for researchers to gather information.
Methodology
The method used synonym lists for gene/protein names and matched them against MEDLINE abstracts using exact text matching.
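The matching step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the identifiers (`DB:0001`, `DB:0002`), the names, the helper functions `build_index` and `find_identifiers`, and the word-boundary rule are all hypothetical.

```python
import re


def build_index(synonyms):
    """Map each synonym string to the set of identifiers that list it."""
    index = {}
    for identifier, names in synonyms.items():
        for name in names:
            index.setdefault(name, set()).add(identifier)
    return index


def find_identifiers(text, index):
    """Return sorted identifiers whose synonyms occur verbatim in the text."""
    found = set()
    for name, identifiers in index.items():
        # Exact, case-sensitive match. The lookarounds reject hits that are
        # embedded in a longer alphanumeric token (an assumed boundary rule).
        pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9])"
        if re.search(pattern, text):
            found |= identifiers
    return sorted(found)


# Hypothetical synonym list: database identifier -> known names.
SYNONYMS = {
    "DB:0001": ["Abc1", "abc protein 1"],
    "DB:0002": ["Xyz2"],
}

abstract = "We show that Abc1 interacts with Xyz2p in vivo."
print(find_identifiers(abstract, build_index(SYNONYMS)))
```

In this sketch `Xyz2` is not reported because it appears only inside the longer token `Xyz2p`. Exact matching of this kind is cheap and precise, but its recall depends entirely on how complete the synonym list is.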
Limitations
The method struggled with the complex nomenclature of fly proteins, leading to lower precision.