A simple approach for protein name identification: prospects and limits
2005

Sample size: 250 publications

Evidence: moderate

Author Information

Author(s): Katrin Fundel, Daniel Güttler, Ralf Zimmer, Joannis Apostolakis

Primary Institution: Institut für Informatik, Ludwig-Maximilians-Universität München

Hypothesis

Can a simple and efficient approach identify gene and protein names in texts and return database identifiers for matches?

Conclusion

The approach showed high recall and precision, with results close to the best submissions in the BioCreAtIvE evaluation.

Supporting Evidence

  • The method achieved F-measures of 0.897 for yeast and 0.764/0.773 for mouse.
  • For fly, a post-evaluation run reached an F-measure of 0.768.
  • High recall and precision were noted in the BioCreAtIvE assessment.
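
The F-measures above combine precision and recall into one score. The summary does not state which variant was used, so the sketch below assumes the usual balanced F-measure (the harmonic mean of precision and recall). The function name and the counts in the usage note are illustrative and do not come from the study.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-measure (assumed variant) from raw match counts."""
    predicted = true_pos + false_pos   # identifiers the method returned
    expected = true_pos + false_neg    # identifiers in the gold standard
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / expected if expected else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

With made-up counts such as 8 correct identifiers, 2 spurious ones, and 2 missed ones, precision and recall are both 0.8, and so is the F-measure. Because the harmonic mean is pulled toward the lower of the two values, a drop in precision alone is enough to lower the combined score.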

Takeaway

This study created a method to find protein names in scientific texts, which helps researchers gather information more easily.

Methodology

The method used synonym lists for gene/protein names and matched them against MEDLINE abstracts using exact text matching.
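
As a rough illustration of dictionary-based matching, the sketch below looks up every synonym in a text and returns the database identifiers attached to it. It is a minimal sketch under stated assumptions, not the authors' implementation: the identifiers (`ID_0001`, `ID_0002`), the tiny synonym list, and the function names are hypothetical, and the boundary rule (a match may not touch an adjacent letter or digit) is an assumption added to avoid hits inside longer tokens.

```python
import re

# Hypothetical synonym list: database identifier -> known names.
SYNONYMS = {
    "ID_0001": ["CDC28", "cyclin-dependent kinase 1"],
    "ID_0002": ["Trp53", "p53"],
}

def build_index(synonyms):
    """Invert the synonym list so each name maps to its identifiers."""
    index = {}
    for db_id, names in synonyms.items():
        for name in names:
            index.setdefault(name, set()).add(db_id)
    return index

def find_matches(text, index):
    """Return (offset, matched name, identifiers) for each exact hit."""
    hits = []
    for name, ids in index.items():
        # Exact, case-sensitive match that is not embedded in a longer token.
        pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9])"
        for match in re.finditer(pattern, text):
            hits.append((match.start(), match.group(0), sorted(ids)))
    return sorted(hits)

index = build_index(SYNONYMS)
hits = find_matches("Deletion of CDC28 alters p53 levels.", index)
```

Here `hits` holds one entry for `CDC28` and one for `p53`, each paired with its identifier. A name shared by several entries would come back with all of their identifiers, which is one way ambiguous synonyms can reduce precision.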

Limitations

The method struggled with the complex nomenclature of fly proteins, leading to lower precision.

Digital Object Identifier (DOI)

10.1186/1471-2105-6-S1-S15
