Evaluating Gene Name Recognition Methods
Author Information
Author(s): Jörg Hakenberg, Steffen Bickel, Conrad Plake, Ulf Brefeld, Hagen Zahn, Lukas Faulstich, Ulf Leser, Tobias Scheffer
Primary Institution: Humboldt-Universität zu Berlin
Hypothesis
How can different feature sets improve the recognition of gene and protein names in text?
Conclusion
The study found that systematically evaluating features, here with recursive feature elimination, improves the performance of gene name recognition systems, and that a small fraction of the features accounts for most of that performance.
Supporting Evidence
- The system achieved a precision of 71.4% and a recall of 72.8%.
- Recursive feature elimination improved performance by 0.7%.
- Using fewer than 5% of the features resulted in performance only 2.3% below the maximum.
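For readers who want to relate the precision and recall figures above to a single number, the sketch below shows the standard definitions and the F1 score (the harmonic mean of the two). The function names are illustrative, and the F1 value is derived here from the two reported figures; this summary does not itself report it.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision and recall from counts of predicted vs. annotated gene names."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported figures from the summary, expressed as fractions.
reported_f1 = f1_score(0.714, 0.728)
print(f"F1 derived from reported precision/recall: {reported_f1:.3f}")
```

Because precision and recall are nearly equal here, the derived F1 lies between them, at roughly 72%.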
Takeaway
The researchers created a system to help computers recognize gene names in sentences, and they found that choosing the right features makes a big difference.
Methodology
The study used a Support Vector Machine and recursive feature elimination to evaluate different feature sets for gene name recognition.
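The combination described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear SVM trained with a few steps of batch subgradient descent on the hinge loss, and it ranks features by squared weight, which is the usual criterion for recursive feature elimination with linear models. The feature names, the toy data, and the hyperparameters are all hypothetical.

```python
def train_linear_svm(X, y, epochs=10, lr=0.05, lam=0.01):
    """Linear SVM via batch subgradient descent on the regularized hinge loss.

    The small number of epochs is chosen for the toy example only.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Examples whose margin is below 1 contribute to the hinge subgradient.
        violators = [
            (xi, yi) for xi, yi in zip(X, y)
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1
        ]
        grad_w = [
            lam * w[j] - sum(yi * xi[j] for xi, yi in violators) / n
            for j in range(d)
        ]
        grad_b = -sum(yi for _, yi in violators) / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def recursive_feature_elimination(X, y, names, n_keep):
    """Repeatedly retrain and drop the feature with the smallest squared weight."""
    active = list(range(len(names)))
    eliminated = []
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        eliminated.append(names[active.pop(worst)])
    return [names[j] for j in active], eliminated


# Hypothetical token features, encoded +1 (present) / -1 (absent).
names = ["contains_digit", "is_capitalized", "noise_a", "noise_b"]
X = [
    [1, 1, 1, 1], [1, 1, -1, 1], [1, 1, 1, -1], [1, -1, -1, -1],      # gene-name tokens
    [-1, -1, 1, 1], [-1, -1, -1, 1], [-1, -1, 1, -1], [-1, 1, -1, -1],  # other tokens
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

kept, eliminated = recursive_feature_elimination(X, y, names, n_keep=1)
print("kept:", kept)
print("eliminated (in order):", eliminated)
```

In this toy data the two noise features are uncorrelated with the label, so they are discarded first. A real system would retrain on far larger feature sets and typically remove several features per round to keep the cost manageable.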
Limitations
The study's results may not generalize to all types of biomedical texts due to the specific nature of the training data.