Systematic feature evaluation for gene name recognition
2005

Evaluating Gene Name Recognition Methods

Sample size: 10,000 publications

Evidence: moderate

Author Information

Author(s): Jörg Hakenberg, Steffen Bickel, Conrad Plake, Ulf Brefeld, Hagen Zahn, Lukas Faulstich, Ulf Leser, Tobias Scheffer

Primary Institution: Humboldt-Universität zu Berlin

Hypothesis

How can different feature sets improve the recognition of gene and protein names in text?

Conclusion

The study found that systematically evaluating feature sets improves the performance of gene name recognition systems, and that a small fraction of the features accounts for most of that performance.

Supporting Evidence

  • The system achieved a precision of 71.4% and a recall of 72.8%.
  • Recursive feature elimination improved performance by 0.7%.
  • Using fewer than 5% of the features resulted in performance only 2.3% below the maximum.
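
The reported precision and recall can be combined into a single F1 score (their harmonic mean). The sketch below is only a worked calculation from the two figures above; the resulting value is derived here and is not a number quoted in this summary, and the function name `f1_score` is chosen for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the supporting evidence.
reported_f1 = f1_score(0.714, 0.728)
print(f"F1 = {reported_f1:.3f}")  # F1 = 0.721
```

Because the harmonic mean is dominated by the smaller of the two values, a system with balanced precision and recall, as here, has an F1 close to both.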

Takeaway

The researchers created a system to help computers recognize gene names in sentences, and they found that choosing the right features makes a big difference.

Methodology

The study used a Support Vector Machine and recursive feature elimination to evaluate different feature sets for gene name recognition.
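
A minimal sketch of recursive feature elimination with a linear SVM may make the procedure concrete. It is not the authors' implementation: the trainer is a simple subgradient-descent SVM written for this example, and the tokens, the four feature names, and the `in_gene_lexicon` feature are hypothetical. The toy data is built so that `in_gene_lexicon` agrees with the label on every token, which is why it survives elimination.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM via per-example subgradient descent on the
    L2-regularised hinge loss. Labels must be -1 or +1."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            violated = margin < 1
            for j in range(n_features):
                grad = lam * w[j]          # regularisation term
                if violated:
                    grad -= yi * xi[j]     # hinge-loss term
                w[j] -= lr * grad
            if violated:
                b += lr * yi
    return w, b


def recursive_feature_elimination(X, y, names, keep):
    """Repeatedly retrain and drop the feature with the smallest |weight|."""
    active = list(range(len(names)))
    while len(active) > keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return [names[j] for j in active]


# Hypothetical features, encoded as +1 (present) or -1 (absent).
FEATURES = ["in_gene_lexicon", "has_digit", "is_capitalized", "is_long"]

# Toy tokens (assumed for illustration): label +1 = gene/protein name.
X = [
    [+1, +1, -1, -1],  # "p53"
    [-1, +1, +1, -1],  # "H2O"
    [+1, -1, +1, +1],  # "Insulin"
    [-1, -1, -1, +1],  # "binding"
    [+1, +1, +1, +1],  # "BRCA1"
    [-1, -1, +1, -1],  # "The"
]
y = [+1, -1, +1, -1, +1, -1]

selected = recursive_feature_elimination(X, y, FEATURES, keep=1)
print(selected)  # ['in_gene_lexicon']
```

Ranking by absolute weight is the usual criterion for a linear SVM; dropping one feature per round is the simplest schedule, and removing a fixed fraction per round is a common way to speed it up on large feature sets.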

Limitations

The study's results may not generalize to all types of biomedical texts due to the specific nature of the training data.

Digital Object Identifier (DOI)

10.1186/1471-2105-6-S1-S9
