Recognizing Protein and Gene Names from Text
Author Information
Author(s): Zhou GuoDong, Shen Dan, Zhang Jie, Su Jian, Tan SoonHeng
Primary Institution: Institute for Infocomm Research
Hypothesis
Can an ensemble of classifiers improve the recognition of protein and gene names in biomedical texts?
Conclusion
The proposed system achieved the best performance among the competing systems in the BioCreative evaluation, with a balanced F-measure of 82.58 for protein and gene name recognition.
Supporting Evidence
- The system ranked first among the 10 systems in the BioCreative competition.
- It achieved a balanced F-measure of 82.58.
- The ensemble approach combined different classifiers to improve recognition accuracy.
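
The balanced F-measure cited above is the harmonic mean of precision and recall, and a score of 82.58 corresponds to 0.8258 on a 0-to-1 scale. The following is a minimal sketch of that calculation, not the evaluation script used in the competition. The function name and the counts are hypothetical and chosen only for illustration.

```python
def balanced_f_measure(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) computed from name-level counts."""
    predicted = true_positives + false_positives   # names the system proposed
    actual = true_positives + false_negatives      # names in the gold standard
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced (beta = 1) F-measure: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only; these are not results from the paper.
precision, recall, f1 = balanced_f_measure(80, 20, 20)
```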
Takeaway
This study built a system that automatically identifies the names of proteins and genes in scientific texts, making it easier for scientists to find important information.
Methodology
The system combines an ensemble of classifiers, including a support vector machine (SVM) and discriminative hidden Markov models (DHMMs), with post-processing modules that further improve performance.
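
One common way to combine an ensemble's outputs is per-token majority voting. The sketch below illustrates that idea only. It assumes each classifier emits one label per token, and the combination rule, the fallback to the first classifier, the function name, and the example labels are all assumptions made for illustration rather than details taken from this summary.

```python
from collections import Counter


def majority_vote(predictions, fallback_index=0):
    """Combine per-token label sequences from several classifiers.

    predictions: list of label sequences, one per classifier, all the same length.
    fallback_index: classifier to trust when no label wins a strict majority
                    (an assumption of this sketch).
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all classifiers must label the same tokens")

    combined = []
    for labels in zip(*predictions):
        label, count = Counter(labels).most_common(1)[0]
        if count * 2 > len(labels):
            combined.append(label)                   # strict majority wins
        else:
            combined.append(labels[fallback_index])  # no majority: use fallback
    return combined


# Hypothetical B/I/O tags from three classifiers over four tokens.
svm_labels = ["B", "I", "O", "O"]
dhmm1_labels = ["B", "O", "O", "B"]
dhmm2_labels = ["B", "I", "O", "O"]
combined = majority_vote([svm_labels, dhmm1_labels, dhmm2_labels])
```

With an odd number of classifiers and few label types, ties are rare, so the fallback rule matters mainly when all classifiers disagree.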
Limitations
The system's performance may be affected by the ambiguity in public resources and the complexity of biomedical names.