BioCreative II Gene Mention Recognition Overview
Author Information
Author(s): Smith Larry, Tanabe Lorraine K, Ando Rie, Kuo Cheng-Ju, Chung I-Fang, Hsu Chun-Nan, Lin Yu-Shi, Klinger Roman, Friedrich Christoph M, Ganchev Kuzman, Torii Manabu, Liu Hongfang, Haddow Barry, Struble Craig A, Povinelli Richard J, Vlachos Andreas, Baumgartner William A Jr, Hunter Lawrence, Carpenter Bob, Tsai Richard Tzong-Han, Dai Hong-Jie, Liu Feng, Chen Yifei, Sun Chengjie, Katrenko Sophia, Adriaans Pieter, Blaschke Christian, Torres Rafael, Neves Mariana, Nakov Preslav, Divoli Anna, Maña-López Manuel, Mata Jacinto, Wilbur W John
Primary Institution: National Center for Biotechnology Information
Hypothesis
Can different systems effectively identify gene name mentions in scientific text?
Conclusion
The best single system achieved an F1 score of 0.8721; combining the results of all submissions raised the F score to 0.9066.
Supporting Evidence
- Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop.
- The highest achieved F1 score was 0.8721.
- Combining results from all submissions yielded an F score of 0.9066.
- Statistical analysis showed significant differences in performance among the top teams.
- Performance measures included precision, recall, and F score.
- Bootstrap resampling was used to estimate statistical significance.
- Each team was allowed to submit multiple runs for evaluation.
- Annotation guidelines for gene mentions were found to be complex and influenced performance.
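The bootstrap significance estimate mentioned above can be illustrated with a minimal sketch. It assumes a paired bootstrap over sentences, with each system represented by hypothetical per-sentence (true positive, false positive, false negative) counts; the names `f_score` and `paired_bootstrap` are illustrative, and the challenge's exact resampling procedure may differ.

```python
import random


def f_score(counts):
    """Micro-averaged F score from a list of (tp, fp, fn) tuples, one per sentence."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def paired_bootstrap(counts_a, counts_b, n_resamples=1000, seed=0):
    """Estimate how often system A fails to beat system B under resampling.

    Sentences are resampled with replacement, and the same resampled
    sentences are used for both systems (hence "paired"). The returned
    fraction serves as an approximate p-value for "A is better than B".
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both systems must be scored on the same sentences")
    rng = random.Random(seed)
    n = len(counts_a)
    not_better = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        f_a = f_score([counts_a[i] for i in idx])
        f_b = f_score([counts_b[i] for i in idx])
        if f_a <= f_b:
            not_better += 1
    return not_better / n_resamples
```

A small returned value (for example, below 0.05) would suggest that the difference between the two systems is unlikely to be an artifact of the particular test sentences.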
Takeaway
This study shows that many teams can create systems to find gene names in text, and when combined, these systems can do even better.
Methodology
Participants built systems, using a variety of methods, to identify gene name mentions in sentences. The systems were evaluated on precision, recall, and F1 score.
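As a rough illustration of the three measures, the sketch below scores a set of predicted mentions against a gold standard. It assumes each mention is a hypothetical (sentence ID, start offset, end offset) tuple and that only exact span matches count as correct; the function name `precision_recall_f1` is illustrative, and the challenge's own matching rules may be more lenient.

```python
def precision_recall_f1(gold, predicted):
    """Return (precision, recall, F1) for two collections of mention spans.

    Each mention is assumed to be a (sentence_id, start, end) tuple;
    a prediction is correct only if it matches a gold span exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, not taken from the challenge corpus.
gold = {("s1", 0, 4), ("s1", 10, 15), ("s2", 3, 8)}
predicted = {("s1", 0, 4), ("s2", 3, 8), ("s2", 20, 25), ("s3", 1, 2)}
p, r, f = precision_recall_f1(gold, predicted)
```

Here two of the four predictions are correct and two of the three gold mentions are found, so precision is 1/2, recall is 2/3, and F1 is 4/7.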
Potential Biases
The results may not generalize to other datasets due to the specific nature of the training and testing corpora.
Limitations
The absolute effectiveness measures are not meaningful outside the context of the challenge.
Participant Demographics
Nineteen teams participated in the challenge; each was allowed to submit multiple runs.
Statistical Information
P-Value
0.0123
Statistical Significance
p<0.05
Digital Object Identifier (DOI)