BioCreative II Gene Mention Recognition Overview

Sample size: 19 | Publication | 10 minutes | Evidence: high

Author Information

Author(s): Smith Larry, Tanabe Lorraine K, Ando Rie, Kuo Cheng-Ju, Chung I-Fang, Hsu Chun-Nan, Lin Yu-Shi, Klinger Roman, Friedrich Christoph M, Ganchev Kuzman, Torii Manabu, Liu Hongfang, Haddow Barry, Struble Craig A, Povinelli Richard J, Vlachos Andreas, Baumgartner William A Jr, Hunter Lawrence, Carpenter Bob, Tsai Richard Tzong-Han, Dai Hong-Jie, Liu Feng, Chen Yifei, Sun Chengjie, Katrenko Sophia, Adriaans Pieter, Blaschke Christian, Torres Rafael, Neves Mariana, Nakov Preslav, Divoli Anna, Maña-López Manuel, Mata Jacinto, Wilbur W John

Primary Institution: National Center for Biotechnology Information

Hypothesis

Can different systems effectively identify gene name mentions in scientific text?

Conclusion

The best individual submission achieved an F1 score of 0.8721 for gene mention recognition; combining the results of all submissions raised the F1 score to 0.9066.

Supporting Evidence

Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop.
The highest F1 score achieved by a single submission was 0.8721.
Combining results from all submissions yielded an F1 score of 0.9066.
Statistical analysis showed significant differences in performance among the top teams.
Performance measures included precision, recall, and F1 score.
Bootstrap resampling was used to estimate statistical significance.
Each team was allowed to submit multiple runs for evaluation.
Annotation guidelines for gene mentions were found to be complex, and this complexity influenced performance.
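
The bootstrap step listed above can be illustrated with a small sketch. This summary does not describe the exact procedure used in the study, so the details here are assumptions: sentences are resampled with replacement, two systems are compared on the same resampled sentences, and the reported value is the fraction of resamples in which the first system fails to beat the second. The names `f1_from_counts` and `bootstrap_p_value` are hypothetical.

```python
import random
from typing import List, Tuple

# Per-sentence (true positives, false positives, false negatives) for one system.
Counts = Tuple[int, int, int]

def f1_from_counts(rows: List[Counts]) -> float:
    """Pooled F1 over a list of per-sentence counts."""
    tp = sum(r[0] for r in rows)
    fp = sum(r[1] for r in rows)
    fn = sum(r[2] for r in rows)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def bootstrap_p_value(sys_a: List[Counts], sys_b: List[Counts],
                      n_resamples: int = 2000, seed: int = 0) -> float:
    """Fraction of resamples in which system A does not outscore system B.

    Assumption: both lists are aligned, one entry per test sentence.
    """
    if len(sys_a) != len(sys_b):
        raise ValueError("systems must be scored on the same sentences")
    rng = random.Random(seed)
    n = len(sys_a)
    not_better = 0
    for _ in range(n_resamples):
        # Draw n sentence indices with replacement and reuse them for both systems.
        idx = [rng.randrange(n) for _ in range(n)]
        f1_a = f1_from_counts([sys_a[i] for i in idx])
        f1_b = f1_from_counts([sys_b[i] for i in idx])
        if f1_a <= f1_b:
            not_better += 1
    return not_better / n_resamples
```

A small value suggests that the advantage of system A is unlikely to be an artifact of which sentences happened to be in the test set. Reusing the same indices for both systems keeps the comparison paired.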

Takeaway

This study shows that many teams can create systems to find gene names in text, and when combined, these systems can do even better.
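
One simple way to picture how combined systems can outperform individual ones is span-level voting. This is an illustrative stand-in only: the summary does not say how the submissions were combined in the study, and the function name `vote_spans`, the span representation, and the vote threshold are all assumptions.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# A mention is identified by (sentence_id, start_offset, end_offset); this format is assumed.
Span = Tuple[str, int, int]

def vote_spans(submissions: Iterable[Set[Span]], min_votes: int) -> Set[Span]:
    """Keep every span proposed by at least `min_votes` submissions."""
    # Each submission is a set, so a system can vote for a given span only once.
    votes = Counter(span for submission in submissions for span in submission)
    return {span for span, count in votes.items() if count >= min_votes}

runs = [
    {("s1", 0, 4), ("s2", 3, 8)},
    {("s1", 0, 4), ("s2", 3, 9)},
    {("s1", 0, 4), ("s2", 3, 8), ("s3", 1, 2)},
]
combined = vote_spans(runs, min_votes=2)
```

With a threshold of two votes, the spans that only one system proposed are dropped, which tends to trade a little recall for higher precision.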

Methodology

Participants designed systems, using a variety of methods, to identify gene name mentions in sentences. System performance was evaluated by precision, recall, and F1 score.
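
The three measures can be computed from a set of gold-standard mentions and a set of predicted mentions. The sketch below assumes that a prediction counts as correct only when its sentence and character offsets match a gold mention exactly; the challenge's own matching rules are not described in this summary, and the name `precision_recall_f1` and the span format are hypothetical.

```python
from typing import Set, Tuple

# A mention is identified by (sentence_id, start_offset, end_offset); this format is assumed.
Span = Tuple[str, int, int]

def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 for predicted gene mentions."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

gold = {("s1", 0, 4), ("s1", 10, 15), ("s2", 3, 8), ("s3", 0, 5)}
predicted = {("s1", 0, 4), ("s2", 3, 8), ("s2", 20, 25)}
p, r, f = precision_recall_f1(gold, predicted)
```

In this toy example two of three predictions are correct and two of four gold mentions are found, so precision is 2/3, recall is 1/2, and F1 is 4/7.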

Potential Biases

The results may not generalize to other datasets due to the specific nature of the training and testing corpora.

Limitations

The absolute effectiveness measures are not meaningful outside the context of the challenge.

Participant Demographics

Nineteen teams participated in the challenge; each was allowed to submit multiple runs.

Statistical Information

P-Value

0.0123

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/gb-2008-9-s2-s2
