Evaluation of Gene Mention Finding Methods
Author Information
Author(s): Alexander Yeh, Alexander Morgan, Marc Colosimo, Lynette Hirschman
Primary Institution: The MITRE Corporation
Hypothesis
How effective are different text mining systems at identifying gene mentions in biological literature?
Conclusion
While many teams achieved over 80% F-measure, the results still lag behind those in other domains due to the complexity of gene names.
Supporting Evidence
- 15 teams participated in the evaluation, with many achieving scores over 80% F-measure.
- The results indicate that while performance is improving, challenges remain due to the complexity of gene names.
- Statistical tests showed that some differences in scores were significant, indicating varying performance among teams.
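For readers unfamiliar with the metric, the F-measure quoted above is conventionally the harmonic mean of precision and recall. The following is a minimal sketch assuming the balanced form of the measure; the function name and the counts are hypothetical and are not taken from the evaluation.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if true_positives == 0:
        # No correct gene mentions found, so the score is zero by convention.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the study).
score = f_measure(true_positives=80, false_positives=20, false_negatives=20)
print(f"{score:.2f}")  # prints 0.80
```

With these made-up counts, precision and recall are both 0.80, so the F-measure is also 0.80, which is the level many teams are reported to have exceeded.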
Takeaway
This study looked at how well different computer programs can find gene names in scientific papers. Many did a good job, but the task is still tricky because gene names can be complicated.
Methodology
The evaluation involved 15 teams using various text mining systems to identify gene mentions in annotated sentences from Medline abstracts.
Potential Biases
Teams classified their submissions as open or closed, which may have led to inconsistencies in evaluation.
Limitations
The complexity of gene names and the lack of standardization in evaluation methods made comparisons difficult.
Participant Demographics
Participants included 15 teams from various institutions, but specific demographics were not detailed.
Statistical Information
P-Value
4.9%
Statistical Significance
p<0.05
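This summary does not say which statistical test was used. As one possible illustration of how a difference in F-measure between two systems could be checked against the p<0.05 threshold, here is a sketch of a paired approximate randomization test. The choice of test, the function names, and the per-sentence count format are all assumptions made for this example.

```python
import random


def f_from_counts(counts):
    """F-measure from a list of per-sentence (tp, fp, fn) tuples."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def approx_randomization(system_a, system_b, trials=10000, seed=0):
    """Estimate a two-sided p-value for the F-measure difference.

    system_a and system_b are hypothetical inputs: lists of (tp, fp, fn)
    tuples, one per sentence, aligned so index i is the same sentence.
    """
    rng = random.Random(seed)
    observed = abs(f_from_counts(system_a) - f_from_counts(system_b))
    at_least_as_large = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(system_a, system_b):
            # Under the null hypothesis the two outputs are exchangeable,
            # so swap them for this sentence with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        diff = abs(f_from_counts(shuffled_a) - f_from_counts(shuffled_b))
        if diff >= observed:
            at_least_as_large += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_large + 1) / (trials + 1)
```

A returned value below 0.05 would correspond to the significance level listed above; two systems with identical outputs yield a p-value of 1.0.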