Evaluation of text-mining systems for biology: overview of the Second BioCreative community challenge
2008

Evaluation of Text-Mining Systems for Biology

Sample size: 44 teams

Evidence: moderate

Author Information

Author(s): Martin Krallinger, Alexander Morgan, Larry Smith, Florian Leitner, Lorraine Tanabe, John Wilbur, Lynette Hirschman, Alfonso Valencia

Primary Institution: Spanish National Cancer Research Centre (CNIO)

Hypothesis

How effective are text-mining systems in extracting biological information from literature?

Conclusion

The Second BioCreative assessment showed significant improvements in text-mining performance and increased participation compared to the first assessment.

Supporting Evidence

  • The assessment attracted 44 teams, indicating strong interest in text-mining technologies.
  • The best submissions showed improvements in precision and recall for gene mention and normalization tasks.
  • A meta-server for text-mining was developed to integrate results from various systems.
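For readers unfamiliar with the metrics named above, the following is a minimal sketch of how precision, recall, and the balanced F-measure are commonly computed when system output is compared against a gold-standard annotation set. It is not the official BioCreative scoring script. The function name, the (document, gene) pair representation, and the example annotations are all hypothetical, chosen only for illustration.

```python
def precision_recall_f1(predicted, gold):
    """Score predicted annotations against a gold-standard set.

    Both arguments are iterables of hashable items; here each item is a
    hypothetical (document_id, gene_symbol) pair.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)

    # Precision: fraction of predictions that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of gold annotations that were found.
    recall = true_positives / len(gold) if gold else 0.0

    # Balanced F-measure: harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example annotations (not data from the challenge).
gold = {("doc1", "BRCA1"), ("doc1", "TP53"), ("doc2", "EGFR"), ("doc2", "MYC")}
predicted = {("doc1", "BRCA1"), ("doc2", "EGFR"), ("doc2", "KRAS")}

p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy case two of the three predictions are correct and two of the four gold annotations are recovered, so precision is higher than recall. The F-measure combines the two into a single score.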

Takeaway

This study looked at how well computer programs can find important information in biology papers, and it found that they are getting better at it.

Methodology

The study was a community challenge in which 44 teams submitted text-mining systems that were evaluated on three tasks: gene mention recognition, gene normalization, and extraction of protein-protein interactions.

Limitations

The different tasks were not aligned on a common data collection.

Participant Demographics

44 teams from 13 countries participated.

Statistical Information

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/gb-2008-9-s2-s1
