ChemicalTagger: A tool for semantic text-mining in chemistry
2011

Sample size: 10,000 publications | 10 minutes | Evidence: high

Author Information

Author(s): Hawizy Lezan, Jessop David M, Adams Nico, Murray-Rust Peter

Primary Institution: Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge

Hypothesis

Can structured scientific data be extracted from unstructured scientific literature using text-mining techniques?

Conclusion

ChemicalTagger can effectively parse chemical experimental text and has been successfully deployed to identify solvents with over 99.5% precision.

Supporting Evidence

  • ChemicalTagger achieved machine-annotator agreements of 88.9% for phrase recognition.
  • The tool has been deployed for over 10,000 patents.
  • It identified solvents from their linguistic context with >99.5% precision.
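As a rough illustration of what identifying solvents "from their linguistic context" and measuring precision can mean, the sketch below uses a hand-written cue pattern and invented sentences. The pattern, the function names `find_solvents` and `precision`, and the example data are all assumptions made for this illustration; they are not ChemicalTagger's actual rules or evaluation data. Precision is the fraction of predicted solvents that are correct, TP / (TP + FP).

```python
import re
from typing import List, Set

# Hypothetical context cues: a word that follows "dissolved in",
# "solution in" or "suspended in" is proposed as a solvent.
SOLVENT_CONTEXT = re.compile(
    r"\b(?:dissolved|solution|suspended)\s+in\s+(?:dry\s+|anhydrous\s+)?([A-Za-z]+)"
)


def find_solvents(sentence: str) -> List[str]:
    """Return lower-cased solvent candidates found via context cues."""
    return [name.lower() for name in SOLVENT_CONTEXT.findall(sentence)]


def precision(predicted: List[str], gold: Set[str]) -> float:
    """Precision = true positives / all predictions (0.0 if none)."""
    if not predicted:
        return 0.0
    true_positives = sum(1 for name in predicted if name in gold)
    return true_positives / len(predicted)


# Invented example sentences, not taken from the paper's corpus.
sentences = [
    "The residue was dissolved in ethanol.",
    "A solution in dry THF was cooled.",
    "The salt was dissolved in portions.",  # yields a false positive
]
predicted = [s for sentence in sentences for s in find_solvents(sentence)]
gold = {"ethanol", "thf"}
score = precision(predicted, gold)
```

Here the third sentence produces the false positive "portions", so two of the three predictions are correct. A context-only rule of this kind is deliberately naive; it only shows how a precision figure is computed.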

Takeaway

ChemicalTagger is a computer program that helps scientists find and organize information from chemistry papers, making that information easier to understand and use.

Methodology

ChemicalTagger uses a modular architecture combining NLP techniques to tag and parse chemical texts.
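A modular tag-then-parse design can be sketched as a chain of small, independent taggers followed by a simple phrase grouper. The code below is a minimal Python sketch of that general idea only. The tag names, word lists, and functions (`tag`, `parse_quantities`, and the individual taggers) are hypothetical and do not reproduce ChemicalTagger's own components or output format.

```python
import re
from typing import Callable, List, Optional, Tuple

Tagger = Callable[[str], Optional[str]]


def quantity_tagger(token: str) -> Optional[str]:
    """Tag bare numbers as cardinal quantities."""
    return "CD" if re.fullmatch(r"\d+(?:\.\d+)?", token) else None


def unit_tagger(token: str) -> Optional[str]:
    """Tag a few common units (toy list, illustrative only)."""
    return "UNIT" if token in {"mg", "g", "mL", "mmol", "h"} else None


def action_tagger(token: str) -> Optional[str]:
    """Tag a few experimental action verbs (toy list)."""
    actions = {"added", "stirred", "dissolved", "heated"}
    return "ACTION" if token.lower() in actions else None


def chemical_tagger(token: str) -> Optional[str]:
    """Toy lexicon standing in for a chemical entity recogniser."""
    return "CHEM" if token.lower() in {"ethanol", "thf", "water"} else None


def default_tagger(token: str) -> Optional[str]:
    """Fallback: prepositions get IN, everything else gets X."""
    return "IN" if token.lower() in {"in", "to", "at", "for"} else "X"


# Modules are tried in order; the first one that returns a tag wins.
PIPELINE: List[Tagger] = [
    quantity_tagger, unit_tagger, action_tagger, chemical_tagger, default_tagger,
]


def tokenize(text: str) -> List[str]:
    """Split text into number and word tokens, dropping punctuation."""
    return re.findall(r"\d+(?:\.\d+)?|[A-Za-z]+", text)


def tag(text: str, taggers: List[Tagger] = PIPELINE) -> List[Tuple[str, str]]:
    """Tag each token with the first non-None label from the pipeline."""
    tagged = []
    for token in tokenize(text):
        for tagger in taggers:
            label = tagger(token)
            if label is not None:
                tagged.append((token, label))
                break
    return tagged


def parse_quantities(tagged: List[Tuple[str, str]]) -> List[str]:
    """Group each adjacent CD + UNIT pair into one quantity phrase."""
    return [
        f"{tok} {nxt}"
        for (tok, lab), (nxt, nxt_lab) in zip(tagged, tagged[1:])
        if lab == "CD" and nxt_lab == "UNIT"
    ]


tagged = tag("The solid was dissolved in 20 mL ethanol")
quantities = parse_quantities(tagged)
```

Because each tagger is a plain function, a module can be swapped or reordered without touching the others, which is the main appeal of a modular design.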

Potential Biases

Biases in the training data could affect extraction accuracy.

Limitations

Extraction is imperfect, and its accuracy depends on the quality of the input text.

Participant Demographics

The study involved trained chemists with formal backgrounds in different areas of chemistry.

Statistical Information

P-Value

0.5

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1758-2946-3-17
