Automatic information extraction from biological literature
2001

Extracting Information Automatically from Biological Literature


Author Information

Author(s): Christian Blaschke, Robert Hoffmann, Juan Carlos Oliveros, Alfonso Valencia

Primary Institution: Protein Design Group, CNB-CSIC, Madrid, Spain

Conclusion

The study reviews methods for automatically extracting information from biological literature, with the aim of supporting the interpretation of genomics and proteomics data.

Supporting Evidence

  • The study highlights the need for linking biological databases with literature information.
  • Three main types of systems for information extraction are discussed: statistical methods, computational linguistics methods, and frame-based approaches.
  • Geisha and Suiseki are two systems evaluated for their effectiveness in extracting biological information.

Takeaway

Scientists are trying to find ways to automatically pull useful information from a lot of biology papers to help understand genes and proteins better.

Methodology

The study reviews statistical methods, computational linguistics methods, and frame-based approaches for extracting information from biological texts.
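To make the frame-based idea concrete, the following is a minimal sketch and not the implementation of any system discussed in the study. It assumes a frame of the form "[protein A] ... [interaction verb] ... [protein B]" with up to five filler words in each gap. The protein list, verb list, function names, and example sentence are all hypothetical choices made for illustration.

```python
import re

# Hypothetical name and verb lists; a real system would rely on much larger
# dictionaries or on a dedicated protein-name recognition step.
PROTEINS = ["Cdc2", "cyclin B", "Wee1"]
VERBS = ["binds", "phosphorylates", "activates", "inhibits"]


def build_frame(proteins, verbs):
    """Compile one frame: [protein A] <0-5 words> [verb] <0-5 words> [protein B]."""
    # Longest names first so multi-word names are tried before shorter ones.
    name = "|".join(re.escape(p) for p in sorted(proteins, key=len, reverse=True))
    verb = "|".join(re.escape(v) for v in verbs)
    filler = r"(?:\s+\w+){0,5}?"  # lazily skip up to five filler words
    return re.compile(
        rf"\b(?P<a>{name})\b{filler}\s+(?P<verb>{verb})\b{filler}\s+(?P<b>{name})\b"
    )


FRAME = build_frame(PROTEINS, VERBS)


def extract_interactions(sentence):
    """Return (protein A, verb, protein B) triples matched by the frame."""
    return [(m["a"], m["verb"], m["b"]) for m in FRAME.finditer(sentence)]


if __name__ == "__main__":
    example = "Wee1 directly phosphorylates Cdc2 in fission yeast."
    print(extract_interactions(example))
    # [('Wee1', 'phosphorylates', 'Cdc2')]
```

A single rigid frame like this is precise but misses passive constructions, negation, and sentences that span more than one clause, which is one reason the other two families of methods are considered alongside it.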

Potential Biases

Evaluating a system against a set of known interactions risks bias, because some of those interactions may not be described in the literature the system processes.
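The effect of this bias on a recall figure can be sketched with a toy calculation. The interaction labels, set names, and numbers below are invented purely for illustration and do not come from the study; the sketch assumes interactions are undirected pairs.

```python
def pair(a, b):
    """Represent an undirected interaction as an order-independent pair."""
    return frozenset((a, b))


def recall(extracted, reference):
    """Fraction of reference interactions that the system recovered."""
    reference = set(reference)
    if not reference:
        return 0.0
    return len(set(extracted) & reference) / len(reference)


# Hypothetical reference set of known interactions.
reference = {pair("A", "B"), pair("A", "C"), pair("B", "D"), pair("C", "D")}

# Hypothetical subset that is actually described in the analysed texts.
mentioned = {pair("A", "B"), pair("C", "D")}

# Hypothetical system output: it finds everything the texts contain.
extracted = {pair("B", "A"), pair("C", "D")}

if __name__ == "__main__":
    print(recall(extracted, reference))              # 0.5
    print(recall(extracted, reference & mentioned))  # 1.0
```

In this toy case the system recovers every interaction the texts contain, yet scores only 0.5 against the full reference set, so the lower figure reflects gaps in the literature rather than a failure of extraction.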

Limitations

There is no guarantee that computational linguistics methods can be adapted successfully to molecular biology texts.

Digital Object Identifier (DOI)

10.1002/cfg.102
