Automating curation using a natural language processing pipeline
2008

Automating Curation with Natural Language Processing

Publication. Evidence: moderate

Author Information

Author(s): Beatrice Alex, Claire Grover, Barry Haddow, Mijail Kabadjov, Ewan Klein, Michael Matthews, Richard Tobin, Xinglong Wang

Primary Institution: School of Informatics, University of Edinburgh

Hypothesis

Can a natural language processing pipeline effectively assist in curating biomedical literature?

Conclusion

The developed technologies can be adapted for various tasks in biomedical curation, although some complex tasks remain challenging.

Supporting Evidence

  • The system performed well on gene mention tasks with minimal development effort.
  • The pipeline was adapted to recognize a wider range of named entities.
  • Performance was high on individual tasks but weaker on tasks that combine several components.

Takeaway

This study shows how computers can help scientists organize and understand large numbers of research papers quickly, although some tasks are still too hard to automate.

Methodology

The study used a natural language processing pipeline to extract named entities and relations from biomedical texts.
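
As a rough illustration of what an entity-and-relation extraction pipeline looks like, here is a minimal Python sketch. It is not the authors' system: the lexicon, the identifiers (`PROT:0001`, `PROT:0002`), the function names, and the co-occurrence heuristic are all assumptions made for this example, standing in for the trained components a real pipeline would use.

```python
import re
from itertools import combinations

# Hypothetical toy lexicon: surface forms mapped to made-up identifiers.
# A real curation pipeline would use trained taggers and curated databases.
LEXICON = {
    "p53": "PROT:0001",
    "tp53": "PROT:0001",
    "mdm2": "PROT:0002",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def find_mentions(sentence):
    """Stage 1 (entity recognition): return tokens found in the lexicon."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    return [tok for tok in tokens if tok.lower() in LEXICON]


def normalize(mention):
    """Stage 2 (normalization): map a surface form to its identifier."""
    return LEXICON[mention.lower()]


def extract_pairs(text):
    """Stage 3 (relation candidates): pair distinct proteins per sentence."""
    pairs = set()
    for sentence in split_sentences(text):
        ids = sorted({normalize(m) for m in find_mentions(sentence)})
        pairs.update(combinations(ids, 2))
    return pairs


if __name__ == "__main__":
    sample = "MDM2 binds p53. TP53 is a tumour suppressor."
    print(extract_pairs(sample))
```

Each stage consumes the output of the one before it, so a mention missed in stage 1 or mapped to the wrong identifier in stage 2 cannot be recovered in stage 3.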

Limitations

The system struggles with tasks that require combining multiple components, such as detecting and normalizing interacting protein pairs.

Digital Object Identifier (DOI)

10.1186/gb-2008-9-s2-s10
