Accelerating the annotation of sparse named entities by dynamic sentence selection
2008

Improving Named Entity Annotation with Less Human Effort

Sample size: 14041
Evidence: high

Author Information

Author(s): Tsuruoka Yoshimasa, Tsujii Jun'ichi, Ananiadou Sophia

Primary Institution: The University of Manchester

Hypothesis

Can a new framework reduce the human effort required for named entity annotation in sparse datasets?

Conclusion

The proposed framework significantly reduces the number of sentences needing human annotation while maintaining high coverage of named entities.

Supporting Evidence

  • The framework reduces the number of sentences that human annotators need to examine.
  • The cost reduction can be drastic when the target named entities are sparse.
  • The framework yields named entity annotations that are free of sampling bias.

Takeaway

This study shows a way to make it easier for people to label important names in texts by using smart computer help, so they don't have to read everything.

Methodology

The framework uses an iterative process between a human annotator and a probabilistic named entity tagger to select sentences likely to contain target named entities.
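The iterative process described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the token-frequency "tagger", the prior value, the fixed batch size and round count, and the names `train_tagger`, `sentence_prob`, `select_and_annotate`, and `simulated_annotator` are all assumptions made for the example. A real system would use a trained probabilistic sequence tagger and a human annotator in place of the stand-ins here.

```python
from collections import Counter
from math import prod

PRIOR = 0.05  # assumed prior probability that an unseen token is part of an entity


def train_tagger(annotated, prior=PRIOR):
    """Toy stand-in for a probabilistic tagger.

    `annotated` is a list of (tokens, entity_tokens) pairs. Returns a function
    that estimates the probability that a given token belongs to an entity.
    """
    entity_counts, total_counts = Counter(), Counter()
    for tokens, entity_tokens in annotated:
        for tok in tokens:
            total_counts[tok] += 1
            if tok in entity_tokens:
                entity_counts[tok] += 1

    def token_prob(tok):
        # Smoothed relative frequency; unseen tokens fall back to the prior.
        return (entity_counts[tok] + prior) / (total_counts[tok] + 1)

    return token_prob


def sentence_prob(tokens, token_prob):
    """Probability that the sentence contains at least one entity token,
    treating tokens as independent (a simplifying assumption)."""
    return 1 - prod(1 - token_prob(t) for t in tokens)


def select_and_annotate(pool, seed, annotator, batch_size=2, rounds=3):
    """Alternate between retraining the tagger and asking the annotator to
    label the sentences the tagger considers most likely to hold entities."""
    annotated = list(seed)
    remaining = list(pool)
    for _ in range(rounds):
        if not remaining:
            break
        token_prob = train_tagger(annotated)
        # Stable sort: ties keep their original order in the pool.
        remaining.sort(key=lambda s: sentence_prob(s, token_prob), reverse=True)
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        annotated.extend((s, annotator(s)) for s in batch)
    return annotated, remaining


# Toy usage with a simulated annotator that knows the gold entities.
GOLD = {"p53", "BRCA1"}


def simulated_annotator(tokens):
    return {t for t in tokens if t in GOLD}


seed = [(["p53", "binds", "DNA"], {"p53"})]
pool = [
    ["the", "weather", "is", "nice"],
    ["p53", "is", "mutated"],
    ["we", "thank", "the", "reviewers"],
    ["mutated", "BRCA1", "and", "p53"],
]
annotated, remaining = select_and_annotate(
    pool, seed, simulated_annotator, batch_size=2, rounds=1
)
```

In this toy run the two sentences mentioning a previously seen entity are sent to the annotator first, and the two sentences without entities stay in `remaining` unexamined. The fixed number of rounds is a simplification; a stopping rule based on the estimated coverage of the target entities would be closer to the goal stated in the conclusion.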

Potential Biases

The framework aims to avoid sampling bias present in traditional active learning approaches.

Limitations

The framework is less effective when target named entities are not sparse, as the cost savings are limited by the proportion of relevant sentences.
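A back-of-the-envelope calculation makes this limit concrete. The sketch below assumes an idealized selector that never picks an entity-free sentence and assumes the entities are spread evenly over the sentences that contain them; neither assumption comes from the study, and the function name `min_fraction_examined` is hypothetical. Under those assumptions, the fraction of the corpus that must be examined cannot drop below the proportion of entity-bearing sentences multiplied by the target coverage.

```python
def min_fraction_examined(entity_sentence_ratio, target_coverage):
    """Lower bound on the fraction of the corpus an annotator must examine.

    entity_sentence_ratio: fraction of sentences containing a target entity.
    target_coverage: fraction of all target entities that must be annotated.
    Assumes a perfect selector and entities spread evenly across the
    entity-bearing sentences (illustrative assumptions only).
    """
    for value in (entity_sentence_ratio, target_coverage):
        if not 0.0 <= value <= 1.0:
            raise ValueError("both arguments must lie in [0, 1]")
    return entity_sentence_ratio * target_coverage


# Sparse entities: 5% of sentences are relevant, 90% coverage wanted.
sparse = min_fraction_examined(0.05, 0.9)

# Dense entities: 60% of sentences are relevant, same coverage target.
dense = min_fraction_examined(0.6, 0.9)
```

With sparse entities the bound is roughly 4.5% of the corpus, so most sentences can be skipped. With dense entities it is roughly 54%, so even a perfect selector saves less than half of the annotation effort.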

Digital Object Identifier (DOI)

10.1186/1471-2105-9-S11-S8
