Improving Named Entity Annotation with Less Human Effort
Author Information
Author(s): Tsuruoka Yoshimasa, Tsujii Jun'ichi, Ananiadou Sophia
Primary Institution: The University of Manchester
Hypothesis
Can a new annotation framework reduce the human effort required for named entity annotation when the target named entities are sparse in the corpus?
Conclusion
The proposed framework significantly reduces the number of sentences needing human annotation while maintaining high coverage of named entities.
Supporting Evidence
- The framework can reduce the number of sentences needing to be examined by human annotators.
- Cost reduction can be drastic when target named entities are sparse.
- The framework yields named entity annotations free of the sampling bias introduced by traditional active learning.
Takeaway
This study shows how to label named entities in text with less reading: a statistical tagger pre-selects the sentences worth examining, so human annotators do not have to read the entire corpus.
Methodology
The framework iterates between a human annotator and a probabilistic named entity tagger: the tagger selects the sentences most likely to contain target named entities, and only those are passed to the annotator for labeling.
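The selection step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy heuristic standing in for the trained probabilistic tagger is hypothetical, and the retraining of the tagger between rounds (part of the iterative process described above) is omitted for brevity.

```python
# Toy corpus: a mix of sentences with and without gene-like entity mentions.
CORPUS = [
    "BRCA1 is linked to breast cancer .",
    "the meeting ran long yesterday .",
    "expression of TP53 was reduced .",
    "he walked to the station .",
    "IL2 activates T cells .",
    "the report was filed on time .",
]

def tagger_probability(sentence):
    """Stand-in for a probabilistic NE tagger: an estimated probability
    that the sentence mentions a target entity. The heuristic here
    (a token with two or more uppercase letters) is purely illustrative."""
    caps = any(sum(c.isupper() for c in tok) >= 2 for tok in sentence.split())
    return 0.9 if caps else 0.05

def select_for_annotation(corpus, threshold=0.5):
    """Hand the annotator only sentences the tagger considers likely to
    contain a target entity; skip the rest entirely."""
    examined, skipped = [], []
    # Rank by the tagger's confidence, highest first.
    for sent in sorted(corpus, key=tagger_probability, reverse=True):
        if tagger_probability(sent) >= threshold:
            examined.append(sent)   # shown to the human annotator
        else:
            skipped.append(sent)    # never examined by a human
    return examined, skipped

examined, skipped = select_for_annotation(CORPUS)
print(len(examined), "of", len(CORPUS), "sentences examined")
# → 3 of 6 sentences examined
```

The sparser the target entities, the larger the skipped fraction, which is where the cost reduction comes from.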
Potential Biases
The framework aims to avoid sampling bias present in traditional active learning approaches.
Limitations
The framework is less effective when target named entities are not sparse, because the cost savings are bounded by the proportion of irrelevant sentences: for example, if 40% of sentences contain a target entity, no more than 60% of the examination cost can be avoided.