Kategorie

A Bidirectional Iterative Algorithm for Nested Named Entity Recognition

Sławomir Dadas, Jarosław Protasiewicz

2020 IEEE Access, T. 8, s. 135091 - 135102

Nested named entity recognition (NER) is a special case of structured prediction in which annotated sequences can be contained inside each other. It is a challenging and significant problem in natural language processing. In this paper, we propose a novel framework for nested named entity recognition tasks. Our approach is based on a deep learning model which can be called in an iterative way, expanding the set of predicted entity mentions with each subsequent iteration. The proposed framework combines two such models trained to identify named entities in different directions: from general to specific ( outside-in ), and from specific to general ( inside-out ). The predictions of both models are then aggregated by a selection policy. We propose and evaluate several selection policies which can be used with our algorithm. Our method does not impose any restrictions on the length of entity mentions, number of entity classes, depth, or structure of the predicted output. The framework has been validated experimentally on four well-known nested named entity recognition datasets: GENIA, NNE, PolEval, and GermEval. The datasets differ in terms of domain (biomedical, news, mixed), language (English, Polish, German), and the structure of nesting (simple, complex). Through extensive tests, we prove that the approach we have proposed outperforms existing methods for nested named entity recognition.

http://dx.doi.org/10.1109/access.2020.3011598