Combining neural and knowledge-based approaches to Named Entity Recognition in Polish

Sławomir Dadas

2019 W: Artificial Intelligence and Soft Computing 18th International Conference, ICAISC 2019, Zakopane, Poland, June 16–20, 2019, Proceedings, Part I / Leszek Rutkowski, Rafał Scherer, Marcin Korytkowski, Witold Pedrycz, Ryszard Tadeusiewicz, Jacek M. Zurada; Cham: Springer, s. 39-50

18th International Conference on Artificial Intelligence and Soft Computing. Zakopane, 2019-06-16 - 2019-06-20

Named entity recognition (NER) is one of the tasks in natural language processing that can greatly benefit from the use of external knowledge sources. We propose a named entity recognition framework composed of knowledge-based feature extractors and a deep learning model including contextual word embeddings, long short-term memory (LSTM) layers and conditional random fields (CRF) inference layer. We use an entity linking module to integrate our system with Wikipedia. The combination of effective neural architecture and external resources allows us to obtain state-of-the-art results on recognition of Polish proper names. We evaluate our model on data from PolEval 2018 NER challenge on which it outperforms other methods, reducing the error rate by 22.4% compared to the winning solution. Our work shows that combining neural NER model and entity linking model with a knowledge base is more effective in recognizing named entities than using NER model alone.