PolEval 2019: Entity Linking

Szymon Roziewski, Marek Kozłowski, Łukasz Podlodowski

2019 In: Human Interaction and Emerging Technologies: Proceedings of the 1st International Conference on Human Interaction and Emerging Technologies (IHIET 2019), August 22-24, 2019, Nice, France / Tareq Ahram, Serge Colson, Redha Taiar; Cham: Springer, pp. 310-316

This work presents the results of our participation in the Entity Linking task at PolEval 2019.
The goal of the task was to link entity mentions in Polish Wikipedia texts to their meaning in
a knowledge base. The data consist of texts from Polish Wikipedia, given in a structured form.
Each data item carries information about the word itself and its exact part of speech and, for
text mentions, additionally the Wikipedia link and the Wikidata id. We used a hybrid approach
to solve this task. The main idea was to filter out the entities suitable for simple mapping
and to direct the rest, which were “hiding” behind a different context or textual form, to
another model. After mention candidates were found, we filtered them semantically with
respect to the entity context. This filtering was performed using a word2vec model trained
on the training set.
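
The two-stage idea described above can be illustrated with a minimal sketch. It is not the system submitted to PolEval: the candidate dictionary, the entity ids (`E_city`, `E_river`, `E_town`), the keyword lists, and the three-dimensional vectors are hypothetical stand-ins for a real knowledge base and a trained word2vec model. Mentions with a single candidate are mapped directly; ambiguous ones are ranked by cosine similarity between the averaged context vector and the averaged vector of each candidate's keywords.

```python
import math

# Toy embeddings standing in for a trained word2vec model (hypothetical values).
VECTORS = {
    "rzeka": [0.9, 0.1, 0.0],
    "płynie": [0.8, 0.2, 0.1],
    "miasto": [0.1, 0.9, 0.1],
    "mieszkańcy": [0.0, 0.8, 0.2],
}

# Hypothetical knowledge base: surface form -> {entity id: descriptive keywords}.
CANDIDATES = {
    "Warszawa": {"E_city": ["miasto"]},
    "Warta": {"E_river": ["rzeka"], "E_town": ["miasto"]},
}


def mean_vector(words):
    """Average the vectors of the known words; return None if none are known."""
    vecs = [VECTORS[w] for w in words if w in VECTORS]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def link(mention, context):
    """Return an entity id for the mention, or None if it cannot be resolved."""
    candidates = CANDIDATES.get(mention, {})
    if not candidates:
        return None
    # Stage 1: simple mapping when the surface form is unambiguous.
    if len(candidates) == 1:
        return next(iter(candidates))
    # Stage 2: semantic filtering of the candidates against the context.
    ctx = mean_vector(context)
    if ctx is None:
        return None

    def score(entity_id):
        vec = mean_vector(candidates[entity_id])
        return cosine(ctx, vec) if vec is not None else -1.0

    return max(candidates, key=score)
```

For example, `link("Warta", ["rzeka", "płynie"])` picks the river reading, while the same mention in a context of `["miasto", "mieszkańcy"]` resolves to the town. In a real setting the toy `VECTORS` table would be replaced by vectors looked up in a trained embedding model.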