English-Polish and Polish-Russian Translation Systems

Szymon Roziewski, Marek Kozłowski, Łukasz Podlodowski

2019 In: Proceedings of the PolEval 2019 Workshop / Maciej Ogrodniczuk, Łukasz Kobyliński; Warszawa: Instytut Podstaw Informatyki Polskiej Akademii Nauk, pp. 63-72

This work presents the results of our participation in the Entity Linking task at PolEval 2019. The goal of the task was to identify the meaning of entity mentions in Polish Wikipedia texts by linking them to entries in a knowledge base. The data consist of texts from Polish Wikipedia, given in a structured form. Each data item carries information about the word itself, its exact part of speech and, for entity mentions, the corresponding Wikipedia link and Wikidata ID. We used a hybrid approach to solve this task. The main idea was to first separate out the entities that are suitable for simple mapping; the remaining ones, which were "hidden" behind a different context or textual form, were directed to another model. Once mention candidates had been found, we filtered them semantically with respect to the entity context. This filtering was performed with a word2vec model trained on the training set.
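The two-stage idea described above (simple mapping first, then context-based semantic filtering of the remaining candidates) can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate dictionary, the entity identifiers, the description words and the 3-dimensional vectors are all hypothetical toy data standing in for a real knowledge base and a trained word2vec model, and cosine similarity between averaged vectors is an assumed scoring rule.

```python
import math

# Hypothetical toy data: lower-cased surface form -> candidate entity ids.
CANDIDATES = {
    "warszawa": ["Q_city"],
    "mars": ["Q_planet", "Q_god"],
}

# Hypothetical 3-dimensional vectors standing in for a trained word2vec model.
VECTORS = {
    "planeta": [0.9, 0.1, 0.0],
    "orbita": [0.8, 0.2, 0.1],
    "bóg": [0.0, 0.9, 0.2],
    "mitologia": [0.1, 0.8, 0.3],
}

# Hypothetical words describing each ambiguous candidate entity.
ENTITY_WORDS = {
    "Q_planet": ["planeta", "orbita"],
    "Q_god": ["bóg", "mitologia"],
}


def mean_vector(words):
    """Average the vectors of all known words; None if none are known."""
    vecs = [VECTORS[w] for w in words if w in VECTORS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link(mention, context_words):
    """Return an entity id for the mention, or None if it is unknown."""
    candidates = CANDIDATES.get(mention.lower(), [])
    if not candidates:
        return None
    # Stage 1: unambiguous mentions are resolved by simple mapping.
    if len(candidates) == 1:
        return candidates[0]
    # Stage 2: rank the remaining candidates by similarity to the context.
    context_vec = mean_vector(context_words)
    if context_vec is None:
        return candidates[0]  # assumed fallback: first listed candidate

    def score(candidate):
        entity_vec = mean_vector(ENTITY_WORDS.get(candidate, []))
        return cosine(context_vec, entity_vec) if entity_vec else 0.0

    return max(candidates, key=score)
```

For example, `link("Mars", ["planeta", "orbita"])` selects the planet candidate, while `link("Warszawa", [])` is resolved in the first stage without consulting any vectors.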