Named Entity Disambiguation over Texts Written in the Portuguese or Spanish Languages

This article addresses the problem of disambiguating named entities mentioned in text documents, that is, linking each mention to the corresponding entry in a knowledge base such as Wikipedia. The proposed approach uses supervised learning to rank the candidate knowledge base entries for each entity mentioned in a text, and then to classify the top-ranked entry as either the correct disambiguation or not.

We present results on Portuguese and Spanish texts for a wide range of models and configuration options. Our experiments attest to the effectiveness of supervised learning methods for this task, showing that out-of-the-box algorithms and relatively simple features can achieve high accuracy.
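
To make the two-step approach concrete, the following is a minimal Python sketch of ranking candidate entries and then accepting or rejecting the top-ranked one. It is an illustration under stated assumptions, not the models used in the article: the three features, the weights in `RANK_WEIGHTS`, and the value of `ACCEPT_THRESHOLD` are all hypothetical, and the hand-set linear score and threshold merely stand in for a learned ranking model and a learned binary classifier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """A candidate knowledge base entry for one entity mention."""
    entry: str              # title of the knowledge base entry
    name_similarity: float  # hypothetical feature in [0, 1]
    context_overlap: float  # hypothetical feature in [0, 1]
    popularity: float       # hypothetical feature in [0, 1]


# Hypothetical values. In a supervised setting these would be learned
# from annotated mentions rather than set by hand.
RANK_WEIGHTS = (0.5, 0.3, 0.2)
ACCEPT_THRESHOLD = 0.6


def score(candidate: Candidate) -> float:
    """Linear score standing in for a learned ranking model."""
    w_name, w_context, w_popularity = RANK_WEIGHTS
    return (w_name * candidate.name_similarity
            + w_context * candidate.context_overlap
            + w_popularity * candidate.popularity)


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Step 1: sort candidates from most to least likely."""
    return sorted(candidates, key=score, reverse=True)


def disambiguate(candidates: List[Candidate]) -> Optional[str]:
    """Step 2: accept the top-ranked entry only if it looks correct.

    Returns the entry title, or None when the top-ranked candidate is
    judged not to be the correct disambiguation.
    """
    if not candidates:
        return None
    top = rank_candidates(candidates)[0]
    # The threshold test stands in for a learned binary classifier.
    return top.entry if score(top) >= ACCEPT_THRESHOLD else None
```

Calling `disambiguate` with the candidates for one mention returns either an entry title or `None`, so mentions with no suitable knowledge base entry are not forced onto the best of a bad set of candidates.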
