Description

Title Network Metrics for Assessing the Quality of Entity Links Between Multiple Datasets
Abstract Linking entities between datasets is a crucial step in data integration in general, and in the use of multiple datasets on the semantic web in particular. A rich literature exists on different approaches to the entity linking problem, and a fair number of tools are available for practical use. However, much less work has been done on how to assess the quality of such entity links once they have been generated by any of these tools. Evaluation methods for link quality are typically limited to comparison with a ground truth (which is often not at one's disposal), manual work (which is cumbersome and prone to error), or crowdsourcing (which is not always feasible, especially if background information is required). Furthermore, the problem of link evaluation is greatly exacerbated for links between more than two datasets, because the number of possible links grows rapidly with the number of datasets. In this paper we propose a method to estimate the quality of such entity links between multiple datasets. We exploit the fact that the links between entities from multiple datasets form a network, and we show how simple metrics on this network of entity links can reliably predict the quality of these links. We verify our results in a large experimental study using six datasets from the domain of science and innovation studies.

Other presentations by Al Idrissou

Date Title
03 October 2016
13 March 2017
09 April 2018 Network Metrics for Assessing the Quality of Entity Links Between Multiple Datasets