Description

Title Named Entity Recognition for Cultural Heritage
Abstract Named entities are the basic building blocks for any text understanding task. In recent years, data-driven, or statistical, named entity recognition (NER) approaches have risen to the challenge of automatically recognising named entities in text. However, most NER work has focused on the newswire domain, leaving many domains still in need of suitable NER tools to unlock the information in their texts. Domain adaptation is therefore an important task, but it also poses some big challenges to current NER approaches. I will present a study into NER for the cultural heritage domain in the form of an in-depth comparison of two state-of-the-art statistical approaches. One approach employs advanced features describing domain knowledge; the second achieves domain adaptation through retraining with domain data. Although the results with the domain-informed approach are promising, our comparison shows that using annotated training data is currently still the most promising form of domain adaptation.

Other presentations by Marieke van Erp

Date Title
19 October 2009 Accessing Natural History
29 November 2010 Historical Event Extraction
23 January 2012 The Essence of "Dutchness"
26 November 2012 Named Entity Recognition for Cultural Heritage