Abstract |
This talk will be about the work I did in the Mining for Information in Texts from the Cultural Heritage (MITCH) project prior to joining the VU. The MITCH project was a collaboration between Tilburg University and the Dutch National Museum for Natural History, Naturalis. The aim of MITCH was to improve the accessibility of the textual data that accompanies each specimen in the Naturalis collection. To this end, (semi-)automatic approaches were developed to turn analogue data sources from the natural history domain into a high-quality, enriched and easily accessible digital resource.
I will highlight four key aspects of my work in MITCH, namely automatic database population, data cleaning, data structuring and data retrieval. |