WAI meetings

Description

Title

Scalable and High Quality Relation Extraction with CrowdTruth

Abstract

The lack of annotated datasets for training and benchmarking is one of the main challenges of Clinical Natural Language Processing. In addition, current methods for collecting annotations attempt to minimize disagreement between annotators, and therefore fail to model the ambiguity inherent in language. In this presentation, I will discuss the CrowdTruth method for collecting medical ground truth through crowdsourcing, based on the observation that disagreement between annotators can be used to capture ambiguity in text. I will present the results of an experiment in training a classification model for relation extraction. Our findings show that the crowd can perform at least as well as medical experts when training on two difficult relations (treats and cause), and that it outperforms automated relation extraction with distant supervision. Finally, I will discuss preliminary work on extending this experiment to open-domain relation extraction.

Other presentations by Anca Dumitrache

Date	Title
26 January 2015	Crowdsourcing Ground Truth for Relation Extraction in the Medical Domain
29 February 2016	Scalable and High Quality Relation Extraction with CrowdTruth
31 October 2016	Crowdsourcing for Distant Supervision with Active Learning
22 May 2017	Crowdsourcing Ambiguity-Aware Ground Truth - a Cross-Task Evaluation
05 March 2018	Capturing Ambiguity in Crowdsourcing Frame Disambiguation
