Description
Title | Provenance In and Outside the Database |
Abstract | Domains such as drug discovery, web science, and policy studies increasingly rely on the combination of complex analysis pipelines with integrated data sources to come to conclusions. A key question that then arises is: what are these conclusions based upon (i.e., what is their provenance)? In this talk, I describe recent work that attempts to combine provenance within databases with the data integration and analytics pipelines that feed them. In particular, how can we mix the concepts of dataset description, provenance polynomials, and Web-based provenance models? I discuss this with respect to a large-scale drug discovery platform, Open PHACTS (http://www.openphacts.org), that combines tens of databases with billions of facts. |
Other presentations by Paul Groth
Date | Title |
---|---|
12 October 2009 | I want to be a Data DJ! |
18 January 2010 | Is trust just machine learning? - a question from work on content based trust in electronic contracts |
18 October 2010 | Data DJ |
20 June 2011 | Open PHACTS: Taking on pharmaceutical data with the kitchen sink |
15 October 2012 | Overview and Status of the W3C Provenance Recommendations |
04 March 2013 | Oh, Yeah? Abductive reasoning and network representations for reconstructing data provenance |
04 November 2013 | Provenance In and Outside the Database |
31 March 2014 | The Open PHACTS project after 3 years |
17 November 2014 | Can provenance actually help speed query performance? |
16 November 2015 | Science as a service: From the lab to the cafe |