Description

Title A Growing Resource of Heterogeneous Provenance-Centric Linked Data
Abstract Nanopublications are a Linked Data format that has received considerable uptake in the last few years. In contrast to common Linked Data publishing practice, nanopublications work at the granular level of atomic information snippets and provide a consistent container format to attach provenance and metadata at this atomic level, relying on RDF with named graphs. While the nanopublication format is domain-independent, the datasets that have become available in this format are mostly from Life Science domains, including data about diseases, genes, proteins, drugs, biological pathways, and biotic interactions. More than 10 million such nanopublications have been published, and together they form a valuable resource for studies at the domain level in these Life Science fields as well as at the more technical levels of provenance modeling, named RDF graphs, and heterogeneous Linked Data. We provide here an overview of this combined nanopublication dataset, show the results of some general analyses, and describe how it can be accessed and queried.