Description

Title Context-Aware Constraint Mining for Knowledge Graphs
Abstract Automatic conversion from existing databases has replaced manual engineering as the norm for knowledge graph creation. This shift comes at the price of data quality, which is often reduced either by the copying of existing artefacts or by the introduction of new ones caused by improper conversion. Detecting these artefacts automatically is difficult, and generally follows a top-down approach in which data points are validated against a provided set of constraints (e.g., business rules). In this research, we introduce a bottom-up approach that generates constraints directly from the data themselves. Specifically tailored to knowledge graphs, these constraints take contexts (i.e., subgraphs) into account, exploit common semantics (RDF/RDFS), and incorporate prior knowledge (e.g., schemas). Once generated, the constraints can be used to check any knowledge graph with an arbitrary SHACL validator. Experiments were conducted in the asset management domain and involved the generation of constraints and the validation of data using these constraints. The results were evaluated by 1) comparing them with a gold standard, and 2) assessing the method's usefulness within a focus group.