Abstract
In modern machine learning, raw data is the preferred input for our models. Where
a decade ago data scientists were still engineering features, manually picking
out the details we thought salient, they now prefer the data in its raw form.
As long as we can assume that all relevant and irrelevant information is present
in the input data, we can design deep models that build up intermediate
representations to sift out relevant features. However, these models are often
domain-specific and tailored to the task at hand, and are therefore unsuited for
learning on heterogeneous knowledge: information of different types and from
different domains. If we can develop methods that operate on this form of
knowledge, we can dispense with a great deal more ad-hoc feature engineering and
train deep models end-to-end in many more domains. To accomplish this, we first
need a data model capable of expressing heterogeneous knowledge naturally in
various domains, in as usable a form as possible, and satisfying as many use
cases as possible. We argue that the knowledge graph is a suitable candidate for
this data model. We further discuss some of the promises and challenges of this
approach, and how we are currently broadening our efforts to multi-modal knowledge
graphs.