Data&Musée

Exploring French cultural heritage data

Remarks on the temporal dimension in knowledge graphs

I am talking here about knowledge bases built on the RDF representation. A question that often comes up is how to express a temporal qualification that applies to a fact (a fact being expressed by an RDF triple).

Take, for example, the following fact, stated informally:

(Pablo Picasso, married to, Olga Khokhlova)

This fact is obviously true only from a certain date. Picasso was not married at birth. So we have to introduce a 'fact' about the previous fact:

((Pablo Picasso, married to, Olga Khokhlova), from, 1918)

This can be expressed with the reification technique. It can also be expressed with the extensions introduced by RDF* (RDF-star), which can be queried with SPARQL*.
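To make the two options concrete, here is a minimal sketch in plain Python, with triples represented as tuples and no RDF library involved. The `rdf:` terms are the standard reification vocabulary; the `http://example.org/` namespace, the `marriedTo` and `from` property names and the `statement1` identifier are assumptions made up for the illustration.

```python
# Plain-Python sketch of standard RDF reification; triples are simple tuples.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def reify(statement_id, triple):
    """Return the four triples that describe `triple` as an rdf:Statement."""
    subject, predicate, obj = triple
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]


marriage = (EX + "Pablo_Picasso", EX + "marriedTo", EX + "Olga_Khokhlova")
stmt = EX + "statement1"  # hypothetical identifier for the reified statement

graph = reify(stmt, marriage)
# The temporal qualification is attached to the statement node, not to Picasso.
graph.append((stmt, EX + "from", "1918"))

# RDF-star style: the triple itself stands in subject position.
quoted = (marriage, EX + "from", "1918")
```

Reification costs four extra triples per qualified fact, and those four triples do not by themselves assert the original triple. The RDF-star form keeps the fact and its qualification together in a single nested structure.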

There are also other ways to introduce a temporal dimension in triples.

For example, the Wikidata class 'country' has a subclass 'historical country'. This makes it possible to add temporal precision without using reification. For example, instead of using

(USSR, is an instance of, country)

which can be considered as currently false or at least imprecise, we will introduce

(USSR, is an instance of, historical country)

While working on the article "Rule Mining for Semantifying Wikilinks" with Luis Galárraga, we generated triples with a rule mining method applied to a set of existing triples. The rate of true triples generated was good. But a closer look at the erroneous triples revealed recurring series of errors that followed the same pattern. For example:

(soccer player xxx, is a member of, team yyy)

which was false at the time it was produced. The whole set of triples following this pattern could instead be replaced by triples using a property whose temporal dimension extends the scope of the initial property:

(soccer player xxx, is or was a member of, team yyy)

Triples that were true with the initial property remain true, while many triples that were false with the initial property become true.
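This widening effect can be checked on a toy example. The sketch below is only an illustration: the players, teams, years and the `Membership` record are invented, and the two predicates are one possible reading of 'is a member of' and 'is or was a member of'.

```python
from typing import NamedTuple, Optional


class Membership(NamedTuple):
    player: str
    team: str
    start: int           # first year of membership
    end: Optional[int]   # last year of membership, or None if still a member


def is_member(m: Membership, year: int) -> bool:
    """Narrow reading: the membership holds during `year`."""
    return m.start <= year and (m.end is None or year <= m.end)


def is_or_was_member(m: Membership, year: int) -> bool:
    """Extended reading: the membership started no later than `year`."""
    return m.start <= year


# Invented data: player_a has left team_x, player_b still plays for team_y.
facts = [
    Membership("player_a", "team_x", 2005, 2010),
    Membership("player_b", "team_y", 2012, None),
]

year = 2015
narrow = {(m.player, m.team) for m in facts if is_member(m, year)}
wide = {(m.player, m.team) for m in facts if is_or_was_member(m, year)}

# Every pair that is true under the narrow property is true under the wide one.
assert narrow <= wide
```

In this toy data the pair (player_a, team_x) is false in 2015 under the narrow property and true under the extended one, which is the kind of error observed in the generated triples.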

This suggests to me that properties like 'is a member of', which carry an implicit temporal dimension, should be systematically replaced by an extended property like 'is or was a member of' and, where possible, complemented by precise dates expressed through reification.

Author: Moissinac

Associate professor (maître de conférences) at Télécom Paris, in the Image, Data, Signal department, Multimedia group, Jean-Claude Moissinac has conducted research on advanced techniques for the production, transport, representation and use of multimedia documents. This work then evolved toward the semantic representation of data related to multimedia (media processing workflows, description of media adaptations, formal description of user interactions). Today, his work focuses on building knowledge graphs. Main current research topics: semantic representations of knowledge, construction of knowledge graphs, and machine learning techniques applied to these graphs.
