- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Wed, 29 Feb 2012 16:21:15 +0100
- To: RDF WG <public-rdf-wg@w3.org>
*Beware*: this is about design solutions using the dataset proposal as a
whole. It is not strictly related to the semantics. It explains
concretely how one could store things in a dataset, possibly entail new
things according to the dataset semantics of [2] and so on, such that
eventually it addresses the use case. So it contains a number of things
that applications should do to address the UCs, independently of the
truth values of triples or "named" graphs.
UC 1.3: Graph Changes Over Time
This use case is much more complicated to deal with than UC 1.5 and 5.2.
First, there are different notions of temporal change:
(1) it can refer to changes of conceptualisation, i.e., the graph
changed because we realised it wasn't modelling reality accurately. This
is related to versioning;
(2) changes because the truth of statements changed (e.g., Joe works
for XYZ in 2004, but he quits in 2005 and now works for ABC). This is
validity time.
Then, there is the problem of what kind of temporal information is
represented. Temporal information can be as simple as an xsd:dateTime
value. But it can be a time interval. It could be a recurring time
interval. It could be an arbitrary set (possibly infinite) of time
points. It could be a variable constrained by temporal relationships
like Allen's algebra. So I'll start with the simpler cases.
I'll give a design solution for RDF graphs that are valid at a single
time point (they can be valid at other time points under the open world
assumption, but the design specifies validity only at a finite set of
time points). Let us assume that we have a company's data about people's
employment. The dataset stores information like ":joe :worksfor
:company" or ":joe :leads :teamA" etc. These facts evolve over time.
Whenever a triple is obsolete (e.g., Joe is not a team leader any
longer), create a new "named" graph where the new information is
provided. The question is "what name should be used for the graph?"
There are different solutions:
1) mint a new IRI, distinct from all IRIs appearing inside the graphs
at each time point;
2) use a literal of type xsd:dateTime or xsd:dateTimeStamp.
Solution (2) is simpler but not in agreement with the definition of
SPARQL datasets, which imposes that graph "names" are IRIs. However, it
is clear and unambiguous what the graph "names" denote. Then, whatever
is true at time t1 does not influence what is true at time t2 since the
truth of the statement may have changed.
Solution (1) requires that additional information is given, as the graph
IRI is supposed to be opaque. Moreover, the semantics does not allow
anyone to assume that the IRI is denoting anything in particular.
So the idea would be to add some metainformation about the dataset,
which makes it clear how to understand it. However, even in the absence
of the metainformation, the inferences provided by the semantics of [2]
are in line with what to expect in a temporally scoped representation:
anything true in the graph labelled by X does not need to be true in the
graph labelled by Y.
In the absence of metainformation, a system that parses and reasons with
the dataset would not understand that statements are tied to a time
point, but it would at least not allow inferences in one graph to
influence knowledge in a different graph.
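As a rough illustration of that isolation, here is a minimal Python
sketch. It assumes a toy model where a dataset is just a mapping from
graph name to a set of triples, and it uses a made-up rule (":leads"
implies ":memberOf"); neither the model, the rule, nor the ":memberOf"
property comes from [2].

```python
# Hypothetical toy model: a dataset maps each graph name to a set of triples.
dataset = {
    ":g1": {(":joe", ":worksfor", ":company"), (":joe", ":leads", ":teamA")},
    ":g2": {(":joe", ":worksfor", ":company")},
}


def close_graph(triples):
    """Apply one illustrative rule inside a single graph:
    (?x :leads ?t) entails (?x :memberOf ?t)."""
    inferred = {(s, ":memberOf", o) for (s, p, o) in triples if p == ":leads"}
    return triples | inferred


def close_dataset(ds):
    """Close every graph separately; nothing crosses graph boundaries."""
    return {name: close_graph(triples) for name, triples in ds.items()}


closed = close_dataset(dataset)
```

Here the triple ":joe :memberOf :teamA" is derived in :g1 only; :g2 is
left untouched even though it talks about the same :joe.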
Metainformation could be provided as a separate file (together with voiD
annotations). We would need a vocabulary to say that the dataset is
built according to a certain IRI scheme, where each graph "name" denotes
the graph itself and is tied to a certain time point.
Something like:
<> a void:Dataset ;
   ex:semantics ex:GraphNamesDenoteGraph .
:g1 a rdf:Graph ;
    ex:validAt "2011-10-08T10:23:42"^^xsd:dateTime .
:g2 a rdf:Graph ;
    ex:validAt "...."^^xsd:dateTime .
and the dataset itself contains:
:g1 { :bob :worksfor :company1 }
:g2 { :bob :worksfor :company2 }
...
When a dataset processor meets the statement:
<> ex:semantics ex:GraphNamesDenoteGraph .
it would know that the following statements are meant to say something
about the graphs themselves, which can be stated as additional semantic
constraints.
This may be sufficient when one simply wants to query what is true at a
given time point, or just to have a kind of wayback machine for RDF. But
it's certainly not satisfying for a lot of use cases.
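To make the two kinds of lookup concrete, here is a minimal Python
sketch. It assumes the ex:validAt metadata has already been read into a
plain dictionary. The timestamp for :g2 is invented for the example, and
the "wayback" reading (take the latest graph stamped at or before t) is
an assumption that goes beyond what the design strictly states, which is
validity at the stamped time point only.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical metadata: graph name -> ex:validAt value.
# The :g2 timestamp is invented for illustration.
valid_at = {
    ":g1": datetime(2011, 10, 8, 10, 23, 42),
    ":g2": datetime(2012, 1, 15, 9, 0, 0),
}
dataset = {
    ":g1": {(":bob", ":worksfor", ":company1")},
    ":g2": {(":bob", ":worksfor", ":company2")},
}


def graphs_valid_at(t):
    """Strict reading: only graphs whose ex:validAt equals t."""
    return [g for g, ts in valid_at.items() if ts == t]


def wayback(t):
    """Wayback reading (an assumption): latest graph stamped at or before t."""
    ordered = sorted(valid_at.items(), key=lambda item: item[1])
    stamps = [ts for _, ts in ordered]
    i = bisect_right(stamps, t)  # number of graphs stamped at or before t
    return ordered[i - 1][0] if i else None


def triples_at(t):
    """Triples of the graph selected by the wayback reading, if any."""
    name = wayback(t)
    return dataset[name] if name is not None else set()
```

With these sample values, asking for a date between the two stamps
selects :g1, and asking for a date before the first stamp selects
nothing.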
More in future emails.
--
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 83 36
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Wednesday, 29 February 2012 15:21:45 UTC