- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Wed, 29 Feb 2012 16:21:15 +0100
- To: RDF WG <public-rdf-wg@w3.org>
*Beware*: this is about design solutions using the dataset proposal as a whole. It is not strictly related to the semantics. It explains concretely how one could store things in a dataset, possibly entail new things according to the dataset semantics of [2], and so on, such that eventually it addresses the use case. So it contains a number of things that applications should do to address the UCs, independently of the truth values of triples or "named" graphs.

UC 1.3: Graph Changes Over Time

This use case is much more complicated to deal with than UC 1.5 and 5.2.

First, there are different notions of temporal change:
(1) it can refer to changes of conceptualisation, i.e., the graph changed because we realised it wasn't modelling reality accurately. This is related to versioning;
(2) changes because the truth of statements changed (e.g., Joe works for XYZ in 2004, but he quits in 2005 and now works for ABC). This is validity time.

Then, there is the problem of what kind of temporal information is represented. Temporal information can be as simple as an xsd:dateTime value. But it can be a time interval. It could be a recurring time interval. It could be an arbitrary set (possibly infinite) of time points. It could be a variable constrained by temporal relationships like those of Allen's algebra.

So I'll start with the simpler cases. I'll give a design solution for RDF graphs that are valid at a single time point (under the open world assumption they can be valid at other time points too, but the design specifies validity only at a finite set of time points).

Let us assume that we have a company's data about people's employment. The dataset stores information like ":joe :worksfor :company" or ":joe :leads :teamA" etc. These facts evolve over time. Whenever a triple becomes obsolete (e.g., Joe is not a team leader any longer), create a new "named" graph where the new information is provided. The question is "what name should be used for the graph?"
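To make the storage step concrete before turning to that question, here is a minimal sketch in Python. It rests on assumptions that are not in the proposal: the dataset is modelled as a plain dictionary, each new graph is taken to be a full snapshot (the text above does not say whether unchanged triples are repeated), and the labels "snapshot-1" and "snapshot-2" as well as the helper add_snapshot are hypothetical placeholders that leave the naming question open.

```python
from typing import Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
# Hypothetical in-memory model: graph label -> triples stored in that graph.
Dataset = Dict[str, FrozenSet[Triple]]


def add_snapshot(dataset: Dataset, label: str, previous: str,
                 removed: FrozenSet[Triple], added: FrozenSet[Triple]) -> None:
    """Store a new graph derived from `previous`; `previous` is left untouched."""
    if label in dataset:
        raise ValueError(f"graph label already in use: {label}")
    # Assumption: the new graph is a full snapshot, not just the changed triples.
    dataset[label] = (dataset[previous] - removed) | added


dataset: Dataset = {
    "snapshot-1": frozenset({
        (":joe", ":worksfor", ":company"),
        (":joe", ":leads", ":teamA"),
    }),
}

# Joe stops leading team A: the old graph is kept, a new one records the change.
add_snapshot(
    dataset,
    label="snapshot-2",
    previous="snapshot-1",
    removed=frozenset({(":joe", ":leads", ":teamA")}),
    added=frozenset(),
)
```

The point of the sketch is only that obsolete triples are never deleted in place: the earlier graph keeps saying that Joe leads team A, and the later graph simply no longer contains that triple.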
There are different solutions:
1) mint a new IRI, distinct from all IRIs appearing inside the graphs at each time point;
2) use a literal of type xsd:dateTime or xsd:dateTimeStamp.

Solution (2) is simpler but not in agreement with the definition of SPARQL datasets, which requires that graph "names" be IRIs. However, it is clear and unambiguous what the graph "names" denote. Then, whatever is true at time t1 does not influence what is true at time t2, since the truth of the statement may have changed.

Solution (1) requires that additional information be given, as the graph IRI is supposed to be opaque. Moreover, the semantics does not allow anyone to assume that the IRI denotes anything in particular. So the idea would be to add some metainformation about the dataset, which makes it clear how to understand it. However, even in the absence of the metainformation, the inferences provided by the semantics of [2] are in line with what to expect in a temporally scoped representation: anything true in the graph labelled by X does not need to be true in the graph labelled by Y. In the absence of metainformation, a system that parses and reasons with the dataset would not understand that statements are tied to a time point, but it would at least not allow inferences from one graph to influence knowledge in a different graph.

Metainformation could be provided as a separate file (together with voiD annotations). We would need a vocabulary to say that the dataset is built according to a certain IRI scheme, where each graph "name" denotes the graph itself and is tied to a certain time point. Something like:

    <> a void:Dataset ;
       ex:semantics ex:GraphNamesDenoteGraph .
    :g1 a rdf:Graph ;
        ex:validAt "2011-10-08T10:23:42"^^xsd:dateTime .
    :g2 a rdf:Graph ;
        ex:validAt "...."^^xsd:dateTime .

and the dataset itself contains:

    :g1 { :bob :worksfor :company1 }
    :g2 { :bob :worksfor :company2 }
    ...

When a dataset processor meets the statement:

    <> ex:semantics ex:GraphNamesDenoteGraph .

it would know that the following statements are meant to say something about the graphs themselves, which can be stated as additional semantic constraints.

This may be sufficient when one simply wants to query what is true at a given time point, or just to have a kind of wayback machine for RDF. But it's certainly not satisfying for a lot of use cases.

More in future emails.

-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 83 36
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Wednesday, 29 February 2012 15:21:45 UTC