RE: Representing temporal data in RDF from Geoff Chappell on 2003-09-09 (www-rdf-interest@w3.org from September 2003)

From: Geoff Chappell <geoff@sover.net>
Date: Tue, 9 Sep 2003 14:26:37 -0400
To: <www-rdf-interest@w3.org>
Message-ID: <00ca01c376ff$e892f8f0$0100a8c0@gsclaptop>

In RDF Gateway's query language[1] your query would look something like:

SELECT ?f USING sometable WHERE {[type] ?f [Freighter]}
	AND {?c [location] ?f [antwerp]}
	AND {?c [hasCargo] ?f ?cargo}
	AND {?c [consistsOf] ?cargo [AluminumPipe]}
	AND {[beginDate] ?c ?begin}
	AND {[endDate] ?c ?end}
	AND DateDiff('s', DATE(?begin), DATE("2003-05-01")) > 0 
	AND DateDiff('s', DATE(?end), DATE("2003-03-31")) < 0

We support quads for the reasons that you list in your email. If no
context/fourth member is specified, the triple has a null context. 
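
For anyone who wants to try the idea without RDF Gateway, here is a rough
Python sketch of the same query over an in-memory list of quads. This is
not RDF Gateway code: the sample data (ship1, visit1, and so on) and the
helper names match and freighters_in_antwerp are made up for illustration.
Quads are written (context, predicate, subject, object), and None stands
in for the null context.

```python
from datetime import date

# Hypothetical sample data; None plays the role of the null context.
QUADS = [
    (None, "type", "ship1", "Freighter"),
    ("visit1", "location", "ship1", "antwerp"),
    ("visit1", "hasCargo", "ship1", "cargo1"),
    ("visit1", "consistsOf", "cargo1", "AluminumPipe"),
    (None, "beginDate", "visit1", date(2003, 4, 10)),
    (None, "endDate", "visit1", date(2003, 4, 12)),
    (None, "type", "ship2", "Freighter"),
    ("visit2", "location", "ship2", "antwerp"),
    ("visit2", "hasCargo", "ship2", "cargo2"),
    ("visit2", "consistsOf", "cargo2", "AluminumPipe"),
    (None, "beginDate", "visit2", date(2003, 6, 1)),
    (None, "endDate", "visit2", date(2003, 6, 3)),
]

ANY = object()  # wildcard marker, kept distinct from None (the null context)


def match(c=ANY, p=ANY, s=ANY, o=ANY):
    """Yield every quad that agrees with the pattern; ANY matches anything."""
    for quad in QUADS:
        if all(want is ANY or want == got
               for want, got in zip((c, p, s, o), quad)):
            yield quad


def freighters_in_antwerp(begin_before, end_after):
    """Freighters with an Antwerp visit overlapping the given window
    whose cargo in that same context included aluminum pipe."""
    found = set()
    for _, _, f, _ in match(c=None, p="type", o="Freighter"):
        for c, _, _, _ in match(p="location", s=f, o="antwerp"):
            for _, _, _, cargo in match(c=c, p="hasCargo", s=f):
                if not any(match(c=c, p="consistsOf", s=cargo,
                                 o="AluminumPipe")):
                    continue
                begins = [o for _, _, _, o in match(c=None, p="beginDate", s=c)]
                ends = [o for _, _, _, o in match(c=None, p="endDate", s=c)]
                # Same interval test as the query: begin < May 1, end > March 31.
                if (any(b < begin_before for b in begins)
                        and any(e > end_after for e in ends)):
                    found.add(f)
    return sorted(found)


print(freighters_in_antwerp(date(2003, 5, 1), date(2003, 3, 31)))
```

The nested loops are a naive join and only meant to show how the context
ties the location, cargo, and date statements together; a real store
would index and optimize this.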

Another handy use for the context: we support permissioning of
triples based upon the context. E.g., if a user has read permission on
the [ex:A] context but not on [ex:B], and this data exists:

	{[ex:A] [color] [Fido] [Brown]}
	{[ex:B] [color] [Fido] [Black]}

and the following query is executed by that user:

	SELECT ?c USING mydata WHERE {[color] [Fido] ?c}

Results will be:

?c
=======
[Brown]

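
The filtering can be pictured with a small Python sketch. It only
illustrates the idea and says nothing about how RDF Gateway implements
it: the function names visible_quads and select_objects and the
set-of-readable-contexts representation are assumptions, and the sketch
does not address how null-context triples are permissioned.

```python
# Hypothetical data mirroring the two quads in the example above,
# written (context, predicate, subject, object).
COLOR_QUADS = [
    ("ex:A", "color", "Fido", "Brown"),
    ("ex:B", "color", "Fido", "Black"),
]


def visible_quads(quads, readable_contexts):
    """Keep only quads whose context the user is allowed to read."""
    return [q for q in quads if q[0] in readable_contexts]


def select_objects(quads, predicate, subject, readable_contexts):
    """Answer {[predicate] [subject] ?c} over the quads the user can see."""
    return [o for _, p, s, o in visible_quads(quads, readable_contexts)
            if p == predicate and s == subject]


# A user who can read only the ex:A context.
print(select_objects(COLOR_QUADS, "color", "Fido", {"ex:A"}))
```

The point of filtering before matching is that the query itself never
mentions a context; the user simply cannot see the [ex:B] triple.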

rgds,

Geoff Chappell

[1] http://www.intellidimension.com



> -----Original Message-----
> From: www-rdf-interest-request@w3.org
> [mailto:www-rdf-interest-request@w3.org] On Behalf Of Bob MacGregor
> Sent: Tuesday, September 09, 2003 1:42 PM
> To: www-rdf-interest@w3.org
> Subject: Representing temporal data in RDF
> 
> 
> 
> I have yet to come across a system having a substantial user base that
> does an adequate job of representing temporally situated data.  If you
> think about it, most facts about the world are either true only within
> a specific temporal interval, or they are things like events that have
> a time component built in (mathematical facts are an exception).  In 
> other words, when it comes to representing day-to-day facts, there is 
> a large opportunity waiting.  Ideally, the Semantic Web tools could 
> fill this void.  Unfortunately, the tools that RDF provides us are 
> almost unbearably clumsy.  Below is an example.
> 
> 
> Here is my example query:
> 
> "Retrieve freighters that visited Antwerp in
> April 2003 whose cargo included aluminum pipes"
> 
> 
> The best solution that I have come across for representing this query 
> uses quads.  A "quad" is a 4-tuple <?c ?s ?p ?o> with roles context, 
> subject, predicate, object, i.e., it's a triple with an extra context 
> field.
> 
> Here is the query in a RDQL variant that supports a quad syntax 
> instead of a triple syntax, with namespaces omitted:
> 
> SELECT ?f
> WHERE  ((null ?f type Freighter),
> 	(?c ?f location antwerp),
> 	(?c ?f hasCargo ?cargo),
> 	(?c ?cargo consistsOf AluminumPipe),
> 	(null ?c beginDate ?begin),
> 	(null ?c endDate ?end),
> 	(?begin before "May 1 2003"),
> 	(?end after "March 31 2003"))
> 
> This is actually quite a reasonable query.  It's fairly concise, and 
> fairly readable.  Unfortunately, I'm not aware of any system that 
> implements quads and has a significant user base.  For the moment, 
> quads are still a wish that has not come true.
> 
> Now, let's consider expressing this same query using only triples.  RDF
> does not provide any officially-sanctioned way to do this, so we have 
> to improvise.  RDF provides the notion of a reified statement, but 
> there is more than one way to use reified statements.
> 
> One approach attaches dates and other metadata directly to reified 
> statements.  Anyone who has experimented with this long enough will 
> realize that this approach is a loser.
> 
> A second approach attaches metadata (like dates) to a "collection of 
> statements", where the collection might be an RDF bag or list.  This 
> is a big improvement over the previous approach, but we can do better.
> 
> The simplest approach invents a context object, and points a context 
> to the reified statements within it (or points the statements to the 
> context).  This approach is isomorphic to the second approach, but is 
> slightly cleaner and probably more efficient.
> 
> Here we have rewritten the first query using only triples, and 
> contexts that include reified statements:
> 
> SELECT ?f
> WHERE  ((?f type Freighter),
> 	(?st1 type Statement),
> 	(?st1 subject ?f),
> 	(?st1 predicate location),
> 	(?st1 object antwerp),
> 	(?st2 type Statement),
> 	(?st2 subject ?f),
> 	(?st2 predicate hasCargo),
> 	(?st2 object ?cargo),
> 	(?st3 type Statement),
> 	(?st3 subject ?cargo),
> 	(?st3 predicate consistsOf),
> 	(?st3 object AluminumPipe),
> 	(?st1 inContext ?c),
> 	(?st2 inContext ?c),
> 	(?st3 inContext ?c),
> 	(?c beginDate ?begin),
> 	(?c endDate ?end),
> 	(?begin before "May 1 2003"),
> 	(?end after "March 31 2003"))
> 
> 
> This gets the job done, but it's really quite awful.  Not only is it 
> much less readable, but it is MUCH less efficient than the quad 
> representation.  Why is that?  First of all, the number of "joins" is 
> much larger.  Second, and probably more damaging, the optimizer now 
> has to optimize over predicates like "subject", "predicate" and 
> "object" that mix together extensions of many different predicates.  A
> database consisting mostly of temporally-situated data will have to 
> reify the majority of its data, so these three predicates will likely 
> contain most of the data in the database. Not a pretty sight.
> 
> In fact, it IS possible to stick to triples instead of quads, and 
> still produce a practical means for representing temporally situated 
> data.  I'm looking for examples of other RDF-based systems that have 
> successfully solved this problem.  Any takers (in addition to a 
> reference, I would like to see how my query would be represented in 
> your system)?
> 
> Cheers, Bob

Received on Tuesday, 9 September 2003 14:26:43 UTC