RE: Representing temporal data in RDF from Leo Sauermann on 2003-09-10 (www-rdf-interest@w3.org from September 2003)

From: Leo Sauermann <leo@gnowsis.com>
Date: Wed, 10 Sep 2003 11:32:42 +0200
To: "'Bob MacGregor'" <macgregor@ISI.EDU>, <www-rdf-interest@w3.org>
Message-ID: <000c01c3777e$7ca5c850$0501a8c0@ZION>
Quads are good for context, like Geoff Chappell did in his RDF Gateway,
a tool I can recommend.

But whether you use Richard's or Geoff's tool does not change the fact
that you can model the data better. This may not answer your original
question, but it is closer to the RDF way:

SEARCH FOR SHIPMENT
===================

You are right that temporal data is related to a certain time.
But I think you can define your problem far better:

In RDF, everything has to have an identifier. You are not searching for
a FREIGHTER, you have to search for a SHIPMENT.
The shipment is the actual act of bringing things with a ship to a
harbor. Most RDF tools support a datetime type, so you can use the
comparison operators <, > and = in a query.

SELECT ?shipID
FROM  ((?s rdf:type ship:Shipment),
	(?s ship:Vessel ?shipID),
	(?s ship:destination gps:antwerp),
	(?s ship:hasCargo ?cargo),
	(?cargo ship:hasItem ?x),
	(?x rdf:type wordNet:AluminiumPipe),
	(?s ship:Arrival ?dateA),
	(?s ship:Depart ?dateB))
WHERE ((?dateA > april2003) and (?dateA < may2003))
   or ((?dateB > april2003) and (?dateB < may2003))

(This is pseudo-RDQL syntax, in the style of the RDQL used by Jena or Sesame.)
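
To make the idea a bit more concrete, here is a rough Python sketch of
the same search. It is not how Jena or Sesame work internally; it just
keeps the triples in a plain set and matches them by hand. The vessel
names, the cargo and item identifiers and all the dates are made up for
the example, and "April 2003" is read as April 1 up to (not including)
May 1.

```python
from datetime import date

# Hypothetical sample data: each fact is a (subject, predicate, object) triple.
TRIPLES = {
    ("ship:shipment1", "rdf:type", "ship:Shipment"),
    ("ship:shipment1", "ship:Vessel", "ship:freighterA"),
    ("ship:shipment1", "ship:destination", "gps:antwerp"),
    ("ship:shipment1", "ship:hasCargo", "ship:cargo1"),
    ("ship:cargo1", "ship:hasItem", "ship:item1"),
    ("ship:item1", "rdf:type", "wordNet:AluminiumPipe"),
    ("ship:shipment1", "ship:Arrival", date(2003, 4, 12)),
    ("ship:shipment1", "ship:Depart", date(2003, 4, 15)),
    # Same destination and cargo type, but the visit is in June.
    ("ship:shipment2", "rdf:type", "ship:Shipment"),
    ("ship:shipment2", "ship:Vessel", "ship:freighterB"),
    ("ship:shipment2", "ship:destination", "gps:antwerp"),
    ("ship:shipment2", "ship:hasCargo", "ship:cargo2"),
    ("ship:cargo2", "ship:hasItem", "ship:item2"),
    ("ship:item2", "rdf:type", "wordNet:AluminiumPipe"),
    ("ship:shipment2", "ship:Arrival", date(2003, 6, 3)),
    ("ship:shipment2", "ship:Depart", date(2003, 6, 5)),
}


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for (s, p, o) in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """All subjects of triples with the given predicate and object."""
    return [s for (s, p, o) in TRIPLES if p == predicate and o == obj]


def in_april_2003(day):
    """True if the date falls within April 2003."""
    return date(2003, 4, 1) <= day < date(2003, 5, 1)


def vessels_in_antwerp_with_pipes():
    """Vessels of shipments to Antwerp that carried aluminium pipes and
    arrived or departed in April 2003."""
    found = set()
    for s in subjects("rdf:type", "ship:Shipment"):
        if "gps:antwerp" not in objects(s, "ship:destination"):
            continue
        has_pipes = any(
            "wordNet:AluminiumPipe" in objects(item, "rdf:type")
            for cargo in objects(s, "ship:hasCargo")
            for item in objects(cargo, "ship:hasItem")
        )
        dates = objects(s, "ship:Arrival") + objects(s, "ship:Depart")
        if has_pipes and any(in_april_2003(d) for d in dates):
            found.update(objects(s, "ship:Vessel"))
    return found
```

The point is that the dates hang off the shipment itself, so no quads
and no reified statements are needed for this particular question.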

I assumed some namespaces to make the example more realistic. In RDF,
words are good but URIs are far better. A standards organization for
trade may define a URI for an aluminium pipe and also a URI for every
trade harbor worldwide.
- there is no such thing as an "aluminiumPipe", there is
wordNet:AluminiumPipe (I don't know if they have it already, but they
might :o)
- there is no "antwerp", there is "gps:antwerp" or "harbor:antwerp" or
similar.
- "ship:" is some trading ontology that is used for seabound vessels.
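
A prefix such as "gps:" is only shorthand for a full URI. The following
Python sketch shows the expansion; apart from the rdf namespace, which
is the real one, the namespace URIs are placeholders under example.org
that I made up, not published vocabularies.

```python
# Prefix-to-namespace table. Only the "rdf" entry is a real namespace;
# the other three are hypothetical placeholders for this example.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ship": "http://example.org/shipping#",
    "gps": "http://example.org/places#",
    "wordNet": "http://example.org/wordnet#",
}


def expand(qname):
    """Expand a prefixed name like 'gps:antwerp' into a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise ValueError(f"unknown or missing prefix in {qname!r}")
    return NAMESPACES[prefix] + local
```

With this table, expand("gps:antwerp") gives
http://example.org/places#antwerp, while a bare word like "antwerp" is
rejected because it does not say which vocabulary it comes from.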


In RDF, really everything may have a URI. Every piece of furniture may
have one, every aluminium pipe of your freight, etc. There is plenty of
space in the URIs.
The ship has a URI. So does the shipment, the order of the merchant,
each item in the order, the merchant as a person, ... everything has a
URI.
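
As a small sketch of what that could look like, the Python below mints
a fresh URI for the merchant, the order and every single item, and
links them with triples. The base URI, the "ex:" property names and the
use of random UUIDs are all assumptions for the example; a real system
would pick its own naming scheme.

```python
import uuid

BASE = "http://example.org/trade/"  # hypothetical base URI


def mint(kind):
    """Return a fresh, unique URI for a resource of the given kind."""
    return f"{BASE}{kind}/{uuid.uuid4()}"


def describe_order(merchant_name, item_names):
    """Build triples in which the merchant, the order and every item
    each get a URI of their own."""
    merchant = mint("merchant")
    order = mint("order")
    triples = [
        (merchant, "ex:name", merchant_name),
        (order, "ex:placedBy", merchant),
    ]
    for name in item_names:
        item = mint("item")  # two identical pipes still get two URIs
        triples.append((order, "ex:hasItem", item))
        triples.append((item, "ex:label", name))
    return triples
```

Calling describe_order("Example Merchant", ["aluminium pipe",
"aluminium pipe"]) yields two distinct item URIs even though the labels
are the same, which is the whole idea.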


LOOK AT RDF CALENDAR
====================

Have a look at what the RDF Calendar group is doing, Libby Miller and
Dan Connolly:

http://www.w3.org/2002/12/cal/
http://lists.w3.org/Archives/Public/www-rdf-calendar/

They have been chewing on the "duration of an appointment" thing for
quite a while and are experts on temporal stuff. They have solutions.

You should find plenty of test data here. The mailing list archive is
also interesting.

I hope that helps somehow; maybe I missed the question.

greetings
Leo Sauermann
www.gnowsis.com


> -----Original Message-----
> From: www-rdf-interest-request@w3.org 
> [mailto:www-rdf-interest-request@w3.org] On Behalf Of Bob MacGregor
> Sent: Tuesday, September 09, 2003 7:42 PM
> To: www-rdf-interest@w3.org
> Subject: Representing temporal data in RDF
> 
> 
> 
> 
> I have yet to come across a system having a substantial user base that
> does an adequate job of representing temporally situated data.  If you
> think about it, most facts about the world are either true only within
> a specific temporal interval, or they are things like events that have
> a time component built in (mathematical facts are an exception).  In
> other words, when it comes to representing day-to-day facts, there is
> a large opportunity waiting.  Ideally, the Semantic Web tools could
> fill this void.  Unfortunately, the tools that RDF provides us are
> almost unbearably clumsy.  Below is an example.
> 
> 
> Here is my example query:
> 
> "Retrieve freighters that visited Antwerp in
> April 2003 whose cargo included aluminum pipes"
> 
> 
> The best solution that I have come across for representing
> this query uses quads.  A "quad" is a 4-tuple <?c ?s ?p ?o>
> with roles context, subject, predicate, object, i.e., it's a
> triple with an extra context field.
> 
> Here is the query in a RDQL variant that supports a quad
> syntax instead of a triple syntax, with namespaces omitted:
> 
> SELECT ?f
> WHERE  ((null ?f type Freighter),
> 	(?c ?f location antwerp),
> 	(?c ?f hasCargo ?cargo),
> 	(?c ?cargo consistsOf AluminumPipe),
> 	(null ?c beginDate ?begin),
> 	(null ?c endDate ?end),
> 	(?begin before "May 1 2003"),
> 	(?end after "March 31 2003"))
> 
> This is actually quite a reasonable query.  It's fairly concise,
> and fairly readable.  Unfortunately, I'm not aware of any system
> that implements quads and has a significant user base.  For the
> moment, quads are still a wish that has not come true.
> 
> Now, let's consider expressing this same query using only triples.  RDF
> does not provide any officially-sanctioned way to do this, so we have
> to improvise.  RDF provides the notion of a reified statement, but
> there is more than one way to use reified statements.
> 
> One approach attaches dates and other metadata directly to reified
> statements.  Anyone who has experimented with this long enough will
> realize that this approach is a loser.
> 
> A second approach attaches metadata (like dates) to a "collection of
> statements", where the collection might be an RDF bag or list.  This
> is a big improvement over the previous approach, but we can do better.
> 
> The simplest approach invents a context object, and points a context
> to the reified statements within it (or points the statements to the
> context).  This approach is isomorphic to the second approach, but is
> slightly cleaner and probably more efficient.
> 
> Here we have rewritten the first query using only triples, and
> contexts that include reified statements:
> 
> SELECT ?f
> WHERE  ((?f type Freighter),
> 	(?st1 type Statement),
> 	(?st1 subject ?f),
> 	(?st1 predicate location),
> 	(?st1 object antwerp),
> 	(?st2 type Statement),
> 	(?st2 subject ?f),
> 	(?st2 predicate hasCargo),
> 	(?st2 object ?cargo),
> 	(?st3 type Statement),
> 	(?st3 subject ?cargo),
> 	(?st3 predicate consistsOf),
> 	(?st3 object AluminumPipe),
> 	(?st1 inContext ?c),
> 	(?st2 inContext ?c),
> 	(?st3 inContext ?c),
> 	(?c beginDate ?begin),
> 	(?c endDate ?end),
> 	(?begin before "May 1 2003"),
> 	(?end after "March 31 2003"))
> 
> 
> This gets the job done, but it's really quite awful.  Not only is it
> much less readable, but it is MUCH less efficient than the quad
> representation.  Why is that?  First of all, the number of "joins" is
> much larger.  Second, and probably more damaging, the optimizer now
> has to optimize over predicates like "subject", "predicate" and
> "object" that mix together extensions of many different predicates.  A
> database consisting mostly of temporally-situated data will have to
> reify the majority of its data, so these three predicates will likely
> contain most of the data in the database. Not a pretty sight.
> 
> In fact, it IS possible to stick to triples instead of quads, and
> still produce a practical means for representing temporally situated
> data.  I'm looking for examples of other RDF-based systems that have
> successfully solved this problem.  Any takers (in addition to a
> reference, I would like to see how my query would be represented in
> your system)?
> 
> Cheers, Bob
>
Received on Wednesday, 10 September 2003 05:29:40 UTC