
RE: Representing temporal data in RDF

From: Bob MacGregor <macgregor@ISI.EDU>
Date: Tue, 09 Sep 2003 13:06:13 -0700
Message-Id: <5.1.1.6.0.20030909125818.01eb2b50@tnt.isi.edu>
To: "Geoff Chappell" <geoff@sover.net>, <www-rdf-interest@w3.org>

Hi Geoff,

I'll score one more point for the quad community.  You mention attaching
read/write privileges to contexts.  For our usage, that would not be
a good choice.  Currently, we have hundreds of contexts within a single
model, and that will become thousands (or more) later on.  That's
because each model contains a great many statements about where things
are located temporally and spatially.  Hence, we prefer to attach user
access statements to models rather than contexts.

I mention this because some folks equate the notions of context and
model/graph.  For the reason just mentioned, we consider them to be
quite distinct.  Any system that equates the two would likely
perform very badly on our datasets.  Imagine an RDQL query with a thousand
entries in its FROM list.
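
As a rough illustration of the distinction, here is a minimal Python
sketch, assuming a model is just a named collection of quads and that
access rights are stored per model.  The names models, model_acl and
readable_quads are hypothetical and do not come from any particular
system.

```python
from collections import namedtuple

# A quad: the context groups statements that share a time and place.
Quad = namedtuple("Quad", "context subject predicate object")

# Hypothetical layout: each model is one named collection of quads.
models = {
    "shipping": [
        Quad(f"ctx{i}", f"ship{i}", "location", "antwerp") for i in range(1000)
    ],
}

# Access control keyed by model: one entry covers every context inside it.
model_acl = {"shipping": {"bob"}}


def readable_quads(user, model_name):
    """Return all quads of a model if the user may read that model."""
    if user not in model_acl.get(model_name, set()):
        return []
    return models[model_name]


contexts = {q.context for q in readable_quads("bob", "shipping")}
print(len(model_acl), "ACL entry covers", len(contexts), "contexts")
```

With per-context permissions, the same data would need an access entry
for each of the thousand contexts.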

Cheers, Bob


At 02:26 PM 9/9/2003 -0400, Geoff Chappell wrote:

>In RDF Gateway's query language[1] your query would look something like:
>
>SELECT ?f USING sometable WHERE {[type] ?f [Freighter]}
>         AND {?c [location] ?f [antwerp]}
>         AND {?c [hasCargo] ?f ?cargo}
>         AND {?c [consistsOf] ?cargo [AluminumPipe]}
>         AND {[beginDate] ?c ?begin}
>         AND {[endDate] ?c ?end}
>         AND DateDiff('s', DATE(?begin), DATE("2003-05-01")) > 0
>         AND DateDiff('s', DATE(?end), DATE("2003-03-31")) < 0
>
>We support quads for the reasons that you list in your email. If no
>context/fourth member is specified, the triple has a null context.
>
>Another handy use for the context.... We support permissioning of
>triples based upon the context. E.g. if a user has read permissions to
>the [ex:A] context but not the [ex:B] and this data exists:
>
>         {[ex:A] [color] [Fido] [Brown]}
>         {[ex:B] [color] [Fido] [Black]}
>
>and the following query is executed by that user:
>
>         SELECT ?c USING mydata WHERE {[color] [Fido] ?c}
>
>Results will be:
>
>?c
>=======
>[Brown]
>
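The filtering behaviour described above can be sketched in a few lines
of Python.  This is only an illustration under stated assumptions, not
RDF Gateway's implementation: the READABLE table, the user name "alice"
and the select_objects function are made up for the example.

```python
# Quads in the order used by the example above:
# (context, predicate, subject, object).
QUADS = [
    ("ex:A", "color", "Fido", "Brown"),
    ("ex:B", "color", "Fido", "Black"),
]

# Hypothetical permission table: user -> contexts that user may read.
READABLE = {"alice": {"ex:A"}}


def select_objects(user, predicate, subject):
    """Return objects of matching quads whose context the user may read."""
    allowed = READABLE.get(user, set())
    return [
        o for (c, p, s, o) in QUADS
        if c in allowed and p == predicate and s == subject
    ]


print(select_objects("alice", "color", "Fido"))  # ['Brown']
```

A user with no readable contexts simply gets an empty result; the
[ex:B] statement is never visible to the query.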
>
>rgds,
>
>Geoff Chappell
>
>[1] http://www.intellidimension.com
>
>
>
> > -----Original Message-----
> > From: www-rdf-interest-request@w3.org [mailto:www-rdf-interest-request@w3.org] On Behalf Of Bob MacGregor
> > Sent: Tuesday, September 09, 2003 1:42 PM
> > To: www-rdf-interest@w3.org
> > Subject: Representing temporal data in RDF
> >
> >
> >
> > I have yet to come across a system having a substantial user base that
> > does an adequate job of representing temporally situated data.  If you
> > think about it, most facts about the world are either true only within
> > a specific temporal interval, or they are things like events that have
> > a time component built in (mathematical facts are an exception).  In
> > other words, when it comes to representing day-to-day facts, there is
> > a large opportunity waiting.  Ideally, the Semantic Web tools could
> > fill this void.  Unfortunately, the tools that RDF provides us are
> > almost unbearably clumsy.  Below is an example.
> >
> >
> > Here is my example query:
> >
> > "Retrieve freighters that visited Antwerp in
> > April 2003 whose cargo included aluminum pipes"
> >
> >
> > The best solution that I have come across for representing this query
> > uses quads.  A "quad" is a 4-tuple <?c ?s ?p ?o> with roles context,
> > subject, predicate, object, i.e., it's a triple with an extra context
> > field.
> >
> > Here is the query in an RDQL variant that supports a quad syntax
> > instead of a triple syntax, with namespaces omitted:
> >
> > SELECT ?f
> > WHERE  ((null ?f type Freighter),
> >       (?c ?f location antwerp),
> >       (?c ?f hasCargo ?cargo),
> >       (?c ?cargo consistsOf AluminumPipe),
> >       (null ?c beginDate ?begin),
> >       (null ?c endDate ?end),
> >       (?begin before "May 1 2003"),
> >       (?end after "March 31 2003"))
> >
> > This is actually quite a reasonable query.  It's fairly concise, and
> > fairly readable.  Unfortunately, I'm not aware of any system that
> > implements quads and has a significant user base.  For the moment,
> > quads are still a wish that has not come true.
> >
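To make the quad query above concrete, here is a small Python sketch
that evaluates the same six patterns and two date tests over an
in-memory list of quads.  It is an illustrative nested-loop matcher,
not any existing RDQL engine; the sample data (ship1, ship2, c1, c2 and
their dates) and the helper names is_var, match and solve are invented
for the example.

```python
from datetime import date

# Quads as (context, subject, predicate, object); None is the null context.
QUADS = [
    (None, "ship1", "type", "Freighter"),
    ("c1", "ship1", "location", "antwerp"),
    ("c1", "ship1", "hasCargo", "cargo1"),
    ("c1", "cargo1", "consistsOf", "AluminumPipe"),
    (None, "c1", "beginDate", date(2003, 4, 10)),
    (None, "c1", "endDate", date(2003, 4, 12)),
    (None, "ship2", "type", "Freighter"),
    ("c2", "ship2", "location", "antwerp"),
    ("c2", "ship2", "hasCargo", "cargo2"),
    ("c2", "cargo2", "consistsOf", "AluminumPipe"),
    (None, "c2", "beginDate", date(2003, 1, 5)),
    (None, "c2", "endDate", date(2003, 1, 7)),
]


def is_var(term):
    """Variables are strings that start with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, quad, binding):
    """Extend binding so that pattern equals quad, or return None."""
    new = dict(binding)
    for term, value in zip(pattern, quad):
        if is_var(term):
            if term in new and new[term] != value:
                return None
            new[term] = value
        elif term != value:
            return None
    return new


def solve(patterns, binding=None):
    """Yield every binding that satisfies all patterns (nested-loop join)."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    for quad in QUADS:
        extended = match(patterns[0], quad, binding)
        if extended is not None:
            yield from solve(patterns[1:], extended)


# The six quad patterns of the query, in the same order.
PATTERNS = [
    (None, "?f", "type", "Freighter"),
    ("?c", "?f", "location", "antwerp"),
    ("?c", "?f", "hasCargo", "?cargo"),
    ("?c", "?cargo", "consistsOf", "AluminumPipe"),
    (None, "?c", "beginDate", "?begin"),
    (None, "?c", "endDate", "?end"),
]

# The "before" and "after" tests become ordinary date comparisons.
results = sorted({
    b["?f"] for b in solve(PATTERNS)
    if b["?begin"] < date(2003, 5, 1) and b["?end"] > date(2003, 3, 31)
})
print(results)  # ['ship1']
```

Only ship1 is returned: ship2 also carried aluminum pipe to Antwerp,
but its context ends in January, so the date test rejects it.
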
> > Now, let's consider expressing this same query using only triples.  RDF
> > does not provide any officially-sanctioned way to do this, so we have
> > to improvise.  RDF provides the notion of a reified statement, but
> > there is more than one way to use reified statements.
> >
> > One approach attaches dates and other metadata directly to reified
> > statements.  Anyone who has experimented with this long enough will
> > realize that this approach is a loser.
> >
> > A second approach attaches metadata (like dates) to a "collection of
> > statements", where the collection might be an RDF bag or list.  This
> > is a big improvement over the previous approach, but we can do better.
> >
> > The simplest approach invents a context object, and points a context
> > to the reified statements within it (or points the statements to the
> > context).  This approach is isomorphic to the second approach, but is
> > slightly cleaner and probably more efficient.
> >
> > Here we have rewritten the first query using only triples, and
> > contexts that include reified statements:
> >
> > SELECT ?f
> > WHERE  ((?f type Freighter),
> >       (?st1 type Statement),
> >       (?st1 subject ?f),
> >       (?st1 predicate location),
> >       (?st1 object antwerp),
> >       (?st2 type Statement),
> >       (?st2 subject ?f),
> >       (?st2 predicate hasCargo),
> >       (?st2 object ?cargo),
> >       (?st3 type Statement),
> >       (?st3 subject ?cargo),
> >       (?st3 predicate consistsOf),
> >       (?st3 object AluminumPipe),
> >       (?st1 inContext ?c),
> >       (?st2 inContext ?c),
> >       (?st3 inContext ?c),
> >       (?c beginDate ?begin),
> >       (?c endDate ?end),
> >       (?begin before "May 1 2003"),
> >       (?end after "March 31 2003"))
> >
> >
> > This gets the job done, but it's really quite awful.  Not only is it
> > much less readable, but it is MUCH less efficient than the quad
> > representation.  Why is that?  First of all, the number of "joins" is
> > much larger.  Second, and probably more damaging, the optimizer now
> > has to optimize over predicates like "subject", "predicate" and
> > "object" that mix together extensions of many different predicates.  A
> > database consisting mostly of temporally-situated data will have to
> > reify the majority of its data, so these three predicates will likely
> > contain most of the data in the database. Not a pretty sight.
> >
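The blow-up described above can be counted directly.  The following
Python sketch expands quads into reified triples plus an inContext link,
mirroring the shape of the triple-only query (which needs 18 patterns
where the quad query needs 6).  The reify function and the st1, st2, ...
node names are illustrative assumptions, not part of any RDF toolkit.

```python
from collections import Counter


def reify(quads):
    """Expand (context, subject, predicate, object) quads into plain triples.

    Each quad becomes one reified statement node with type, subject,
    predicate and object triples, plus a hypothetical inContext link.
    """
    triples = []
    for i, (c, s, p, o) in enumerate(quads, start=1):
        st = f"st{i}"
        triples += [
            (st, "type", "Statement"),
            (st, "subject", s),
            (st, "predicate", p),
            (st, "object", o),
            (st, "inContext", c),
        ]
    return triples


quads = [
    ("c1", "ship1", "location", "antwerp"),
    ("c1", "ship1", "hasCargo", "cargo1"),
    ("c1", "cargo1", "consistsOf", "AluminumPipe"),
]
triples = reify(quads)

# Count how many triples each predicate now holds.
by_predicate = Counter(p for (_, p, _) in triples)
print(len(quads), "quads ->", len(triples), "triples")
print(by_predicate)
```

After reification, location, hasCargo and consistsOf no longer appear
as predicates at all; every statement is indexed under the same handful
of predicates (subject, predicate, object and so on), which is what
leaves an optimizer with so little to work with.
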
> > In fact, it IS possible to stick to triples instead of quads, and
> > still produce a practical means for representing temporally situated
> > data.  I'm looking for examples of other RDF-based systems that have
> > successfully solved this problem.  Any takers (in addition to a
> > reference, I would like to see how my query would be represented in
> > your system)?
> >
> > Cheers, Bob

=====================================
Robert MacGregor
Senior Project Leader
macgregor@isi.edu
Phone: 310/448-8423, Fax:  310/822-6592
Mobile: 310/251-8488

USC Information Sciences Institute
4676 Admiralty Way, Marina del Rey, CA 90292
=====================================
Received on Tuesday, 9 September 2003 16:06:29 GMT
