Re: time and quads and semantics

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Wed, 15 Feb 2012 15:18:58 +0100
Message-ID: <4F3BBED2.10302@emse.fr>
To: public-rdf-wg@w3.org, Pat Hayes <phayes@ihmc.us>

What worries me in this proposal is that it makes a really deep 
change to RDF. RDF would no longer be a data model based on triples.

However, here is an idea that would reuse most of your proposal:

1) we keep RDF as it is (modulo the small fixes like plain literals, 
etc). People still publish triples online.

2) we propose a data model based on quadruples, distinct from RDF, but 
which would serve as a format to manage multiple RDF graphs, or changing 
RDF graphs. This is not meant to be published as is.

3) we define a semantics for that data model, and the semantics would be 
essentially the same as RDF except that IEXT would be a ternary relation.

4) we declare victory.

It still does not provide a complete solution for temporal information, 
but it gives us a framework that can easily be extended to temporal 
reasoning, provenance management, etc.

One issue is what can be put in 4th position. I'd like literals, which 
would be the best way to deal with temporal validity, among other 
things, IMO.
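
To make point 2 and the literal-in-4th-position idea a bit more concrete, 
here is a minimal Python sketch. The Quad type, the graph_at helper and 
the example data are all invented for illustration; none of this is a 
proposed syntax. The 4th position holds a plain year literal, and slicing 
the store on that value gives back an ordinary set of triples.

```python
from typing import NamedTuple

class Quad(NamedTuple):
    s: str   # subject
    p: str   # predicate
    o: str   # object
    x: str   # 4th position; a year literal here (an assumption of this sketch)

# Hypothetical quad store: the same kind of fact holds at different times.
store = {
    Quad(":alice", ":worksFor", ":acme", "2010"),
    Quad(":alice", ":worksFor", ":initech", "2012"),
    Quad(":bob", ":worksFor", ":acme", "2012"),
}

def graph_at(quads, x):
    """Slice the store on the 4th position, yielding plain triples."""
    return {(q.s, q.p, q.o) for q in quads if q.x == x}

g2012 = graph_at(store, "2012")
print(sorted(g2012))
```

With a literal in 4th position, the slice for a given time is just a 
filter, and what comes out is still something people can publish as 
triples.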


On 15/02/2012 07:33, Pat Hayes wrote:
> OK yall, I promised 2 weeks ago to try to write this down, but it has
> gotten a little changed since then, so you are getting the newer
> version. If it seems to be verging off the point, please bear with me
> for a while.
>
> The issue that started it was, how to deal with the fact that RDF is
> a 'timeless' description logic, but people often (want to, and
> therefore will) use it to describe facts that are transient,
> changing, labile, etc., and also that are true *now* but might not
> have been true a while ago or at some time in the future. There is a
> basic divergence between descriptions that are seen as timeless or
> outside of time, like mathematical statements or ontological
> statements, and statements - usually data - which are understood to
> be relative to a 'present time' and therefore, like dairy
> products, always have a use-by date, even if this is implicit. RDF is
> semantically designed for the former, but gets used for the latter,
> and what should we do about this? Possible answers range from (do
> nothing and to hell with the semantics anyway) to (insist that RDF is
> timeless and say that time-sensitive data MUST be phrased in some
> rebarbative way involving blank nodes, in order to preserve the
> semantic purity) but neither extreme is palatable to everyone.
>
> Now, other formal logical notations have met this issue many times,
> and there is a kind of rough consensus about how to deal with it.
> Basically, if you want to be able to store the notation and use it
> later, then it *can't* be based on a time-sensitive 'present-tense'
> kind of a semantic model: it has to be timeless, at bottom. And then
> to encode present-tense information in a time-free notation, you have
> to include a time parameter somewhere. You date-stamp the data, or
> you have a temporal field in your data table, or you have an extra
> 'situation' argument in your relations, or some such device. It
> almost doesn't matter where you put this extra parameter, as long as
> there is a recognized convention for finding it; but one very common
> idea is, you make it an extra argument of all your time-sensitive
> relations (and if you are in AI, you call them 'fluents'). So what was
> a simple property becomes a relationship to a time, what was a binary
> relation between As and Bs becomes a three-place relation between As,
> Bs and times (or some other parameter related to times in some known
> way, a complication I will ignore.)
>
> The snag with doing this in RDF is, of course, that RDF isn't very
> good at representing three-place relations. In fact, although
> theoretically simple, it is in practice so awkward that hardly anyone
> is going to actually do it. You have to introduce things called
> 'events' or 'holdings' or 'facts' and say that they have a subject
> and an object and a time, using three triples. It's just like RDF
> reification, in fact. Blech.
>
> Now bring quad stores into the picture, and they seem to provide
> exactly what we need here. A triple :a :R :b . turns into exactly
> what we need to encode time-sensitive information: a relation with
> *three* parameters: :a :R :b :t . That "graph label" can be used to
> separate a triple true at one time from the 'same' triple true at a
> different time. Perfect!  Except that no, it isn't, because this
> isn't what the RDF semantics says it means. The current semantics
> does not have the *truth* of a triple varying according to what
> graph it happens to be in: what the triple says depends only on the
> interpretation of its components, the subject, predicate and object
> of the triple. Which is where we are currently stuck.
>
> So, here is a proposal. We extend RDF to allow property extensions to
> contain triples as well as pairs. That is, we allow an RDF property
> to be a trinary as well as a binary relationship. (Strictly, we allow
> it to be a variadic relation which can be binary or trinary, or
> both.) Notice that this has the current semantics as a special case,
> but generalizes it a little. And then we allow, under some
> circumstances – details later – an RDF property to be interpreted as
> taking three arguments rather than the usual two. Call the extra
> argument a 'parameter' for want of a better term. Then we can
> think of a quad :s :P :o :pa as consisting of a subject, property,
> object and parameter, in that order; and it is true in I just
> when <I(:s), I(:o), I(:pa)> is in IEXT(I(:P)), which takes advantage
> of the new RDF semantics. (Note that this makes sense even when I is
> a current RDF interpretation, it just always comes out false. The
> 'trinary' extension allows some quads to be true in an
> interpretation.)
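
A small Python sketch of this truth condition, to check that I read it 
correctly. The dictionaries standing in for I and IEXT, and every name 
in them, are made up for the example; the only thing taken from the 
proposal is that IEXT may contain 3-tuples as well as pairs.

```python
# Toy interpretation: I maps terms to things, IEXT maps a property to its
# extension. Under the proposal an extension may hold pairs AND 3-tuples.
I = {":a": "a", ":b": "b", ":R": "R", ":t1": 2011, ":t2": 2012}
IEXT = {"R": {("a", "b", 2011)}}        # only a 3-tuple in this example

def triple_true(s, p, o, I, IEXT):
    """:s :P :o is true iff <I(:s), I(:o)> is in IEXT(I(:P))."""
    return (I[s], I[o]) in IEXT.get(I[p], set())

def quad_true(s, p, o, pa, I, IEXT):
    """:s :P :o :pa is true iff <I(:s), I(:o), I(:pa)> is in IEXT(I(:P))."""
    return (I[s], I[o], I[pa]) in IEXT.get(I[p], set())

# A current RDF interpretation has only pairs, so every quad is false in it.
IEXT_binary = {"R": {("a", "b")}}

print(quad_true(":a", ":R", ":b", ":t1", I, IEXT))         # True
print(quad_true(":a", ":R", ":b", ":t2", I, IEXT))         # False
print(triple_true(":a", ":R", ":b", I, IEXT))              # False
print(quad_true(":a", ":R", ":b", ":t1", I, IEXT_binary))  # False
```
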
>
> Under this semantic regime, then, there are two ways to think about
> what a quad store is saying. In one of them, it consists of sets of
> RDF triples with their truth depending upon a *binary* property, and
> the fourth field is simply a label for all the triples in each graph,
> AKA a 'graph name'. But the truth of a triple does not depend on the
> graph it is in: this graph label is just an organizing device with no
> semantic import. In other words, what we have now. But there is
> another way to think about a quad store, in which it bears the same
> relationship to quads as an RDF graph bears to triples: it is simply
> a conjunction of a lot of atomic facts, but each atom is now a
> relation applied to three arguments, and each argument has just as
> much bearing on the truth of the quad as the others do. Seen from
> this second perspective, the 'triples' view is simply one way to
> slice the quad store, using the last argument as the organizing
> parameter. And now it is natural to treat time-stamped data as living
> in a quad store whose parameter denotes times or time-intervals. Of
> course, we can also 'see' such a quad store in the first way,
> treating the time parameter as a graph label, and this might be a
> natural way to think about it for processing purposes, but the
> second view incorporates the time-varying nature of data which is
> indeed **parameterised** by the time (if you like, by the graph it
> happens to be in, intuitively) rather than simply being labelled by
> it. The second view allows data to actually depend upon the time
> parameter, instead of simply being organized by it.
>
> If one prefers to think of data as consisting of RDF graphs written
> in a 'present tense' but then recorded and stored with a time-stamped
> label, this is a perfectly legitimate and appropriate way to view
> such a quad store, provided that one bears in mind that the bare
> triples must not be taken out of context. For example, merging graphs
> with different graph labels is not valid, under this convention.
> Merging two quad stores (where 'merge' here is defined exactly as for
> RDF graphs but with 'triple' replaced by 'quad' throughout) *is*
> semantically correct, however: in fact, quad stores with the second
> parametric interpretation 'work' exactly like RDF graphs do with the
> current semantics, and all the standard definitions (merging,
> instances, being grounded, being lean, etc.) work in exactly the
> same way.
>
> This all works out quite nicely and naturally, but there is one
> big issue. If we are given a quad store, how do we know whether to
> interpret it as consisting of triples with labels, or as consisting
> of quads with an extra parameter? It is important to be able to make
> the distinction, since the same quad store could be true in one view
> but false in the other, in the same interpretation. There are several
> ways to handle this, and I am working on a couple of ideas right now.
> Hopefully I will have an example by tomorrow, but any comments so
> far?
> Pat
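
For the last point (same quad store, same interpretation, different truth 
value depending on the view), here is a rough Python sketch. It assumes 
that under the "labelled triples" view a store counts as true when every 
triple in it is true and the label is ignored; that reading, the function 
names and the toy data are mine, not part of the proposal.

```python
# One quad, read under the two conventions described in the quoted message.
I = {":a": "a", ":b": "b", ":R": "R", ":t": 2012}
quad_store = {(":a", ":R", ":b", ":t")}

def true_as_labelled_triples(store, I, IEXT):
    # 4th field is only a graph name: truth depends on the pair alone.
    return all((I[s], I[o]) in IEXT.get(I[p], set())
               for s, p, o, _ in store)

def true_as_parametric_quads(store, I, IEXT):
    # 4th field is a real argument: truth depends on the 3-tuple.
    return all((I[s], I[o], I[x]) in IEXT.get(I[p], set())
               for s, p, o, x in store)

IEXT_pairs = {"R": {("a", "b")}}          # extension with a pair only
IEXT_tuples = {"R": {("a", "b", 2012)}}   # extension with a 3-tuple only

print(true_as_labelled_triples(quad_store, I, IEXT_pairs))    # True
print(true_as_parametric_quads(quad_store, I, IEXT_pairs))    # False
print(true_as_labelled_triples(quad_store, I, IEXT_tuples))   # False
print(true_as_parametric_quads(quad_store, I, IEXT_tuples))   # True
```

So whatever mechanism is chosen to signal which view applies, it has to 
travel with the store itself.
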
Received on Wednesday, 15 February 2012 14:19:33 UTC

