Re: What fourth column should be from Steve Harris on 2011-11-16 (public-rdf-wg@w3.org from November 2011)

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 16 Nov 2011 17:59:38 +0000
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: public-rdf-wg@w3.org
Message-Id: <3AF7DC3C-A3A8-40DB-AC11-D0569E66870A@garlik.com>

On 2011-11-16, at 17:23, Antoine Zimmermann wrote:

> As we ran out of time to discuss this, I would like to say that having literals in the 4th position of N-Quads is very useful. Especially, think about xsd:dateTime, xsd:anyURI.
> 
> The advantage of this is that what typed literals denote is unambiguous. So you know you are referring to the time when using an xsd:dateTime-typed literal. You also know that you are referring to a URI when using xsd:anyURI, instead of referring to the thing denoted by the URI.
> 
> It makes extensions of RDF easier for, e.g., temporal RDF, RDF with trust (use a xsd:decimal to indicate level of trust/confidence), provenance-RDF (use xsd:anyURI to denote the provenance URL unambiguously), etc.
> 
> If one uses a URI instead, it is always up to interpretations what that URI denotes. It could denote the graph itself but could as well denote the document where the triple is found or the main entity in the graph, as we already discussed. Using a URI is more flexible, though, so it must be allowed too.

For the record, I don't feel this is a good idea. There are many systems that implement quads, but I'm not aware of any that allow literals in the 0th/4th slot. That suggests that users haven't requested it much, and that we don't have implementation experience.

I'm not sure how many systems that track time and/or provenance can be boiled down to a single literal: trust=0.23 or date=2011-11-16 seems a bit simplistic, and it certainly wouldn't work for us.

> However, I don't see interesting use cases for bnodes in the 4th column.

TL;DR: I have one, but I'm not sure I'm prepared to defend it.

Around 10 years ago (pre-SPARQL, and when I still believed reasoning was generally useful… wow, time flies) I built an RDFS store that kept its forward-chained inferred triples in graphs identified by bNodes; there was a "system" graph which tracked the graph bNodes. It had a bNode-identified graph for every combination of cross-graph inferences, e.g. if triple X was inferred from graph A and graph B there would be statements in the system graph like:

_:graphI1 :dependsOn <A>, <B>.

and X would appear in _:graphI1

It was done this way to allow graph-level retractions (if A is removed, remove all bNode graphs ?g, where { ?g :dependsOn <A> }), and to prevent users adding or deleting things in inference graphs. Plus it answers the taxing question of which graph inferred triples live in.
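
The scheme above can be sketched roughly as follows. This is only an illustration in Python, not the original store's code: the class InferenceStore, its method names, and the in-memory dictionaries standing in for the "system" graph are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import count


class InferenceStore:
    """Toy quad store (hypothetical): asserted graphs are named by URI
    strings, inferred graphs by generated bNode labels."""

    def __init__(self):
        self.graphs = defaultdict(set)  # graph name -> set of (s, p, o)
        self.depends_on = {}            # bNode label -> frozenset of source graphs
        self._by_sources = {}           # frozenset of source graphs -> bNode label
        self._ids = count(1)

    def add(self, graph, triple):
        # Users cannot name a bNode graph, so they cannot write to inferences.
        if graph.startswith("_:"):
            raise ValueError("inference graphs are not user-writable")
        self.graphs[graph].add(triple)

    def add_inferred(self, triple, sources):
        # One bNode-identified graph per combination of source graphs.
        key = frozenset(sources)
        label = self._by_sources.get(key)
        if label is None:
            label = f"_:graphI{next(self._ids)}"
            self._by_sources[key] = label
            self.depends_on[label] = key  # plays the role of :dependsOn
        self.graphs[label].add(triple)
        return label

    def remove_graph(self, graph):
        # Graph-level retraction: drop the graph and every inference
        # graph ?g where { ?g :dependsOn <graph> }.
        if graph.startswith("_:"):
            raise ValueError("inference graphs are not user-removable")
        self.graphs.pop(graph, None)
        stale = [g for g, srcs in self.depends_on.items() if graph in srcs]
        for g in stale:
            self.graphs.pop(g, None)
            del self._by_sources[self.depends_on.pop(g)]
        return stale


store = InferenceStore()
store.add("<A>", (":x", "rdfs:subClassOf", ":y"))
store.add("<B>", (":i", "rdf:type", ":x"))
inferred_in = store.add_inferred((":i", "rdf:type", ":y"), ["<A>", "<B>"])
removed = store.remove_graph("<A>")
```

Removing <A> here also removes the one inference graph that depends on it, while <B> is left untouched.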

Other options would involve "reserving" a chunk of URI space to use for this kind of purpose, and I don't like that idea. But OTOH I don't care about this use case anymore anyway, and SPARQL syntax kinda clobbers it.

- Steve

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD

Received on Wednesday, 16 November 2011 18:00:09 UTC