Re: Semantics of Qurtle (N3 vs TriG), Graph Literals again. from Ivan Herman on 2011-03-04 (public-rdf-wg@w3.org from March 2011)

From: Ivan Herman <ivan@w3.org>
Date: Fri, 4 Mar 2011 09:55:26 +0100
To: Sandro Hawke <sandro@w3.org>
Cc: public-rdf-wg <public-rdf-wg@w3.org>
Message-Id: <53F1FEBA-45F5-4DC0-8BBD-D525B3F5616C@w3.org>
On Mar 4, 2011, at 03:41 , Sandro Hawke wrote:

> (Aside: let's keep using the name Qurtle, on a *temporary* basis, to
> refer to our deliverable of a Turtle-like language with "support for
> multiple graphs and graph stores".  

I am not sure how one pronounces "Qurtle" :-)

> I don't like the name long-term,
> but it's fine for now.  This post is orthogonal to whether Qurtle is
> minimal functionality n-quads or maximal functionality
> Superturtle/TriG++, so I want a neutral name.)
> 
> There have been several posts about how it's not clear what the
> fourth element means.  I want to point out that N3 has an interesting
> take on the problem; rather than decide and declare a priori the
> relation between the triples and the extra URI, it lets the author
> decide and tell the reader via an RDF predicate (examples below).
> 
> So, here's a TriG document D:
> 
>    @base <http://example.com/> .
> 
>    <u1> = { <a> <b> <c> . }
>    <u2> = { <a> <b> <c>.  <b> <b> <c>. }

I think the TriG syntax does not use the '=' sign. 

> 
> I think there are two main schools of thought about what this means,
> corresponding to whether we think u1 and u2 identify g-snaps or
> g-boxes.
> 
> Option 1 - We might take u1 and u2 as identifying g-snaps.  In this
> case, D is telling us that the URI "http://example.com/u1" is an
> identifier for a particular g-snap (abstract/mathematical set of one
> triple), which we can write down using this turtle g-text, "@base
> <http://example.com/> .  <a> <b> <c> ."  Similarly, it tells us
> "http://example.com/u2" identifies a g-snap of two triples.
> 
> In n3 (as I understand it; I don't think this part is formally
> specified), we could write this meaning like this:
> 
>    @base <http://example.com/> .
>    @prefix owl: <http://www.w3.org/2002/07/owl#>.
> 
>    <u1> owl:sameAs { <a> <b> <c> . }
>    <u2> owl:sameAs { <a> <b> <c>.  <b> <b> <c>. }

To be honest, I am not sure I understand this. Using sameAs would mean that the '{...}' syntax denotes an RDF concept/resource that has a valid place in a triple. I.e., it is either syntactic sugar for a literal (ahem, opening up the graph literal issue...) or a resource with some sort of a URI... Or we have to have a new RDF concept for a g-snap that can be used as a legitimate part of an RDF triple.

> 
> Option 2 - We might take u1 and u2 as identifying g-boxes.  In this
> case, D is telling us that "http://example.com/u1" identifies a
> container of triples which currently contains one triple, as shown.

Does it say that <u1> contains _exactly_ that triple, or that it does contain that triple but may contain more?
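
To make the question concrete, here is a minimal Python sketch of the two readings. It is an illustration only, not anything proposed in this thread: the GBox class and the helpers contains_exactly and contains_at_least are made-up names, and triples are modelled as plain tuples of IRI strings.

```python
# Illustrative only: a triple is a 3-tuple of IRI strings, a g-snap is an
# immutable set of triples, and a g-box is a mutable container of triples.
# GBox and the two helper functions are hypothetical names.

EX = "http://example.com/"
triple_abc = (EX + "a", EX + "b", EX + "c")
triple_bbc = (EX + "b", EX + "b", EX + "c")


class GBox:
    """A mutable container of triples whose contents may change over time."""

    def __init__(self, triples=()):
        self._triples = set(triples)

    def add(self, triple):
        self._triples.add(triple)

    def snapshot(self):
        """Return the current contents as an immutable g-snap."""
        return frozenset(self._triples)


def contains_exactly(box, stated):
    """'Exactly' reading: the box holds the stated triples and nothing else."""
    return box.snapshot() == frozenset(stated)


def contains_at_least(box, stated):
    """'At least' reading: the box holds the stated triples, possibly more."""
    return frozenset(stated) <= box.snapshot()


u1 = GBox([triple_abc])
stated = [triple_abc]          # what document D says about <u1>

exact_before = contains_exactly(u1, stated)
at_least_before = contains_at_least(u1, stated)

u1.add(triple_bbc)             # the box changes after D was written

exact_after = contains_exactly(u1, stated)
at_least_after = contains_at_least(u1, stated)
```

Under the "exactly" reading, D stops being true as soon as the box changes; under the "at least" reading, it stays true as long as triples are only added.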

> We could reasonably expect that, barring things changing, we could do
> a GET on "http://example.com/u1" and get back the Turtle content,
> "@base <http://example.com/> .  <a> <b> <c> ."  If we got D from a
> trusted source, and for one reason or another we're not worried about
> things changing, we could skip doing that GET, because we know the
> result already.
> 
> In n3 (again, as I understand it), we could write this meaning as:
> 
>    @base <http://example.com/> .
>    @prefix log: <http://www.w3.org/2000/10/swap/log#>.
> 
>    <u1> log:semantics { <a> <b> <c> . }
>    <u2> log:semantics { <a> <b> <c>.  <b> <b> <c>. }
> 
> ("The log:semantics of a document is the formula which one gets by
> parsing [it]." [1] For "formula" read "graph", for our purposes.)

And I think my comment above applies here again.

> 
> Are there other common meanings?  There are other relationships that
> resources can have with triples, of course:
> 
>  - a person can assert/claim some g-snap
>  - a person can be the author/creator of some g-snap
>  - n-ary: a person can assert some g-snap over some time range
>  - ... etc
> 
> but all of these can be done using the Option-1 (g-snap) or Option-2
> (g-box) interpretations, like this:
> 
>    my:Sandro eg:claims <u1> .
> 
> That would be defined to mean either that I claim the g-snap u1 or
> that I claim whatever is in the g-box u1, depending on which solution
> we are using.
> 
> So, I don't know that it matters very much which way we go.  In my own
> coding, in part because I'm usually using a mutable quad store, I
> think of it as Option-2 (g-boxes), BUT I only use my own URI space (so
> it never changes without me knowing about it), and there's usually a
> set of URIs which I treat as immutable and think of as effectively
> being g-snap identifiers.  When I fetch stuff off the web, I store
> that explicitly, keeping each version as long as necessary, with its
> own URI.
> 
> I will note -- returning to a topic of some earlier emails -- that some
> of the use cases for Qurtle can be addressed by just defining datatypes
> for the RDF syntaxes.  For example, we can write D in ordinary Turtle,
> with Option-1 semantics, like this:
> 
>    @base <http://example.com/> .
>    @prefix owl: <http://www.w3.org/2002/07/owl#>.
>    @prefix rdfsyn: <http://example.org/rdf-syntaxes/> .
> 
>    <u1> owl:sameAs "@base <http://example.com/> . { <a> <b> <c> . }"^^rdfsyn:turtle
>    <u2> owl:sameAs "@base <http://example.com/> . { <a> <b> <c>.  <b> <b> <c>. }"^^rdfsyn:turtle
> 

So these are graph literals after all. And one can define

{ <a> <b> <c> . } 

as being a syntactic shorthand for the 

"@base <http://example.com/> . { <a> <b> <c> . }"^^rdfsyn:turtle

literal, just as 

123.45

is a shorthand for

"123.45"^^xsd:decimal

> The quoting gets a little hairy to do by hand, both in Turtle and
> RDF/XML, but it's pretty easy for machines.  No special parser is
> needed, and systems which don't know this datatype will, I think,
> effectively ignore the triples, as they probably should.  If we want
> option-2 semantics, I think we'd need to make up a new predicate, like
> rdf:content or something.

You mean having both, right?

<G> { .... }

would mean option 1 and an extra syntax would give option 2

> 
> Where this falls short, I think, is in ease-of-hand-authoring and in
> not allowing bnodes to be shared between the graphs.  But a lot of
> people don't want that anyway and may be happy to discourage it like
> this.  Also, it's not as easy to process as n-quads, especially
> for massive dumps, and some mechanism would need to be introduced for
> signaling the default graph.  (Something like "<> eg:defaultGraph
> <g1>.")
> 
> (Note re [2], Ivan, these are literals just like xs:integer, and don't
> open up any new issues.  There's no more need for them to be subjects
> than for integers to be subjects.  The value space is g-snaps, the
> lexical space for the turtle one is the set of turtle g-texts, etc.)

As long as the syntax does not _require_ a literal as subject, I have no problem talking about graph literals. We just have to be very clear in our minds that graph literals as subjects cannot be looked at in isolation, but only in relation to the much more general issue of literals as subjects...
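
For what it is worth, the datatype view (lexical space of g-texts, value space of g-snaps) can be sketched in a few lines of Python. This is only an illustration under strong assumptions: parse_graph_literal is a hypothetical helper, it understands just an optional @base and triples of three IRIs, and its base resolution is naive string concatenation rather than proper IRI resolution.

```python
import re
from decimal import Decimal

# Hypothetical helper, not part of any library. It handles only a tiny
# Turtle subset: an optional @base directive and triples made of three
# <...> IRIs terminated by '.'.

BASE_RE = re.compile(r"@base\s*<([^>]*)>\s*\.")
TRIPLE_RE = re.compile(r"<([^>]*)>\s*<([^>]*)>\s*<([^>]*)>\s*\.")


def parse_graph_literal(lexical):
    """Map a lexical form (a g-text) to its value (a g-snap)."""
    base = ""
    match = BASE_RE.search(lexical)
    if match:
        base = match.group(1)
        # Drop the directive so that only triples remain to be scanned.
        lexical = lexical[:match.start()] + lexical[match.end():]

    def resolve(iri):
        # Naive resolution: prepend the base unless the IRI has a scheme.
        return iri if "://" in iri else base + iri

    return frozenset(
        tuple(resolve(term) for term in m.groups())
        for m in TRIPLE_RE.finditer(lexical)
    )


# Two different lexical forms ...
lex_relative = "@base <http://example.com/> . <a> <b> <c> ."
lex_absolute = (
    "<http://example.com/a>  <http://example.com/b> <http://example.com/c> ."
)

# ... that denote the same value, i.e. the same g-snap.
same_graph = parse_graph_literal(lex_relative) == parse_graph_literal(lex_absolute)

# The numeric analogy: two lexical forms, one value.
same_number = Decimal("123.45") == Decimal("123.450")

# The second graph of the TriG example holds two triples.
u2 = parse_graph_literal("@base <http://example.com/> . <a> <b> <c>.  <b> <b> <c>.")
```

Two different lexical forms denote the same g-snap here, much as two spellings of a number denote the same numeric value.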

Ivan




> 
>     -- Sandro
> 
> [1] http://www.w3.org/2000/10/swap/doc/Reach.html
> [2] http://lists.w3.org/Archives/Public/public-rdf-wg/2011Feb/0127
> 
> 
> 
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Friday, 4 March 2011 08:53:56 UTC