Re: Fwd: SPARQL: graph syntax should be N3 subset from Seaborne, Andy on 2004-12-02 (public-rdf-dawg@w3.org from October to December 2004)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 02 Dec 2004 18:01:24 +0000
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <41AF5874.9020507@hp.com>
Tom Adams wrote:
>  From the comments list...
> 
> Resent-From: public-rdf-dawg-comments@w3.org
> From: Tim Berners-Lee <timbl@w3.org>
> Date: 29 November 2004 3:36:36 PM
> To: public-rdf-dawg-comments@w3.org
> Subject: SPARQL: graph syntax should be N3 subset
> 
> 
> Reading the 2004/10/13 draft of SPARQL.
> 
> The grammar for SPARQL frequently involves graph patterns.   These 
> should use the N3 grammar, specifically a subset at a level to the 
> subset known as Turtle.

There has been some confusion in this area up to now.

Let's be clear when we are talking about syntax, and when we are talking about
N3/cwm semantics - the N3 computation model of log:implies, other log:* 
operations, and the underlying data model, which includes quoted graphs and is 
outside strict RDF.

"Using an N3 grammar" can range from aligning syntax where there is overlap to 
making SPARQL a subset of N3.  I am in favour of alignment where sensible but 
not of making SPARQL a strict subset of N3.  Future possible modifications to 
N3 are too speculative for me at the moment - creating a dependence on where 
that might go and when it might happen is too high a risk.  Better would be to 
acknowledge that there can be different query syntaxes of the same query 
computation model: produce one syntax now and not get into dependences 
unnecessarily.

 > - Allows one to create some example data, view it as N3, and then paste
 > it into the 'construct' clause, replacing a few values with variables.

This is a compelling argument IMHO.  It suggests using N3-like syntax to capture 
triple patterns.


So what are the differences?  If there are places where we can make it the
same then let's do that.

Directives:

SPARQL has PREFIX; N3 has @prefix, where @ is the general directive-introducing
character.  @prefix needs a trailing dot.

Triple syntax:

SPARQL has (?x ?y ?z) where N3 has ?x ?y ?z .

Lexical spaces:

SPARQL follows XML 1.1.  This happens to mean that _:a is a legal qname which 
probably isn't a good idea but it is a consequence of following XML 1.1 closely.
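
To make the first two differences concrete, here is a rough Python sketch that
rewrites one line of the draft SPARQL forms into the N3 spelling.  The helper
name sparql_to_n3_line, the regular expressions and the ex: namespace are all
assumptions made for illustration; they are not taken from either grammar, and
only the simplest cases (one directive or one parenthesised triple pattern per
line, no literals containing spaces) are handled.

```python
import re

# Hypothetical helper: covers only the directive and triple-syntax differences.
PREFIX_RE = re.compile(r"^\s*PREFIX\s+(\S*:)\s*(<[^>]*>)\s*$", re.IGNORECASE)
TRIPLE_RE = re.compile(r"^\s*\(\s*(\S+)\s+(\S+)\s+(\S+)\s*\)\s*$")

def sparql_to_n3_line(line):
    """Rewrite one draft-SPARQL line into its N3 spelling, if recognised."""
    m = PREFIX_RE.match(line)
    if m:
        # N3 spells the directive @prefix and needs a trailing dot.
        return "@prefix %s %s ." % (m.group(1), m.group(2))
    m = TRIPLE_RE.match(line)
    if m:
        # N3 drops the parentheses and ends the triple with a dot.
        return "%s %s %s ." % m.groups()
    return line  # anything else is passed through unchanged

print(sparql_to_n3_line("PREFIX ex: <http://example.org/ns#>"))
print(sparql_to_n3_line("(?x ?y ?z)"))
```

That the rewrite is this mechanical for the simple cases is the point: where
the two syntaxes overlap, the differences are only surface spelling.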

After these areas, Jos has shown how optionals can map to N3 but I find the 
necessary idioms opaque and don't think there is sufficient value in a common 
syntax when the user has to use some unnatural way of writing queries.

 > It may be that at the same time we should plan for the standardization
 > of N3 itself, with specific view to keeping Ntriples Turtle SPARQL and
 > N3 full in sync.

This is worrying - there are no plans I know of for standardization so "at the 
same time" means pausing DAWG while a WG starts or a SWBPD task force does 
something.  Both are significant delays.

If SPARQL fixes on one form of N3, would compatibility, as it makes sense, be a 
formal constraint on standardization of N3?

 > (Similarly, the list construct for
 > collections makes it possible to actually use lists in practice, where
 > elaborations in terms of rdf:first and rdf:rest are impractically
 > cumbersome.)


I had to delve into what this means by trying something out in cwm:

First I tried:
----
( "a" ) .
{ ( ?x ) } => { :a :b :c } .
----
and do indeed get a match, but cwm does not print the ( "a" ) with --data 
(incidentally, it can't print this out at all - without --data I get an empty 
formula and no list).

----
:a :p ( "a" ) .
{ ( ?x ) } => { :a :b :c } .
----
Match again.

Replacing the variable in the RHS of the rules:
----
:a :p ( "a" ) .
{ ( ?x ) } => { :a :b ?x } .
----
I don't get a match.  Nor do I if @forAll is used for the variable.


So I tried to just pick the rdf:first property out:
----
:a :p ( "a" ) .
{ ?x rdf:first "a" } => { :a :b :c } .
----
I don't get a match.

Also I don't get any triples for plain:
----
( "a" )
----
and I can see two triples there:
----
_:b1      rdf:first  "a" .
_:b1      rdf:rest  () .
----
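
For reference, the expansion I have in mind can be sketched in a few lines of
Python.  This is only an illustration of the RDF collection structure, not of
what cwm does internally; the function name list_to_triples and the blank node
labels _:b1, _:b2, ... are made up, and rdf:nil stands for the empty list
written () above.

```python
RDF_NIL = "rdf:nil"  # the empty list, written () in N3

def list_to_triples(items):
    """Return (head, triples) for an RDF collection holding `items`.

    Blank node labels _:b1, _:b2, ... are invented for the illustration.
    """
    if not items:
        return RDF_NIL, []
    nodes = ["_:b%d" % (i + 1) for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        # The last cell points at rdf:nil, every other cell at the next cell.
        rest = nodes[i + 1] if i + 1 < len(items) else RDF_NIL
        triples.append((nodes[i], "rdf:first", item))
        triples.append((nodes[i], "rdf:rest", rest))
    return nodes[0], triples

head, triples = list_to_triples(['"a"'])
for s, p, o in triples:
    print(s, p, o, ".")
```

A one-element list gives exactly one rdf:first and one rdf:rest triple, which
is the pair shown above.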

Conclusion: cwm handles lists internally as something other than the RDF structure.

Question: how do I test for "member of collection"?
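
Over the plain RDF structure, as opposed to whatever cwm keeps internally, the
test amounts to walking the rdf:rest chain and comparing each rdf:first value.
A minimal Python sketch, assuming triples are held as (subject, predicate,
object) tuples of strings; the name is_member is hypothetical and this says
nothing about how the test would be written in N3 rules:

```python
def is_member(triples, head, value):
    """Walk rdf:first/rdf:rest from `head`; True if `value` is an element."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    node, seen = head, set()
    while node in first and node not in seen:
        if first[node] == value:
            return True
        seen.add(node)  # guard against a malformed, cyclic list
        node = rest.get(node, "rdf:nil")
    return False

# The two list triples for ( "a" ), with rdf:nil for (), plus the :a :p link.
data = [
    (":a", ":p", "_:b1"),
    ("_:b1", "rdf:first", '"a"'),
    ("_:b1", "rdf:rest", "rdf:nil"),
]
print(is_member(data, "_:b1", '"a"'))
print(is_member(data, "_:b1", '"b"'))
```

The walk needs the rdf:first and rdf:rest triples to be visible as triples,
which is exactly what the experiments above suggest is not the case in cwm.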


Summary: We can align syntax without creating dependences on things that don't 
yet exist.

	Andy

> 
> There are advantages to the languages overlapping.
> 
> - Cuts the learning curve for people learning SPARQL and N3.
> - Allows code sharing for those implementing both languages
> - Allows data to be searched for to be pasted into a query.
> 
> N3 is a very suitable syntax for this:
> 
> - N3 has commonly used subsets Turtle and NTriples which are widely 
> deployed
> - N3 is a syntax which meets exactly the same goals as SPARQL, in being 
> concise and human-friendly representation of a graph with variables;
> - N3 has been used in the SPARQL document itself for readability for 
> the data.
> - N3 has evolved in response to community needs in the RDF Interest 
> Group and SW Interest Group.
> 
> N3 has come a long way since it started as a triples language:
> 
> - The comma and semicolon were added very early on as shortcuts making 
> both reading and writing easier when subject [and predicate] are 
> repeated.   While it is true that SPARQL's current individual triples 
> form is simpler, I strongly believe that the users would tire of it 
> once they become familiar with it.  (Similarly, the list construct for 
> collections makes it possible to actually use lists in practice, where 
> elaborations in terms of rdf:first and rdf:rest are impractically 
> cumbersome.)
> 
> - The grammar of the language is now defined in a context-free grammar 
> in RDF itself.
> 
> - Much of the actual nitty-gritty questions about the language involved 
> details of tokenizing, sets of characters allowed for identifiers, and 
> escaping.  The hassle comes from coordinating the XML, and N3 at the 
> NTriples, Turtle and N3 levels, and all the parsers involved.  To have 
> to add another randomly different language to this mix will make it 
> more difficult.
> 
> - The only significant change which has been proposed is to add syntax 
> for unordered sets similar to that for ordered collections.
> 
> 
> This is a strong suggestion.  I believe that the community will be best 
> served in making this change, and doing so as soon as possible.
> 
> It may be that at the same time we should plan for the standardization 
> of N3 itself, with specific view to keeping Ntriples Turtle SPARQL and 
> N3 full in sync.
> 
> Tim Berners-Lee
> 
> 
> ___________________________________________
> PS:
> 
> If this is done, the N3 syntax could be extended to include the 
> keyword-style which the group seems to prefer for SPARQL. Assuming an 
> N3 semantics for SPARQL exists, then the sparql keywords could be 
> deemed to add extra syntactic shortcuts to the language.
> 
> @keywords select, from, where, prefix, option.
> 
> prefix   soc: <whatever>.
> select   ?x, ?y, ?z
> from     <mydata.rdf>
> where
> 	?x    a soc:Person;
> 		phone:number   "+1 781 555 1212";
> 		fam:sister	?y.
>         ?y   phone:number ?z.
> 
> Making SPARQL a subset of N3 would allow a SPARQL query to be quoted in 
> an N3 document, which would allow it to be carried as a payload in more 
> complex things, which might for example provide extra metadata about a 
> query.
Received on Thursday, 2 December 2004 18:01:50 UTC