Re: SPARQL and Turtle Prefix Placement from Gregg Kellogg on 2012-06-14 (public-rdf-wg@w3.org from June 2012)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Thu, 14 Jun 2012 16:24:24 -0400
To: David Wood <david@3roundstones.com>
CC: Gavin Carothers <gavin@carothers.name>, Eric Prud'hommeaux <eric@w3.org>, W3C RDF WG <public-rdf-wg@w3.org>
Message-ID: <2FE0E96F-6FE6-4BBD-BB49-090115C91B94@greggkellogg.net>


On Jun 14, 2012, at 11:02 AM, David Wood wrote:

> Hi Gavin and Eric (and everyone else),
> 
> I just noticed that the placement of the PREFIX names differs in the SPARQL and Turtle grammars:  Turtle allows prefixes to be anywhere, but SPARQL requires them to be at the top.
> 
> The relevant section from the Turtle grammar [1] is:
> [[
> [1]	turtleDoc		::=	(statement)*
> [2]	statement		::=	(directive '.') | (triples '.')
> [3]	directive		::=	prefixID | base
> [4]	prefixID		::=	'@prefix' PNAME_NS IRIREF
> [5]	base			::=	'@base' IRIREF
> ]]
> ...and the relevant section from the SPARQL 1.1 grammar [2] is:
> [[
> [1]  	QueryUnit	::=  	Query
> [2]  	Query		::=  	Prologue ( SelectQuery | ConstructQuery | DescribeQuery | AskQuery ) BindingsClause
> ...
> [4]  	Prologue		::=  	( BaseDecl | PrefixDecl )*
> [5]  	BaseDecl		::=  	'BASE' IRI_REF
> [6]  	PrefixDecl	::=  	'PREFIX' PNAME_NS IRI_REF
> ]]
> 
> Should we align the two grammars so the prefixes must be at the top, as in SPARQL?  I tend to think so, in consideration of our ISSUE-1 [3].  The obvious downside would be a stricter requirement on Turtle authors to produce leading prefixes (which some in the wild don't currently).

I disagree: the needs of Turtle and SPARQL are different, even though the grammars are otherwise aligned. SPARQL documents tend to be much shorter and are suitable for in-memory parsing and serializing.

Turtle documents can be _much_ larger (I've seen test cases in the gigabyte range). A streaming serializer may not know which prefixes are necessary when it starts output, and may only discover this partway through processing. Allowing an @prefix to be defined at that point, mid-document, supports this case.
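
To make the streaming case concrete, here is a minimal sketch in Python. It is not taken from any real serializer: the class name, its methods, and the example namespaces are all hypothetical, and it assumes every term is an IRI whose local part is a valid Turtle local name. It writes each @prefix only when a triple first needs it, so directives end up interleaved with triples.

```python
import io


class StreamingTurtleWriter:
    """Toy streaming serializer (hypothetical): declares prefixes lazily."""

    def __init__(self, out, known_prefixes):
        self.out = out
        self.known = known_prefixes   # namespace IRI -> prefix label
        self.declared = set()         # namespaces already written to `out`

    def _term(self, iri):
        # Abbreviate the IRI if a known namespace matches; emit the
        # @prefix directive the first time that namespace is used.
        for ns, label in self.known.items():
            if iri.startswith(ns):
                if ns not in self.declared:
                    self.out.write(f"@prefix {label}: <{ns}> .\n")
                    self.declared.add(ns)
                return f"{label}:{iri[len(ns):]}"
        return f"<{iri}>"

    def write_triple(self, s, p, o):
        # Terms are computed first, so any new @prefix line is written
        # before the triple that depends on it.
        terms = [self._term(t) for t in (s, p, o)]
        self.out.write(" ".join(terms) + " .\n")


def demo():
    buf = io.StringIO()
    writer = StreamingTurtleWriter(buf, {
        "http://example.org/a#": "a",
        "http://example.org/b#": "b",
    })
    writer.write_triple("http://example.org/a#s", "http://example.org/a#p",
                        "http://example.org/a#o")
    writer.write_triple("http://example.org/a#s", "http://example.org/b#p",
                        "http://example.org/a#o")
    return buf.getvalue()


if __name__ == "__main__":
    print(demo(), end="")
```

In this sketch the declaration for `b:` appears only after the first triple has been written, which is valid under the current Turtle grammar but would be rejected if prefixes had to come first.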

It's also possible (modulo base IRI expansion) to process multiple Turtle documents by concatenating them, which in some systems can save parser startup overhead.

I've also seen usage (probably mostly from N3) where @base is repeated at different points to "reset" the base for relative IRI evaluation.
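
A rough illustration of that "reset" behaviour, again in Python and under stated assumptions: the function name and the event tuples are hypothetical stand-ins for what a parser would produce, and `urllib.parse.urljoin` is used only as an approximation of RFC 3986 reference resolution. Each @base replaces the base used for the relative IRIs that follow it.

```python
from urllib.parse import urljoin


def resolve_with_bases(events, initial_base):
    """Resolve relative IRIs against the most recent @base.

    `events` is a sequence of ("base", iri) or ("iri", reference)
    tuples in document order (a hypothetical parser event stream).
    """
    base = initial_base
    resolved = []
    for kind, value in events:
        if kind == "base":
            # A relative @base is itself resolved against the current base.
            base = urljoin(base, value)
        else:
            resolved.append(urljoin(base, value))
    return resolved


if __name__ == "__main__":
    events = [
        ("iri", "a"),
        ("base", "http://example.org/two/"),
        ("iri", "a"),
        ("base", "sub/"),
        ("iri", "a"),
    ]
    for iri in resolve_with_bases(events, "http://example.org/one/"):
        print(iri)
```

The same relative reference `a` resolves to three different IRIs here, depending on which @base precedes it.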

I think we need to keep the current definition.

Gregg

> The benefits would include easier reading and maintenance of the prefixes, as well as forced alignment with SPARQL's requirement in Section 19.5 that "A prefix declared with the PREFIX keyword may not be re-declared in the same query." [4]
> 
> Regards,
> Dave
> 
> [1] http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/index.html#sec-grammar-grammar
> [2] http://www.w3.org/TR/sparql11-query/#grammar
> [3] http://www.w3.org/2011/rdf-wg/track/issues/1
> [4] http://www.w3.org/TR/sparql11-query/#iriRefs
> 
> 
>

Received on Thursday, 14 June 2012 20:25:59 UTC