
Re: SPARQL and Turtle Prefix Placement

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Fri, 15 Jun 2012 20:13:32 +0100
Message-ID: <4FDB895C.5040402@epimorphics.com>
To: Eric Prud'hommeaux <eric@w3.org>
CC: Gavin Carothers <gavin@carothers.name>, public-rdf-wg@w3.org
I prefer Gavin's approach.

No BASE or PREFIX terminals; put '@base' and '@prefix' in the directives.

http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0353.html

(and it works in parser generators I have used)

	Andy

On 15/06/12 19:56, Eric Prud'hommeaux wrote:
> * Gavin Carothers<gavin@carothers.name>  [2012-06-15 10:44-0700]
>> On Fri, Jun 15, 2012 at 9:48 AM, Eric Prud'hommeaux<eric@w3.org>  wrote:
>>> +[20]   LANGTAG         ::=     BASE | PREFIX | '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
>>
>>
>> No, reverting back to the PREFIX BASE terminals is not acceptable.
>> This was already the subject of review by Andy and Peter.
>>
>> Please see thread
>> http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0347.html
>> for discussion on the change from PREFIX BASE to a simpler LANGTAG.
>
> But that thread didn't terminate in consensus.
> Andy's point
> [[
>      (to the casual reader: BASE is '@base' and PREFIX is '@prefix')
>
>      Which is ambiguous - as it says:
>
>      LANGTAG ::= ('@base' | '@prefix' | '@' ([a-zA-Z])+ ('-' ([a-zA-Z0-9])+)*)
>
>      so the string "@base" matches two ways.
>
>      But even if sorted out ... it means a tokenizer may well generate the
>      token LANGTAG ... and then:
>
>      [5]		base		::= 	BASE IRIREF
>
>      does not match as the token is LANGTAG, not BASE.  Oops.
> ]]
>
> is addressed by moving the "BASE | PREFIX | " from LANGTAG to RDFLiteral:
>
>    RDFLiteral ::= String (BASE | PREFIX | LANGTAG | '^^' iri)?
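The ambiguity Andy describes can be reproduced with a small sketch. This is an illustration only, not code from the thread: the helper names `candidates` and `longest_match` are hypothetical, and the keyword terminals are modelled as plain string literals. It lists every terminal that matches at the start of the input and then applies the longest-match rule with no further tie-breaking.

```python
import re

# LANGTAG as written in the grammar: '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
LANGTAG = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")

# Keyword terminals, modelled here as plain literals (an assumption of this sketch).
KEYWORDS = {"BASE": "@base", "PREFIX": "@prefix"}


def candidates(text):
    """Return (terminal name, match length) for every terminal matching at the start of text."""
    found = []
    for name, literal in KEYWORDS.items():
        if text.startswith(literal):
            found.append((name, len(literal)))
    m = LANGTAG.match(text)
    if m:
        found.append(("LANGTAG", m.end()))
    return found


def longest_match(text):
    """Apply only the longest-match rule; return every terminal tied for the longest match."""
    found = candidates(text)
    if not found:
        return []
    best = max(length for _, length in found)
    return sorted(name for name, length in found if length == best)


print(longest_match("@base <http://example/>"))
print(longest_match("@en-GB"))
```

For the input `@base <http://example/>`, both BASE and LANGTAG match five characters, so longest match alone cannot pick one. A tokenizer that happens to emit LANGTAG leaves `base ::= BASE IRIREF` with nothing to match.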
>
> Turtle doesn't talk about parsing rules (perhaps it should); SPARQL's note 3 says [[
> When tokenizing the input and choosing grammar rules, the longest match is chosen.
> ]] —<http://www.w3.org/2009/sparql/docs/query-1.1/rq25.xml#sparqlGrammar>
>
> This doesn't establish a relative order between the terminals implied by quoted strings in the productions and explicit terminals like "LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*". After failing a few tests, people would likely add an order so that "@base" and "@prefix" always parse as implicit terminals and never as language tags. We can be much more explicit if we use the above production for RDFLiteral. An aesthetic option would be to break it up for semantic clarity:
>
>    RDFLiteral  ::= String (LanguageTag | '^^' iri)?
>    LanguageTag ::= BASE | PREFIX | LANGTAG
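A minimal sketch of how a hand-written parser might apply these two productions, assuming a tokenizer that always gives "@base" and "@prefix" their keyword token types. The function names, the simplified string and IRI patterns, and the omission of the '^^' datatype branch are all assumptions of this sketch, not anything taken from the Turtle draft.

```python
import re

# Simplified terminals; the real STRING and IRIREF productions are richer.
TOKEN_RE = re.compile(
    r'(?P<WS>\s+)'
    r'|(?P<IRIREF><[^<>\s]*>)'
    r'|(?P<STRING>"[^"\\\n]*")'
    r'|(?P<AT>@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*)'
)

# Tie-break: a lexeme that is exactly a keyword gets the keyword token type.
KEYWORDS = {"@base": "BASE", "@prefix": "PREFIX"}


def tokenize(text):
    """Longest match for '@...' lexemes, then retag exact keyword matches."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "WS":
            continue
        if kind == "AT":
            kind = KEYWORDS.get(lexeme, "LANGTAG")
        tokens.append((kind, lexeme))
    return tokens


def parse_rdf_literal(tokens):
    """RDFLiteral ::= String LanguageTag?  with  LanguageTag ::= BASE | PREFIX | LANGTAG."""
    kind, lexeme = tokens[0]
    if kind != "STRING":
        raise SyntaxError("expected a string")
    value = lexeme[1:-1]
    if len(tokens) > 1 and tokens[1][0] in ("BASE", "PREFIX", "LANGTAG"):
        return value, tokens[1][1][1:]  # drop the leading '@'
    return value, None


def parse_base(tokens):
    """base ::= BASE IRIREF; returns the IRI without its angle brackets."""
    if [kind for kind, _ in tokens[:2]] != ["BASE", "IRIREF"]:
        raise SyntaxError("expected BASE IRIREF")
    return tokens[1][1][1:-1]


print(parse_rdf_literal(tokenize('"o5"@base')))
print(parse_base(tokenize('@base <http://example/>')))
```

In this arrangement the tokenizer never has to guess from context: "@base" is always a BASE token, and the grammar decides whether that token is a directive keyword or a language tag.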
>
> I've committed that for everyone's viewing pleasure.
>
> I also found some errors in STRING_LITERAL ("s vs. 's reversed, so 's were not allowed within a "" string). I'm now validating with this text (note the long quotes):
> [[
> []<p>  <o1>, "o2", [<p2>  _:o3 ] ;
>     <p3>  (<o4>  "o5"@base "o5"@prefix _:o6 [<p4>  <o8>  ] ),<o9>  .
> [<p5>  """o10
> ""line"" '''2'''""", '''o11
> ''line'' """3"""'''^^<integer>  ;
>    <p6>  12, +12, -12,                   # [+-]? [0-9]+
>         13.0, +13.0, -13.0,             # [+-]? [0-9]* '.' [0-9]+ with *=2
>         .0, +.0, -.0,                   # [+-]? [0-9]* '.' [0-9]+ with *=0
>         14.E0, +14.E0, -14.E0,          # [+-]? [0-9]+ '.' [0-9]* EXPONENT with *=0
>         14.0E0, +14.0E0,                # [+-]? [0-9]+ '.' [0-9]* EXPONENT with *=1
>         .14E2, +.14E2, -.14E2, -14.0E0, # [+-]? '.' [0-9]+ EXPONENT
>         1.4E1, +1.4E1, -1.4E1,          # [+-]? [0-9]+ EXPONENT)
>         14e0, 14e+0, 14e-0              # [eE] [+-]? [0-9]+
> ].
> ]]
>
>
>> Also please make sure updates to the grammar are also checked into the
>> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/turtle.bnf not
>> only the HTML.
>
> will do.
Received on Friday, 15 June 2012 19:14:04 GMT
