- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Fri, 15 Jun 2012 20:35:14 +0100
- To: Eric Prud'hommeaux <eric@w3.org>
- CC: Gavin Carothers <gavin@carothers.name>, public-rdf-wg@w3.org
Eric:

The problem with your way is that

[22] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*

includes "@base" and "@prefix" already.

Andy

On 15/06/12 20:13, Andy Seaborne wrote:
> I prefer Gavin's approach.
>
> No BASE PREFIX; Put '@base' and '@prefix' in the directives.
>
> http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0353.html
>
> (and it works in parser generators I have used)
>
> Andy
>
> On 15/06/12 19:56, Eric Prud'hommeaux wrote:
>> * Gavin Carothers<gavin@carothers.name> [2012-06-15 10:44-0700]
>>> On Fri, Jun 15, 2012 at 9:48 AM, Eric Prud'hommeaux<eric@w3.org> wrote:
>>>> +[20] LANGTAG ::= BASE | PREFIX | '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
>>>
>>>
>>> No, reverting back to the PREFIX BASE terminals is not acceptable.
>>> This was already the subject of review by Andy and Peter.
>>>
>>> Please see thread
>>> http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0347.html
>>> for discussion on the change from PREFIX BASE to a simpler LANGTAG.
>>
>> But that thread didn't terminate in consensus.
>> Andy's point
>> [[
>> (to the casual reader : BASE is '@base' and PREFIX is '@prefix'
>>
>> Which is ambiguous - as it says:
>>
>> LANGTAG ::= ('@base' | '@prefix' | '@' ([a-zA-Z])+ ('-' ([a-zA-Z0-9])+)
>>
>> so the string "@base" matches two ways.
>>
>> But even if sorted out ... it means a tokenizer may well generate the
>> token LANGTAG ... and then:
>>
>> [5] base ::= BASE IRIREF
>>
>> does not match as the token is LANGTAG, not BASE. Oops.
>> ]]
>>
>> is addressed by moving the "BASE | PREFIX | " from LANGTAG to RDFLiteral:
>>
>> RDFLiteral ::= String (BASE | PREFIX | LANGTAG | '^^' iri)?
>>
>> Turtle doesn't talk about parsing rules (perhaps it should); SPARQL's
>> note 3 says [[
>> When tokenizing the input and choosing grammar rules, the longest
>> match is chosen.
>> ]] -- <http://www.w3.org/2009/sparql/docs/query-1.1/rq25.xml#sparqlGrammar>
>>
>> This doesn't establish a relative order between terminals implied by
>> ""'d strings in the productions vs. explicit terminals like "LANGTAG
>> ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*". After failing a few tests,
>> people would likely add an order to make "@base" and "@prefix" parse
>> as implicit terminals and never parse them as language tags. We can be
>> much more explicit if we use the above production for RDFLiteral. An
>> aesthetic option would be to break it up for semantic clarity:
>>
>> RDFLiteral ::= String (LanguageTag | '^^' iri)?
>> LanguageTag ::= BASE | PREFIX | LANGTAG
>>
>> I've committed that for everyone's viewing pleasure.
>>
>> I also found some errors in STRING_LITERAL ("s vs. 's reversed, so 's
>> not allowed within "" string). I'm now validating with this text (note
>> the long quotes):
>> [[
>> []<p> <o1>, "o2", [<p2> _:o3 ] ;
>> <p3> (<o4> "o5"@base "o5"@prefix _:o6 [<p4> <o8> ] ),<o9> .
>> [<p5> """o10
>> ""line"" '''2'''""", '''o11
>> ''line'' """3"""'''^^<integer> ;
>> <p6> 12, +12, -12, # [+-]? [0-9]+
>> 13.0, +13.0, -13.0, # [+-]? [0-9]* '.' [0-9]+ with *=2
>> .0, +.0, -.0, # [+-]? [0-9]* '.' [0-9]+ with *=0
>> 14.E0, +14.E0, -14.E0, # [+-]? [0-9]+ '.' [0-9]* EXPONENT with *=0
>> 14.0E0, +14.0E0, # [+-]? [0-9]+ '.' [0-9]* EXPONENT with *=1
>> .14E2, +.14E2, -.14E2, -14.0E0, # [+-]? '.' [0-9]+ EXPONENT
>> 1.4E1, +1.4E1, -1.4E1, # [+-]? [0-9]+ EXPONENT)
>> 14e0, 14e+0, 14e-0 # [eE] [+-]? [0-9]+
>> ].
>> [[
>>
>>
>>> Also please make sure updates to the grammar are also checked into the
>>> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/turtle.bnf not
>>> only the HTML.
>>
>> will do.
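
For readers who want to see the overlap concretely, here is a minimal Python sketch. It assumes the LANGTAG terminal exactly as quoted in the thread and treats '@base' and '@prefix' as separate keyword terminals; the names KEYWORDS and candidates are hypothetical and do not come from any Turtle parser. It only reports which terminals match a whole lexeme, which is the tie that a longest-match rule alone cannot break.

```python
import re

# LANGTAG as quoted in the thread: '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
LANGTAG = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")

# Hypothetical keyword terminals for the two directives (assumption).
KEYWORDS = {"@base": "BASE", "@prefix": "PREFIX"}


def candidates(lexeme):
    """Return the names of all terminals that match the whole lexeme."""
    names = []
    if lexeme in KEYWORDS:
        names.append(KEYWORDS[lexeme])
    if LANGTAG.fullmatch(lexeme):
        names.append("LANGTAG")
    return names


for lexeme in ("@base", "@prefix", "@en", "@en-GB"):
    print(lexeme, candidates(lexeme))
```

Both "@base" and "@prefix" come back with two candidate terminals of equal length, so a tokenizer needs either an explicit priority between them or a grammar that accepts either token after a string, as in the RDFLiteral production proposed above.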
Received on Friday, 15 June 2012 19:35:44 UTC