- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Fri, 15 Jun 2012 16:29:23 -0400
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: Gavin Carothers <gavin@carothers.name>, public-rdf-wg@w3.org
* Andy Seaborne <andy.seaborne@epimorphics.com> [2012-06-15 20:35+0100]
> Eric:
>
> The problem with your way is that
>
> [22] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
>
> includes "@base" and "@prefix" already

In order to match [[ "@base" IRIREF "." ]], I need to create some
lexical token. Suppose I implement it like so:
  __BASE= '@' 'b' 'a' 's' 'e'
and I put that before LANGTAG in a lex file. Then I'll match __BASE
instead of LANGTAG, and I'll never parse "a"@base . If I re-order
them, the parser will never see a __BASE token.

> Andy
>
> On 15/06/12 20:13, Andy Seaborne wrote:
> >I prefer Gavin's approach.
> >
> >No BASE PREFIX; Put '@base' and '@prefix' in the directives.
> >
> >http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0353.html
> >
> >(and it works in parser generators I have used)
> >
> >Andy
> >
> >On 15/06/12 19:56, Eric Prud'hommeaux wrote:
> >>* Gavin Carothers<gavin@carothers.name> [2012-06-15 10:44-0700]
> >>>On Fri, Jun 15, 2012 at 9:48 AM, Eric Prud'hommeaux<eric@w3.org> wrote:
> >>>>+[20] LANGTAG ::= BASE | PREFIX | '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
> >>>
> >>>
> >>>No, reverting back to the PREFIX BASE terminals is not acceptable.
> >>>This was already the subject of review by Andy and Peter.
> >>>
> >>>Please see thread
> >>>http://lists.w3.org/Archives/Public/public-rdf-wg/2012May/0347.html
> >>>for discussion on the change from PREFIX BASE to a simpler LANGTAG.
> >>
> >>But that thread didn't terminate in consensus.
> >>Andy's point
> >>[[
> >>(to the casual reader : BASE is '@base' and PREFIX is '@prefix'
> >>
> >>Which is ambiguous - as it says:
> >>
> >>LANGTAG ::= ('@base' | '@prefix' | '@' ([a-zA-Z])+ ('-' ([a-zA-Z0-9])+)
> >>
> >>so the string "@base" matches two ways.
> >>
> >>But even if sorted out ... it means a tokenizer may well generate the
> >>token LANGTAG ... and then:
> >>
> >>[5] base ::= BASE IRIREF
> >>
> >>does not match as the token is LANGTAG, not BASE. Oops.
> >>]]
> >>
> >>is addressed by moving the "BASE | PREFIX | " from LANGTAG to RDFLiteral:
> >>
> >>RDFLiteral ::= String (BASE | PREFIX | LANGTAG | '^^' iri)?
> >>
> >>Turtle doesn't talk about parsing rules (perhaps it should); SPARQL's
> >>note 3 says [[
> >>When tokenizing the input and choosing grammar rules, the longest
> >>match is chosen.
> >>]] -- <http://www.w3.org/2009/sparql/docs/query-1.1/rq25.xml#sparqlGrammar>
> >>
> >>This doesn't establish a relative order between terminals implied by
> >>""'d strings in the productions vs. explicit terminals like "LANGTAG
> >>::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*". After failing a few tests,
> >>people would likely add an order to make "@base" and "@prefix" parse
> >>as implicit terminals and never parse them as language tags. We can be
> >>much more explicit if we use the above production for RDFLiteral. An
> >>aesthetic option would be to break it up for semantic clarity:
> >>
> >>RDFLiteral ::= String (LanguageTag | '^^' iri)?
> >>LanguageTag ::= BASE | PREFIX | LANGTAG
> >>
> >>I've committed that for everyone's viewing pleasure.
> >>
> >>I also found some errors in STRING_LITERAL ("s vs. 's reversed, so 's
> >>not allowed within "" string). I'm now validating with this text (note
> >>the long quotes):
> >>[[
> >>[]<p> <o1>, "o2", [<p2> _:o3 ] ;
> >><p3> (<o4> "o5"@base "o5"@prefix _:o6 [<p4> <o8> ] ),<o9> .
> >>[<p5> """o10
> >>""line"" '''2'''""", '''o11
> >>''line'' """3"""'''^^<integer> ;
> >><p6> 12, +12, -12, # [+-]? [0-9]+
> >>13.0, +13.0, -13.0, # [+-]? [0-9]* '.' [0-9]+ with *=2
> >>.0, +.0, -.0, # [+-]? [0-9]* '.' [0-9]+ with *=0
> >>14.E0, +14.E0, -14.E0, # [+-]? [0-9]+ '.' [0-9]* EXPONENT with *=0
> >>14.0E0, +14.0E0, # [+-]? [0-9]+ '.' [0-9]* EXPONENT with *=1
> >>.14E2, +.14E2, -.14E2, -14.0E0, # [+-]? '.' [0-9]+ EXPONENT
> >>1.4E1, +1.4E1, -1.4E1, # [+-]? [0-9]+ EXPONENT)
> >>14e0, 14e+0, 14e-0 # [eE] [+-]? [0-9]+
> >>].
> >>]]
> >>
> >>
> >>>Also please make sure updates to the grammar are also checked into the
> >>>http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/turtle.bnf not
> >>>only the HTML.
> >>
> >>will do.

--
-ericP
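
For anyone who wants to reproduce the token-ordering problem discussed in this message, here is a minimal Python sketch. It is not the Turtle grammar or any real lexer generator. It assumes lex-style behaviour (longest match wins, equal-length ties go to the earlier rule), uses a simplified STRING pattern, and the names tokenize, is_literal, keywords_first and langtag_first are made up for illustration. Only the LANGTAG pattern and the LanguageTag ::= BASE | PREFIX | LANGTAG idea come from the thread.

```python
import re

# Hypothetical terminals; LANGTAG follows production [22] quoted in the thread.
BASE = ("BASE", re.compile(r"@base"))
PREFIX = ("PREFIX", re.compile(r"@prefix"))
LANGTAG = ("LANGTAG", re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*"))
STRING = ("STRING", re.compile(r'"[^"\\\n]*"'))  # simplified, not Turtle's full string grammar


def tokenize(text, rules):
    """Longest match wins; equal-length ties go to the earlier rule (lex-style)."""
    pos, tokens = 0, []
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best = None
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if m and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            raise ValueError(f"no token at offset {pos}")
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    return tokens


def is_literal(tokens, tag_tokens):
    """Accept String followed by one token whose name is in tag_tokens."""
    return (len(tokens) == 2 and tokens[0][0] == "STRING"
            and tokens[1][0] in tag_tokens)


keywords_first = [STRING, BASE, PREFIX, LANGTAG]
langtag_first = [STRING, LANGTAG, BASE, PREFIX]

for label, rules in (("keywords first", keywords_first),
                     ("LANGTAG first", langtag_first)):
    print(label, tokenize('"a"@base', rules))
```

With the keyword rules first, "@base" comes out as BASE, so a literal rule that only accepts LANGTAG rejects "a"@base. With LANGTAG first, BASE is never produced. Passing {"BASE", "PREFIX", "LANGTAG"} as tag_tokens to is_literal mirrors the LanguageTag production and accepts the literal under the keywords-first order.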
Received on Friday, 15 June 2012 20:29:54 UTC