- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Sat, 16 Jun 2012 13:59:22 -0400
- To: Gavin Carothers <gavin@carothers.name>
- Cc: Gregg Kellogg <gregg@kellogg-assoc.com>, RDF-WG WG <public-rdf-wg@w3.org>
* Gavin Carothers <gavin@carothers.name> [2012-06-16 09:37-0700]
> Now on list to elicit more feedback.
>
> On Sat, Jun 16, 2012 at 7:20 AM, Eric Prud'hommeaux <eric@w3.org> wrote:
> > typo, perhaps:
> > -[12] object ::= iri | blank | predicateObjectList | literal
> > +[12] object ::= iri | blank | blankNodePropertyList | literal
> >
> > string misalignment:
> > -[155s] STRING_LITERAL1 ::= '"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"'
> > -[156s] STRING_LITERAL2 ::= "'" ([^#x27#x5C#xA#xD] | ECHAR | UCHAR)* "'"
> > -[157s] STRING_LITERAL_LONG1 ::= "'''" (("'" | "''")? [^'\] | ECHAR | UCHAR)* "'''"
> > -[158s] STRING_LITERAL_LONG2 ::= '"""' (('"' | '""')? [^"\] | ECHAR | UCHAR)* '"""'
>
> Okay, now I'm just going crazy. That's the way they were BEFORE when
> someone said they were reversed.
There have been three changes of late:
1 align the LITERAL1/2 with single/double quote in SPARQL.
2 make sure that the excluded characters #x22 and #x27 correspond to the double and single quote respectively.
3 preserve a grouping around "[^"\] | ECHAR | UCHAR"
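
For illustration only, here is a rough Python sketch of the four string productions with those three changes applied, written as regular expressions. ECHAR and UCHAR are not spelled out in this thread, so their definitions here are assumed from the Turtle/SPARQL grammars, and the Python names simply mirror the production names.

```python
import re

# Assumed definitions (not quoted in this thread): escape sequences as in the
# Turtle/SPARQL grammars.
ECHAR = r"\\[tbnrf\\\"']"
UCHAR = r"(?:\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})"

# [155s] STRING_LITERAL1: single-quoted, so the excluded character is #x27 (')
STRING_LITERAL1 = re.compile(rf"'(?:[^\x27\x5C\x0A\x0D]|{ECHAR}|{UCHAR})*'")

# [156s] STRING_LITERAL2: double-quoted, so the excluded character is #x22 (")
STRING_LITERAL2 = re.compile(rf'"(?:[^\x22\x5C\x0A\x0D]|{ECHAR}|{UCHAR})*"')

# [157s] STRING_LITERAL_LONG1: up to two quotes may precede each grouped
# "[^'\] | ECHAR | UCHAR" unit, which is why the grouping matters.
STRING_LITERAL_LONG1 = re.compile(
    rf"'''(?:(?:'|'')?(?:[^'\\]|{ECHAR}|{UCHAR}))*'''"
)

# [158s] STRING_LITERAL_LONG2: the same shape with double quotes.
STRING_LITERAL_LONG2 = re.compile(
    rf'"""(?:(?:"|"")?(?:[^"\\]|{ECHAR}|{UCHAR}))*"""'
)
```

With these patterns, a single-quoted string may contain a bare double quote and vice versa, while a long string may contain one or two of its own quote characters in a row as long as something else follows them.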
> > +[155s] STRING_LITERAL1 ::= "'" ([^#x27#x5C#xA#xD] | ECHAR | UCHAR)* "'"
> > +[156s] STRING_LITERAL2 ::= '"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"'
> > +[157s] STRING_LITERAL_LONG1 ::= "'''" (("'" | "''")? ([^'\] | ECHAR | UCHAR))* "'''"
> > +[158s] STRING_LITERAL_LONG2 ::= '"""' (('"' | '""')? ([^"\] | ECHAR | UCHAR))* '"""'
>
> No, that can't be right. Those aren't what is in there now. The
> current grammar has [23] [24] numbered productions. Check
> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/turtle.bnf
> before I go totally crazy (perhaps we shouldn't call them 1,2 and make
> clearer what's going on here as you, me, Andy, and Gregg Kellogg all
> seem to have at one point or another confused this).
>
> >
> > two \s in IRIREF:
> > -[138s] IRIREF ::= '<' ([^#x00-#x20<>\"{}|^`\] | UCHAR)* '>'
> > +[138s] IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>' # no UCHAR in SPARQL
>
> Changed!
>
> >
> > simplification and whitespace:
> > -[24] DECIMAL ::= [+-]? ([0-9]* '.' [0-9]+)
> > +[24] DECIMAL ::= [+-]? [0-9]* '.' [0-9]+
> > -[168s] PN_LOCAL ::= (PN_CHARS_U | [0-9] | PLX) ((PN_CHARS | '.' | PLX)* PN_CHARS | PLX)?
> > +[168s] PN_LOCAL ::= (PN_CHARS_U | [0-9] | PLX) ((PN_CHARS | '.' | PLX)* (PN_CHARS | PLX))?
>
> These are hopeless and the result of the method being used in bnf2html.
> Unless you have a VERY VERY strong opinion, I'm going to leave these
> alone, as last time I tried to fix them I broke most of the other
> nesting/precedence rules. They are correct but have slightly too many
> ()s.
fair enough. we can tweak them by hand for PR and REC.
> > the usual prefix/base thing:
> > -[4] prefixID ::= '@prefix' PNAME_NS IRIREF
> > +[4] prefixID ::= PREFIX PNAME_NS IRIREF
> > -[5] base ::= '@base' IRIREF
> > +[5] base ::= BASE IRIREF
> > -[128s] RDFLiteral ::= String (LANGTAG | '^^' iri)?
> > +[17] RDFLiteral ::= String (LanguageTag | '^^' iri)?
> > +[18] LanguageTag ::= BASE | PREFIX | LANGTAG
> > +[20] BASE ::= '@base'
> > +[21] PREFIX ::= '@prefix'
>
> Ugh, I'm not sure this is any better. This clearly doesn't solve the
> issue as this is what we had before and what Gregg used to create the
> RDF.rb turtle parser, which didn't work correctly :( Also the same as
> what was used to create Raptor which again has the same issue. Need to
> be clearer somehow on what should happen with "literal"@base and
> "literal"@prefix.
I believe that
[18] LanguageTag ::= BASE | PREFIX | LANGTAG
makes it explicit that the tokens for BASE and PREFIX can be interpreted as a language tag, if that is indeed our intention.
<http://w3.org/brief/MjY2> shows a standard lexer returning tokens for "@base" and "@prefix" and the parser accepting them as directives and as language tags.
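
To make the idea concrete, here is a minimal Python sketch, written independently of the linked example. It is an assumption-laden toy, not a Turtle parser: the token patterns are heavily simplified, the names TOKEN_SPEC, tokenize and parse_statement are hypothetical, and http://example.org/ is only a placeholder IRI. It shows the lexer always returning BASE or PREFIX for "@base" and "@prefix", and the parser deciding from context whether the token is a directive or a language tag.

```python
import re

# Hypothetical, simplified token table. BASE and PREFIX are tried before
# LANGTAG; the lookahead keeps longer tags such as "@basement" as LANGTAG.
TOKEN_SPEC = [
    ("BASE", r"@base(?![a-zA-Z0-9-])"),
    ("PREFIX", r"@prefix(?![a-zA-Z0-9-])"),
    ("LANGTAG", r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*"),
    ("IRIREF", r"<[^\x00-\x20<>\"{}|^`\\]*>"),
    ("PNAME_NS", r"[A-Za-z][A-Za-z0-9]*:|:"),  # far narrower than the real rule
    ("STRING", r'"[^"\\\n\r]*"'),              # no escapes in this sketch
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return (kind, lexeme) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

# [18] LanguageTag ::= BASE | PREFIX | LANGTAG
LANGUAGE_TAG = {"BASE", "PREFIX", "LANGTAG"}

def parse_statement(tokens):
    """Recognise a prefixID, a base, or a language-tagged literal."""
    kinds = [kind for kind, _ in tokens]
    if kinds[:3] == ["PREFIX", "PNAME_NS", "IRIREF"]:
        return ("prefixID", tokens[1][1], tokens[2][1])
    if kinds[:2] == ["BASE", "IRIREF"]:
        return ("base", tokens[1][1])
    if len(kinds) >= 2 and kinds[0] == "STRING" and kinds[1] in LANGUAGE_TAG:
        # Drop the leading "@" to get the language tag itself.
        return ("RDFLiteral", tokens[0][1], tokens[1][1][1:])
    raise SyntaxError("input not covered by this sketch")
```

Under these assumptions, "literal"@base lexes as STRING followed by BASE and still parses as a literal with language tag "base", while @base followed by an IRIREF parses as the directive.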
--
-ericP
Received on Saturday, 16 June 2012 17:59:53 UTC