Re: SPARQL and Turtle Prefix Placement

* Andy Seaborne <andy.seaborne@epimorphics.com> [2012-06-15 09:25+0100]
> 
> >btw, i've been updating the grammar to deal with some LL(1).LALR(1) and other conflicts. should be synched soon.
> 
> As this is very close to LC, could you point out the changes being made?

Indeed. There are three kinds of changes:
  1 get rid of extra ()s, à la "(statement)*"
  2 make explicit that Turtle parses '"ab"@base' as a literal with a language tag.
  3 fix LALR(1)/LL(1) conflict in
    [6] triples ::= subject predicateObjectList | blankNodePropertyList predicateObjectList?
    by moving blankNodePropertyList from [14] blank to [12] object.

3 is the biggest change, necessitated by the addition of " | blankNodePropertyList predicateObjectList?" to [6] triples. I believe I properly chased down the grammar combos and tested them with <http://w3.org/brief/MjY0>, but I'd like a second pair of eyes.
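
For anyone who wants to check change 3 by hand, here is a rough sketch of the conflict in Python. It computes FIRST sets for just the handful of productions involved, once with the old [14] blank and once with the new one. The helper names (first, triples_conflict) are mine, not from any tool, and the sketch assumes that no alternative begins with a nullable symbol and that ANON arrives from the lexer as a single token.

```python
# Productions relevant to [6] triples; any symbol without an entry is
# treated as a terminal (only the first symbol of each alternative is used).
COMMON = {
    "subject": [["iri"], ["blank"]],
    "iri": [["IRIREF"], ["PrefixedName"]],
    "PrefixedName": [["PNAME_LN"], ["PNAME_NS"]],
    "BlankNode": [["BLANK_NODE_LABEL"], ["ANON"]],
    "blankNodePropertyList": [["[", "predicateObjectList", "]"]],
    "collection": [["(", "object*", ")"]],
}

# [14] blank before and after the change.
OLD = dict(COMMON, blank=[["BlankNode"], ["blankNodePropertyList"], ["collection"]])
NEW = dict(COMMON, blank=[["BlankNode"], ["collection"]])


def first(symbol, grammar, seen=frozenset()):
    """FIRST set of a symbol, assuming no alternative starts with a nullable symbol."""
    if symbol not in grammar:
        return {symbol}          # terminal
    if symbol in seen:
        return set()             # guard against left recursion
    result = set()
    for alternative in grammar[symbol]:
        result |= first(alternative[0], grammar, seen | {symbol})
    return result


def triples_conflict(grammar):
    """Tokens that could start either alternative of [6] triples."""
    return first("subject", grammar) & first("blankNodePropertyList", grammar)


print(triples_conflict(OLD))   # the shared lookahead token under the old grammar
print(triples_conflict(NEW))   # empty once blankNodePropertyList leaves blank
```

With the old [14], '[' can start both alternatives of [6], so one token of lookahead can't pick between them; with the new [14] the two sets are disjoint.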

[[
-[1]     turtleDoc              ::= (statement)*
+[1]     turtleDoc              ::= statement*
-[2]     statement              ::= (directive '.') | (triples '.')
+[2]     statement              ::= directive '.' | triples '.'
 [3]     directive              ::= prefixID | base
-[4]     prefixID               ::= '@prefix' PNAME_NS IRIREF
+[4]     prefixID               ::= PREFIX PNAME_NS IRIREF
-[5]     base                   ::= '@base' IRIREF
+[5]     base                   ::= BASE IRIREF
-[6]     triples                ::= (subject predicateObjectList) | (blankNodePropertyList (predicateObjectList)?)
+[6]     triples                ::= subject predicateObjectList | blankNodePropertyList predicateObjectList?
 [7]     predicateObjectList    ::= verb objectList (';' verb objectList)* (';')?
 [8]     objectList             ::= object (',' object)*
 [9]     verb                   ::= predicate | 'a'
 [10]    subject                ::= iri | blank
 [11]    predicate              ::= iri
-[12]    object                 ::= iri | blank | literal
+[12]    object                 ::= iri | blank | blankNodePropertyList | literal
 [13]    literal                ::= RDFLiteral | NumericLiteral | BooleanLiteral
-[14]    blank                  ::= BlankNode | blankNodePropertyList | collection
+[14]    blank                  ::= BlankNode | collection
 [15]    blankNodePropertyList  ::= '[' predicateObjectList ']'
-[16]    collection             ::= '(' (object)* ')'
+[16]    collection             ::= '(' object* ')'
-[60s]   RDFLiteral             ::= String (LANGTAG | ('^^' iri))?
+[60s]   RDFLiteral             ::= String (LANGTAG | '^^' iri)?
 [61s]   NumericLiteral         ::= NumericLiteralUnsigned | NumericLiteralPositive | NumericLiteralNegative
 [62s]   NumericLiteralUnsigned ::= INTEGER | DECIMAL | DOUBLE
 [63s]   NumericLiteralPositive ::= INTEGER_POSITIVE | DECIMAL_POSITIVE | DOUBLE_POSITIVE
@@ -24,24 +24,26 @@
 [67s]   iri                    ::= IRIREF | PrefixedName
 [68s]   PrefixedName           ::= PNAME_LN | PNAME_NS
 [69s]   BlankNode              ::= BLANK_NODE_LABEL | ANON
+[17]    BASE                   ::= '@base'
+[18]    PREFIX                 ::= '@prefix'
 [132s]  IRIREF                 ::= '<' ([^#x00-#x20<>\"{}|^`\\] | UCHAR)* '>'
-[133s]  PNAME_NS               ::= (PN_PREFIX)? ':'
+[133s]  PNAME_NS               ::= PN_PREFIX? ':'
 [134s]  PNAME_LN               ::= PNAME_NS PN_LOCAL
 [135s]  BLANK_NODE_LABEL       ::= '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)?
-[19]    LANGTAG                ::= '@' ([a-zA-Z])+ ('-' ([a-zA-Z0-9])+)*
+[19]    LANGTAG                ::= BASE | PREFIX | '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
-[20]    INTEGER                ::= ([+-])? ([0-9])+
+[20]    INTEGER                ::= [+-]? [0-9]+
-[21]    DECIMAL                ::= ([+-])? (([0-9])* '.' ([0-9])+)
+[21]    DECIMAL                ::= [+-]? ([0-9]* '.' [0-9]+)
-[22]    DOUBLE                 ::= ([+-])? ((([0-9])+ '.' ([0-9])* EXPONENT) | ('.' ([0-9])+ EXPONENT) | (([0-9])+ EXPONENT))
+[22]    DOUBLE                 ::= [+-]? (([0-9]+ '.' [0-9]* EXPONENT) | ('.' [0-9]+ EXPONENT) | ([0-9]+ EXPONENT))
-[148s]  EXPONENT               ::= [eE] ([+-])? ([0-9])+
+[148s]  EXPONENT               ::= [eE] [+-]? [0-9]+
 [149s]  STRING_LITERAL1        ::= '"' ([^#x27#x5C#xA#xD] | ECHAR | UCHAR)* '"'
 [150s]  STRING_LITERAL2        ::= "'" ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* "'"
 [151s]  STRING_LITERAL_LONG1   ::= "'''" (("'" | "''")? ([^'\] | ECHAR | UCHAR))* "'''"
 [152s]  STRING_LITERAL_LONG2   ::= '"""' (('"' | '""')? ([^"\] | ECHAR | UCHAR))* '"""'
-[19]    UCHAR                  ::= ('\u' HEX HEX HEX HEX) | ('\U' HEX HEX HEX HEX HEX HEX HEX HEX)
+[23]    UCHAR                  ::= ('\u' HEX HEX HEX HEX) | ('\U' HEX HEX HEX HEX HEX HEX HEX HEX)
 [153s]  ECHAR                  ::= '\' [tbnrf\"']
-[154s]  NIL                    ::= '(' (WS)* ')'
+[154s]  NIL                    ::= '(' WS* ')'
 [155s]  WS                     ::= #x20 | #x9 | #xD | #xA
-[156s]  ANON                   ::= '[' (WS)* ']'
+[156s]  ANON                   ::= '[' WS* ']'
 [157s]  PN_CHARS_BASE          ::= [A-Z] | [a-z] | [#00C0-#00D6] | [#00D8-#00F6] | [#00F8-#02FF] | [#0370-#037D] | [#037F-#1FFF] | [#200C-#200D] | [#2070-#218F] | [#2C00-#2FEF] | [#3001-#D7FF] | [#F900-#FDCF] | [#FDF0-#FFFD] | [#10000-#EFFFF]
 [158s]  PN_CHARS_U             ::= PN_CHARS_BASE | '_' | ':'
 [160s]  PN_CHARS               ::= PN_CHARS_U | '-' | [0-9] | #00B7 | [#0300-#036F] | [#203F-#2040]
]]
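
To illustrate change 2, here is a minimal Python sketch of how a lexer that emits '@base' and '@prefix' as their own tokens can still accept them as language tags after a string. The function names and the simplified STRING pattern (no escapes, double quotes only) are assumptions for illustration, not how any particular Turtle parser does it.

```python
import re

# Simplified token shapes; real Turtle strings also allow ECHAR/UCHAR escapes.
STRING = re.compile(r'"[^"\\\n\r]*"')
AT_WORD = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")


def classify_at_word(lexeme):
    """Name the token a lexer would emit for an '@...' lexeme."""
    if lexeme == "@base":
        return "BASE"
    if lexeme == "@prefix":
        return "PREFIX"
    return "LANGTAG"


def parse_literal(text):
    """Parse a String with an optional language tag.

    Returns (value, language, token_kind); the last two are None
    when no tag follows the string.
    """
    string = STRING.match(text)
    if string is None:
        raise ValueError(f"not a string literal: {text!r}")
    value = string.group()[1:-1]
    rest = text[string.end():]
    if rest == "":
        return value, None, None
    if AT_WORD.fullmatch(rest) is None:
        raise ValueError(f"unexpected trailing text: {rest!r}")
    # BASE, PREFIX and plain LANGTAG tokens are all accepted after a String.
    return value, rest[1:], classify_at_word(rest)


print(parse_literal('"ab"@base'))
print(parse_literal('"ab"@en-GB'))
```

The point is only that the token kind (BASE) and the role it plays after a string (language tag "base") are decided separately, which is what the new [19] spells out.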

I haven't changed
-[22]    DOUBLE                 ::= [+-]? (([0-9]+ '.' [0-9]* EXPONENT) | ('.' [0-9]+ EXPONENT) | ([0-9]+ EXPONENT))
to
+[22]    DOUBLE                 ::= [+-]? ( [0-9]+ '.' [0-9]* EXPONENT  |  '.' [0-9]+ EXPONENT  |  [0-9]+ EXPONENT )
'cause I wasn't sure if others found the former more readable (though I personally prefer fewer ()s (they get a bit oppressive (when used in excess))).
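
For what it's worth, the two spellings of [22] should accept the same strings, since sequence binds tighter than '|' either way. A quick sanity check, with both forms translated by hand into Python regexes (the translation and the sample list are mine, so treat this as a sketch rather than a proof):

```python
import re

EXPONENT = r"[eE][+-]?[0-9]+"

# [22] DOUBLE with each alternative wrapped in its own parentheses.
WITH_PARENS = re.compile(
    rf"[+-]?(([0-9]+\.[0-9]*{EXPONENT})|(\.[0-9]+{EXPONENT})|([0-9]+{EXPONENT}))"
)

# [22] DOUBLE with only the outer group kept.
FEWER_PARENS = re.compile(
    rf"[+-]?([0-9]+\.[0-9]*{EXPONENT}|\.[0-9]+{EXPONENT}|[0-9]+{EXPONENT})"
)

SAMPLES = ["1e0", "1.e0", ".5E-3", "+12.34e+5", "-.5e2", "1.0", "e5", ".e1", "1e"]

# True when both patterns accept or reject every sample identically.
agree = all(
    bool(WITH_PARENS.fullmatch(s)) == bool(FEWER_PARENS.fullmatch(s))
    for s in SAMPLES
)
print(agree)
```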


> This does not make it very easy to see any material changes:
> http://dvcs.w3.org/hg/rdf/rev/8b47a7006c8c
> 
> The hg log is to changes of the HTML and it's very hard to see the
> real changes when it has:
> 
>  1.7 -    <td>[1]<td>
>  1.8 -    <td><code>turtleDoc</code><td>
>  1.9 +    <td>[1]</td>
> 1.10 +    <td><code>turtleDoc</code></td>
> 
> - - - - - - - - - - - - -
> 
> I noticed there are 2 * 17's:
> 
> [17]  BASE  ::=  '@base'
> [17]  PREFIX  ::=  '@prefix'
> 
>  Thanks
>  Andy
> 
> 

-- 
-ericP

Received on Friday, 15 June 2012 09:33:14 UTC