- From: Ivan Herman <ivan@w3.org>
- Date: Thu, 29 Jun 2017 12:17:34 +0200
- To: "Peter F. Patel-Schneider" <peter.patel-schneider@nuance.com>
- Cc: public-rdf-comments@w3.org
- Message-Id: <F78AEA24-2CF0-412E-9997-B30CFEF67CBF@w3.org>
Peter,

I have added this to the official Errata list:

https://www.w3.org/2001/sw/wiki/RDF1.1_Errata

Thanks

Ivan

> On 29 Jun 2017, at 11:15, Peter F. Patel-Schneider <peter.patel-schneider@nuance.com> wrote:
>
> A message to semantic-web@w3.org
> https://lists.w3.org/Archives/Public/semantic-web/2017Jun/0065.html inspired
> me to take a closer look at the grammar for N-Triples. I found a number of
> problems in the grammar for N-Triples there. I propose the following fixed
> version of the grammar section.
>
>
> Problems addressed:
> 1/ White space permitted but not required between any two terminals and at
>    beginning and end of document.
> 2/ Comments can only occur in specific places.
> 3/ Lines consisting entirely of white space and/or a comment are permitted.
> 4/ Confusing statement about Unicode code points removed.
>
> Remaining issue:
> 1/ The grammar in the TR mentions white space in the context of any two
>    terminals, which includes between the parts of a literals. However, there
>    is no example or test case that has white space there. This grammar
>    permits white space there.
>
>
> 7. Grammar
>
> An N-Triples document is a Unicode [UNICODE] character string encoded in
> UTF-8.
> [[Remove: Unicode code points only in the range U+0 to U+10FFFF inclusive are
> allowed. Rationale: These are the only Unicode code points.]]
>
> White space (tab U+0009 or space U+0020) is allowed but not required between
> any two terminals.
> [[Replace: White space (tab U+0009 or space U+0020) is used to separate two
> terminals which would otherwise be (mis-)recognized as one terminal.
> Rationale: In N-Triples there is no possibility of such mis-recognition.]]
> White space is significant in the production STRING_LITERAL_QUOTE.
>
> Comments in N-Triples take the form of '#', outside an IRIREF or
> STRING_LITERAL_QUOTE, and continue up-to, and excluding, the end of line
> (EOL), or end of file if there is no end of line after the comment
> marker. Comments are treated as white space.
>
> The EBNF used here is defined in XML 1.0 [EBNF-NOTATION].
>
> [[White space and comments are now explicit in the grammar similar to the
> situation in early versions of the N-Triples grammar. Rationale: Makes it
> clear where white space and comments are permitted. ]]
>
> Escape sequence rules are the same as Turtle [TURTLE]. However, as only the
> STRING_LITERAL_QUOTE production is allowed new lines in literals MUST be
> escaped.
> [1] ntriplesDoc ::= triple? (EOL triple)* END
> [2] triple ::= WS? subject WS? predicate WS? object WS? '.'
> [3] subject ::= IRIREF | BLANK_NODE_LABEL
> [4] predicate ::= IRIREF
> [5] object ::= IRIREF | BLANK_NODE_LABEL | literal
> [6] literal ::= STRING_LITERAL_QUOTE (WS? '^^' WS? IRIREF | WS? LANGTAG)?
>
> Productions for terminals
> [144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
> [[Lines consisting entirely of white space and/or a comment are now permitted.]]
> [7] EOL ::= ( WS? ('#x22' [^#xD#xA]* )? [#xD#xA] )+
> [7a] END ::= EOL? WS? ('#x22' [^#xD#xA]* )?
> [8] IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
> [9] STRING_LITERAL_QUOTE ::= '"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"'
> [141s] BLANK_NODE_LABEL ::= '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')*
> PN_CHARS)?
> [10] UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
> [153s] ECHAR ::= '\' [tbnrf"'\]
> [157s] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6]
> | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] |
> [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] |
> [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
> [158s] PN_CHARS_U ::= PN_CHARS_BASE | '_' | ':'
> [160s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] |
> [#x203F-#x2040]
> [162s] HEX ::= [0-9] | [A-F] | [a-f]
>
> [[White space is included in grammar.]]
> WS ::= [#x9#x20]+
>

----
Ivan Herman, W3C
Publishing@W3C Technical Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
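For readers who want to try the proposed productions quoted above, the following is a minimal sketch of a line-level recognizer built from regular expressions. It is not part of the erratum or of any W3C test suite. The function names `is_valid_line` and `is_valid_document` are hypothetical, chosen only for this illustration. One assumption is made explicit: the comment marker is taken to be '#' (U+0023), as the prose of the proposed section states, whereas the EOL and END productions as quoted write '#x22'.

```python
import re

# Terminals, transcribed from the proposed productions quoted in the message.
HEX = r"[0-9A-Fa-f]"
UCHAR = rf"(?:\\u{HEX}{{4}}|\\U{HEX}{{8}})"
ECHAR = r"\\[tbnrf\x22'\\]"
IRIREF = rf"<(?:[^\x00-\x20<>\x22{{}}|^`\\]|{UCHAR})*>"
STRING = rf"\x22(?:[^\x22\x5C\x0A\x0D]|{ECHAR}|{UCHAR})*\x22"
LANGTAG = r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*"

# Character-class bodies for PN_CHARS_BASE, PN_CHARS_U and PN_CHARS.
PN_BASE = (
    r"A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    r"\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
PN_U = PN_BASE + "_:"
PN = PN_U + r"\-0-9\u00B7\u0300-\u036F\u203F-\u2040"
BNODE = rf"_:[{PN_U}0-9](?:[{PN}.]*[{PN}])?"

# WS? in the grammar: optional run of tabs and spaces.
WSO = r"[\x09\x20]*"

# [6] literal, with optional white space before '^^', IRIREF and LANGTAG.
LITERAL = rf"{STRING}(?:{WSO}\^\^{WSO}{IRIREF}|{WSO}{LANGTAG})?"

# [2] triple ::= WS? subject WS? predicate WS? object WS? '.'
TRIPLE = (
    rf"{WSO}(?:{IRIREF}|{BNODE}){WSO}{IRIREF}"
    rf"{WSO}(?:{IRIREF}|{BNODE}|{LITERAL}){WSO}\."
)

# One physical line: an optional triple, optional white space, and an
# optional comment. ASSUMPTION: comments start with '#' (U+0023), following
# the prose rather than the '#x22' written in the quoted EOL/END productions.
LINE = re.compile(rf"(?:{TRIPLE})?{WSO}(?:#[^\r\n]*)?")


def is_valid_line(line: str) -> bool:
    """Return True if a single line (without its line terminator) is accepted."""
    return LINE.fullmatch(line) is not None


def is_valid_document(text: str) -> bool:
    """Return True if every line of the document is accepted.

    Splitting on each CR or LF mirrors EOL, which allows any non-empty run
    of line terminators, with white space and comments in between.
    """
    return all(is_valid_line(line) for line in re.split(r"[\r\n]", text))
```

Under these assumptions, `is_valid_line` accepts a triple written with no white space at all between terminals, a line holding only a comment, and an empty line. It rejects a line with two triples or with a literal in subject position, which corresponds to problems 1/ to 3/ listed in the quoted message.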
Received on Thursday, 29 June 2017 10:17:53 UTC