- From: Ivan Herman <ivan@w3.org>
- Date: Thu, 29 Jun 2017 12:17:34 +0200
- To: "Peter F. Patel-Schneider" <peter.patel-schneider@nuance.com>
- Cc: public-rdf-comments@w3.org
- Message-Id: <F78AEA24-2CF0-412E-9997-B30CFEF67CBF@w3.org>
Peter,
I have added this to the official Errata list:
https://www.w3.org/2001/sw/wiki/RDF1.1_Errata
Thanks
Ivan
> On 29 Jun 2017, at 11:15, Peter F. Patel-Schneider <peter.patel-schneider@nuance.com> wrote:
>
> A message to semantic-web@w3.org
> https://lists.w3.org/Archives/Public/semantic-web/2017Jun/0065.html inspired
> me to take a closer look at the grammar for N-Triples. I found a number of
> problems in that grammar, so I propose the following fixed version of the
> grammar section.
>
>
> Problems addressed:
> 1/ White space permitted but not required between any two terminals and at
> beginning and end of document.
> 2/ Comments can only occur in specific places.
> 3/ Lines consisting entirely of white space and/or a comment are permitted.
> 4/ Confusing statement about Unicode code points removed.
>
> Remaining issue:
> 1/ The grammar in the TR mentions white space in the context of any two
> terminals, which includes between the parts of a literal. However, there
> is no example or test case that has white space there. This grammar
> permits white space there.
>
>
> 7. Grammar
>
> An N-Triples document is a Unicode [UNICODE] character string encoded in
> UTF-8.
> [[Remove: Unicode code points only in the range U+0 to U+10FFFF inclusive are
> allowed. Rationale: These are the only Unicode code points.]]
>
> White space (tab U+0009 or space U+0020) is allowed but not required between
> any two terminals.
> [[Replace: White space (tab U+0009 or space U+0020) is used to separate two
> terminals which would otherwise be (mis-)recognized as one terminal.
> Rationale: In N-Triples there is no possibility of such mis-recognition.]]
> White space is significant in the production STRING_LITERAL_QUOTE.
>
> Comments in N-Triples take the form of '#', outside an IRIREF or
> STRING_LITERAL_QUOTE, and continue up to, and excluding, the end of line
> (EOL), or end of file if there is no end of line after the comment
> marker. Comments are treated as white space.
>
> The EBNF used here is defined in XML 1.0 [EBNF-NOTATION].
>
> [[White space and comments are now explicit in the grammar, similar to the
> situation in early versions of the N-Triples grammar. Rationale: Makes it
> clear where white space and comments are permitted. ]]
>
> Escape sequence rules are the same as Turtle [TURTLE]. However, as only the
> STRING_LITERAL_QUOTE production is allowed, new lines in literals MUST be
> escaped.
> [1] ntriplesDoc ::= triple? (EOL triple)* END
> [2] triple ::= WS? subject WS? predicate WS? object WS? '.'
> [3] subject ::= IRIREF | BLANK_NODE_LABEL
> [4] predicate ::= IRIREF
> [5] object ::= IRIREF | BLANK_NODE_LABEL | literal
> [6] literal ::= STRING_LITERAL_QUOTE (WS? '^^' WS? IRIREF | WS? LANGTAG)?
>
> Productions for terminals
> [144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
> [[Lines consisting entirely of white space and/or a comment are now permitted.]]
> [7] EOL ::= ( WS? ('#' [^#xD#xA]* )? [#xD#xA] )+
> [7a] END ::= EOL? WS? ('#' [^#xD#xA]* )?
> [8] IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
> [9] STRING_LITERAL_QUOTE ::= '"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"'
> [141s] BLANK_NODE_LABEL ::= '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')*
> PN_CHARS)?
> [10] UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
> [153s] ECHAR ::= '\' [tbnrf"'\]
> [157s] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6]
> | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] |
> [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] |
> [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
> [158s] PN_CHARS_U ::= PN_CHARS_BASE | '_' | ':'
> [160s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] |
> [#x203F-#x2040]
> [162s] HEX ::= [0-9] | [A-F] | [a-f]
>
> [[White space is included in grammar.]]
> WS ::= [#x9#x20]+
>
>
>
>
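For anyone who wants to experiment with the proposed productions, here is a minimal sketch in Python that transcribes them into regular expressions. It is an illustration only, not part of the erratum or of any official test suite. The names `LINE`, `is_valid_line` and `is_valid_document` are made up for this example. The comment marker is taken to be '#', as the prose on comments describes, and checking a document line by line is my own reading of how `ntriplesDoc`, `EOL` and `END` fit together.

```python
import re

# Terminals transcribed from the proposed grammar (illustrative sketch only).
HEX = r'[0-9A-Fa-f]'
UCHAR = r'(?:\\u' + HEX + r'{4}|\\U' + HEX + r'{8})'
ECHAR = r'\\[tbnrf"\'\\]'
IRIREF = r'<(?:[^\x00-\x20<>"{}|^`\\]|' + UCHAR + r')*>'
STRING = r'"(?:[^\x22\x5C\x0A\x0D]|' + ECHAR + '|' + UCHAR + r')*"'
LANGTAG = r'@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*'

# Character-class bodies (without the surrounding brackets).
PN_CHARS_BASE = (
    r'A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D'
    r'\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF'
    r'\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF'
)
PN_CHARS_U = PN_CHARS_BASE + '_:'
PN_CHARS = PN_CHARS_U + r'\-0-9\u00B7\u0300-\u036F\u203F-\u2040'
BLANK_NODE_LABEL = (
    '_:[' + PN_CHARS_U + '0-9]'
    + '(?:[' + PN_CHARS + '.]*[' + PN_CHARS + '])?'
)

OWS = r'[\x09\x20]*'            # "WS?" in the grammar: optional tab/space run
COMMENT = r'(?:#[^\x0D\x0A]*)?'  # assumption: the comment marker is '#'

# Non-terminals [2]-[6].
LITERAL = (
    STRING + '(?:' + OWS + r'\^\^' + OWS + IRIREF + '|' + OWS + LANGTAG + ')?'
)
SUBJECT = '(?:' + IRIREF + '|' + BLANK_NODE_LABEL + ')'
OBJECT = '(?:' + IRIREF + '|' + BLANK_NODE_LABEL + '|' + LITERAL + ')'
TRIPLE = OWS + SUBJECT + OWS + IRIREF + OWS + OBJECT + OWS + r'\.'

# One physical line: an optional triple, then optional white space and comment.
LINE = re.compile('(?:' + TRIPLE + ')?' + OWS + COMMENT)


def is_valid_line(line: str) -> bool:
    """Check one line (without its line terminator) against the sketch."""
    return LINE.fullmatch(line) is not None


def is_valid_document(text: str) -> bool:
    """Check a whole document by splitting on CR or LF characters."""
    return all(is_valid_line(part) for part in re.split(r'[\r\n]', text))
```

With this sketch, `is_valid_line('<http://a/s><http://a/p><http://a/o>.')` accepts a triple written without any white space, and `is_valid_line('   # only a comment')` accepts a comment-only line, which are two of the cases the proposal is meant to make explicit.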
----
Ivan Herman, W3C
Publishing@W3C Technical Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Thursday, 29 June 2017 10:17:53 UTC