
Re: Proposed fixed version of N-Triples https://www.w3.org/TR/n-triples/ Section 7

From: Ivan Herman <ivan@w3.org>
Date: Thu, 29 Jun 2017 12:17:34 +0200
Cc: public-rdf-comments@w3.org
Message-Id: <F78AEA24-2CF0-412E-9997-B30CFEF67CBF@w3.org>
To: "Peter F. Patel-Schneider" <peter.patel-schneider@nuance.com>
Peter,

I have added this to the official Errata list:

https://www.w3.org/2001/sw/wiki/RDF1.1_Errata

Thanks

Ivan


> On 29 Jun 2017, at 11:15, Peter F. Patel-Schneider <peter.patel-schneider@nuance.com> wrote:
> 
> A message to semantic-web@w3.org
> https://lists.w3.org/Archives/Public/semantic-web/2017Jun/0065.html inspired
> me to take a closer look at the grammar for N-Triples, where I found a
> number of problems.  I propose the following fixed version of the grammar
> section.
> 
> 
> Problems addressed:
> 1/ White space permitted but not required between any two terminals and at
> beginning and end of document.
> 2/ Comments can only occur in specific places.
> 3/ Lines consisting entirely of white space and/or a comment are permitted.
> 4/ Confusing statement about Unicode code points removed.
> 
> Remaining issue:
> 1/ The grammar in the TR mentions white space in the context of any two
> terminals, which includes between the parts of a literal.  However, there
> is no example or test case that has white space there.  This grammar
> permits white space there.
> 
> 
> 7. Grammar
> 
> An N-Triples document is a Unicode [UNICODE] character string encoded in
> UTF-8.
> [[Remove: Unicode code points only in the range U+0 to U+10FFFF inclusive are
> allowed.  Rationale: These are the only Unicode code points.]]
> 
> White space (tab U+0009 or space U+0020) is allowed but not required between
> any two terminals.
> [[Replace: White space (tab U+0009 or space U+0020) is used to separate two
> terminals
> which would otherwise be (mis-)recognized as one terminal.
> Rationale: In N-Triples there is no possibility of such mis-recognition.]]
> White space is significant in the production STRING_LITERAL_QUOTE.
> 
> Comments in N-Triples take the form of '#', outside an IRIREF or
> STRING_LITERAL_QUOTE, and continue up to, and excluding, the end of line
> (EOL), or end of file if there is no end of line after the comment
> marker. Comments are treated as white space.
> 
> The EBNF used here is defined in XML 1.0 [EBNF-NOTATION].
> 
> [[White space and comments are now explicit in the grammar similar to the
> situation in early versions of the N-Triples grammar.  Rationale: Makes it
> clear where white space and comments are permitted. ]]
> 
> Escape sequence rules are the same as Turtle [TURTLE]. However, as only the
> STRING_LITERAL_QUOTE production is allowed, new lines in literals MUST be
> escaped.
> [1] 	ntriplesDoc 	::= 	triple? (EOL triple)* END
> [2] 	triple	 	::= 	WS? subject WS? predicate WS? object WS? '.'
> [3] 	subject 	::= 	IRIREF | BLANK_NODE_LABEL
> [4] 	predicate 	::= 	IRIREF
> [5] 	object	 	::= 	IRIREF | BLANK_NODE_LABEL | literal
> [6] 	literal 	::= 	STRING_LITERAL_QUOTE (WS? '^^' WS? IRIREF | WS? LANGTAG)?
> 
> Productions for terminals
> [144s] 	LANGTAG 	::= 	'@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
> [[Lines consisting entirely of white space and/or a comment are now permitted.]]
> [7] 	EOL 	::= 	( WS? ('#' [^#xD#xA]* )? [#xD#xA] )+
> [7a]	END	::= 	EOL? WS? ('#' [^#xD#xA]* )?
> [8] 	IRIREF 	::= 	'<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
> [9] 	STRING_LITERAL_QUOTE 	::= 	'"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"'
> [141s] 	BLANK_NODE_LABEL 	::= 	'_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')*
> PN_CHARS)?
> [10] 	UCHAR 	::= 	'\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
> [153s] 	ECHAR 	::= 	'\' [tbnrf"'\]
> [157s] 	PN_CHARS_BASE 	::= 	[A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6]
> | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] |
> [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] |
> [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
> [158s] 	PN_CHARS_U 	::= 	PN_CHARS_BASE | '_' | ':'
> [160s] 	PN_CHARS 	::= 	PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] |
> [#x203F-#x2040]
> [162s] 	HEX 	::= 	[0-9] | [A-F] | [a-f]
> 
> [[White space is included in grammar.]]
> 	WS	::=	[#x9#x20]+
> 
> 
> 
> 
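
For readers who want to experiment with the proposed productions, the
following is a minimal Python sketch of a line-oriented matcher. It is not
part of the erratum and not a conforming parser. Assumptions: the names
parse_line and parse_document are made up for illustration, blank node labels
are restricted to ASCII (the real PN_CHARS ranges are much wider), and the
trailing white space and comment that the grammar assigns to EOL/END are
folded into the per-line pattern.

```python
import re

# Terminal patterns transcribed from the quoted grammar.
UCHAR = r"(?:\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})"
ECHAR = r"""\\[tbnrf"'\\]"""
IRIREF = r'<(?:[^\x00-\x20<>"{}|^`\\]|' + UCHAR + r")*>"
STRING = r'"(?:[^"\\\n\r]|' + ECHAR + "|" + UCHAR + r')*"'
LANGTAG = r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*"
# Assumption: ASCII-only approximation of BLANK_NODE_LABEL.
BNODE = r"_:[A-Za-z0-9_:](?:[A-Za-z0-9_:.-]*[A-Za-z0-9_:-])?"
WS = r"[\t ]*"              # plays the role of "WS?" in the grammar
COMMENT = r"(?:#[^\r\n]*)?"  # optional comment up to the end of the line

# [6] literal: white space is allowed before '^^', after it, and before LANGTAG.
LITERAL = STRING + "(?:" + WS + r"\^\^" + WS + IRIREF + "|" + WS + LANGTAG + ")?"

# [2] triple, followed by the optional white space and comment of EOL/END.
TRIPLE = re.compile(
    WS + "(?P<s>" + IRIREF + "|" + BNODE + ")"
    + WS + "(?P<p>" + IRIREF + ")"
    + WS + "(?P<o>" + IRIREF + "|" + BNODE + "|" + LITERAL + ")"
    + WS + r"\." + WS + COMMENT
)
# A line consisting entirely of white space and/or a comment.
BLANK = re.compile(WS + COMMENT)


def parse_line(line):
    """Return (subject, predicate, object) as raw strings, or None if the
    line is not a single triple."""
    m = TRIPLE.fullmatch(line)
    return m.group("s", "p", "o") if m else None


def parse_document(text):
    """Return the triples in text; raise ValueError on a malformed line."""
    triples = []
    for number, line in enumerate(re.split(r"[\r\n]", text), start=1):
        if BLANK.fullmatch(line):
            continue
        triple = parse_line(line)
        if triple is None:
            raise ValueError(f"line {number}: not a triple: {line!r}")
        triples.append(triple)
    return triples
```

With this sketch, a line with no white space at all between terminals, such
as <http://a/s><http://a/p>"x"@en. , is accepted, and so is white space
around '^^' inside a literal, which is the remaining issue noted in the
quoted message.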

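The quoted text says that escape sequences follow Turtle and that new lines
in literals must be escaped. As a hedged illustration of the ECHAR [153s] and
UCHAR [10] productions, the sketch below decodes the body of a
STRING_LITERAL_QUOTE (the text between the quotes). The function name
unescape is hypothetical, and the input is assumed to have already been
validated against the grammar.

```python
import re

# ECHAR mapping: the character after the backslash to the character it denotes.
_ECHAR = {
    "t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
    '"': '"', "'": "'", "\\": "\\",
}

# One pattern for both productions:
# group 1 = \uXXXX, group 2 = \UXXXXXXXX, group 3 = ECHAR character.
_ESCAPE = re.compile(r"""\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|([tbnrf"'\\]))""")


def unescape(body):
    """Replace ECHAR and UCHAR escapes in a literal body with the characters
    they denote. Raises ValueError for a \\U value above U+10FFFF."""
    def replace(match):
        if match.group(3) is not None:
            return _ECHAR[match.group(3)]
        return chr(int(match.group(1) or match.group(2), 16))
    return _ESCAPE.sub(replace, body)
```

Matching all escapes with a single left-to-right pass avoids decoding the
output of one escape a second time, for example an escaped backslash that
happens to be followed by the letter n.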

----
Ivan Herman, W3C
Publishing@W3C Technical Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704


Received on Thursday, 29 June 2017 10:17:53 UTC
