Re: PROPOSED to RESOLVE ISSUE-127 with Canonical N-Triples

On Jul 5, 2013, at 9:26 PM, Gavin Carothers wrote:

> The issue states that the new N-Triples specification doesn't provide for the old functionality of a given triple having one and only one way to write it down. The current draft of N-Triples has added a Canonical N-Triples definition to the conformance section.
> A canonical N-Triple document is a N-Triple document with additional constraints:
> 	 Space between terms (WS+) SHOULD be a single space, (U+0020).
> 	 Space after or before terms (WS*) SHOULD be empty.
> 	 HEX SHOULD use only uppercase letters ([A-F]).
> 	 Characters not allowed directly in STRING_LITERAL_QUOTE (U+0022, U+005C, U+000A, U+000D) SHOULD use ECHAR not UCHAR.
> 	 Characters SHOULD be represented directly and not by UCHAR.
> This is NOT the same as the current definition in RDF Test Cases as it prefers the direct representation of characters over the use of escape sequences. It also specifies the white space rules.
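
To make the literal rules quoted above concrete, here is a minimal sketch in Python. The name canonical_literal is hypothetical and does not come from any draft or library. It covers only the last two constraints (ECHAR for the four characters that cannot appear directly, everything else written directly) and assumes a plain literal with no language tag or datatype.

```python
# ECHAR escapes for the four characters the quoted text lists as not
# allowed directly in STRING_LITERAL_QUOTE:
# U+0022 ("), U+005C (\), U+000A (LF), U+000D (CR).
ECHAR = {
    '"': '\\"',
    '\\': '\\\\',
    '\n': '\\n',
    '\r': '\\r',
}


def canonical_literal(value: str) -> str:
    """Serialize a plain literal under the quoted canonical rules.

    Hypothetical helper: the four special characters use ECHAR,
    and every other character is emitted directly, never as UCHAR.
    """
    return '"' + ''.join(ECHAR.get(ch, ch) for ch in value) + '"'


if __name__ == "__main__":
    print(canonical_literal('say "hi"\n'))
    print(canonical_literal('café'))
```

Under these rules a non-ASCII character such as é is written directly, whereas the RDF Test Cases grammar would require an escape sequence for it; that difference is the one the quoted text points out.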

I am not satisfied with this as a resolution to my issue. Having this "canonical N-Triples" variant does nothing to address my comment that the new draft has made significant changes to N-Triples, and many of these introduce multiple ways to encode a given N-Triples graph. As N-Triples is an established format in widespread use, I consider these changes ill-advised and see no actual value in changing the format to support them.

I would like to see the N-Triples grammar reverted to its previous form where there were essentially no choices left to implementations in how to serialize a graph. If anything, I would think a "canonical N-Triples" constraint on the original (RDF Test Cases) grammar that tightened the allowable use of whitespace would be better.


Received on Friday, 5 July 2013 21:39:49 UTC