- From: Gavin Carothers <gavin@carothers.name>
- Date: Fri, 18 Oct 2013 07:50:09 -0700
- To: Gregory Williams <greg@evilfunhouse.com>
- Cc: "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>
- Message-ID: <CAPqY83z0DjkBUgw06=9Qz0M+Au3=tHQtaeW_kt4dshjB_uO6ag@mail.gmail.com>
On Fri, Jul 5, 2013 at 2:34 PM, Gregory Williams <greg@evilfunhouse.com> wrote:

> On Jul 5, 2013, at 9:26 PM, Gavin Carothers <gavin@carothers.name> wrote:
>
> > https://www.w3.org/2011/rdf-wg/track/issues/127 states that the new N-Triples specification doesn't provide for the old functionality of a given triple having one and only one way to write it down. The current draft of N-Triples has added a Canonical N-Triples definition to the conformance section.
> >
> > A canonical N-Triple document is a N-Triple document with additional constraints:
> >
> > • Space between terms (WS+) SHOULD be a single space, (U+0020).
> > • Space after or before terms (WS*) SHOULD be empty.
> > • HEX SHOULD use only uppercase letters ([A-F]).
> > • Characters not allowed directly in STRING_LITERAL_QUOTE (U+0022, U+005C, U+000A, U+000D) SHOULD use ECHAR not UCHAR.
> > • Characters SHOULD be represented directly and not by UCHAR.
> >
> > This is NOT the same as the current definition in RDF Test Cases as it prefers the direct representation of characters over the use of escape sequences. It also specifies the white space rules.
>
> I am not satisfied with this as a resolution to my issue. Having this "canonical N-Triples" variant does nothing to address my comment that the new draft has made significant changes to N-Triples, and many of these introduce multiple ways to encode a given N-Triples graph. As N-Triples is an established format in widespread use, I consider these changes ill-advised and see no actual value in changing the format to support them.
>
> I would like to see the N-Triples grammar reverted to its previous form where there were essentially no choices left to implementations in how to serialize a graph. If anything, I would think a "canonical N-Triples" constraint on the original (RDF Test Cases) grammar that tightened the allowable use of whitespace would be better.

Gregory,

Thank you again for your comments on N-Triples.
This is the second formal response to issue http://www.w3.org/2011/rdf-wg/track/issues/127

N-Triples was originally created as part of the RDF Test Cases. As such it included:

    N-Triples is an RDF syntax for expressing RDF test cases and defining the correspondence between RDF/XML and the RDF abstract syntax. RDF/XML [RDF-SYNTAX] is the recommended syntax for applications to exchange RDF information.

It also did not have a distinct media type, and was recommended only for test cases. As such, it did not have any internationalization requirements placed on it. Also, the world has changed since 2001, when it was decided that N-Triples should be ASCII and not UTF-8.

RDF Test Cases N-Triples requires the following:

    <http://example.org/> <http://example.org/property> "I\u00F1t\u00EBrn\u00E2ti\u00F4n\u00E0liz\u00E6ti\u00F8n" .

N-Triples REC track allows and recommends:

    <http://example.org/> <http://example.org/property> "Iñtërnâtiônàlizætiøn".

While the first was totally acceptable for a test case format, it is not acceptable for use as a widespread data exchange format. In order to address internationalization concerns and adopt the practice of existing implementations in the wild, N-Triples is now allowed and recommended to be UTF-8, while continuing to support data using \u and \U escapes. In modern systems "Iñtërnâtiônàlizætiøn" is greatly preferred by users for interoperability and ease of use over "I\u00F1t\u00EBrn\u00E2ti\u00F4n\u00E0liz\u00E6ti\u00F8n".

Your comment also touches on requirements for serializers. The N-Triples REC track document places no conformance constraints on a serializer; instead it defines two classes of documents: a "canonical N-Triples document" and a "N-Triple document". Canonical was added specifically to address your comment regarding the need for a recommended way to write down a given triple while also meeting the new requirements around internationalization.
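As a rough illustration only (not part of either specification), the relationship between the two spellings of the literal can be sketched in Python. The function names `unescape` and `canonical_body` are made up for this sketch, and it handles only the body of a quoted literal, not a whole triple or document.

```python
import re

# One escape sequence: \uXXXX, \UXXXXXXXX, or a single-character ECHAR.
# Matching all three in one pass keeps "\\u00F1" (an escaped backslash
# followed by plain text) from being misread as a UCHAR.
_ESCAPE = re.compile(r'\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|([tbnrf"\'\\]))')

_ECHAR_DECODE = {
    "t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
    '"': '"', "'": "'", "\\": "\\",
}

# The four characters that may not appear directly in STRING_LITERAL_QUOTE
# (U+0022, U+005C, U+000A, U+000D); the canonical rules ask for ECHAR here.
_ECHAR_ENCODE = {'"': '\\"', "\\": "\\\\", "\n": "\\n", "\r": "\\r"}


def unescape(body: str) -> str:
    """Turn the escaped body of a literal into the string it denotes."""
    def repl(match: re.Match) -> str:
        if match.group(3) is not None:
            return _ECHAR_DECODE[match.group(3)]
        return chr(int(match.group(1) or match.group(2), 16))
    return _ESCAPE.sub(repl, body)


def canonical_body(value: str) -> str:
    """Write a string using direct characters, escaping only where required."""
    return "".join(_ECHAR_ENCODE.get(ch, ch) for ch in value)


test_cases_form = "I\\u00F1t\\u00EBrn\\u00E2ti\\u00F4n\\u00E0liz\\u00E6ti\\u00F8n"
print(canonical_body(unescape(test_cases_form)))
```

Under these assumptions both spellings denote the same string, so a parser that accepts the escaped form and a serializer that writes the direct form will round-trip the same literal.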
At the same time, a serializer that produces Test Cases N-Triples will produce a conforming N-Triple document.

Please reply to public-rdf-comments@w3.org indicating whether this rationale explains the Working Group's decision to allow and recommend the use of UTF-8 for N-Triples.

Sincerely,

Gavin Carothers
Received on Friday, 18 October 2013 14:50:40 UTC