- From: Gregory Williams <greg@evilfunhouse.com>
- Date: Mon, 15 Jul 2013 23:27:53 +0300
- To: Gavin Carothers <gavin@carothers.name>
- Cc: "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>
Hi Gavin.

Thanks for the very quick response. Happy with the state of most of these now. A few comments on ones I think still need discussion below.

On Jul 15, 2013, at 11:00 PM, Gavin Carothers <gavin@carothers.name> wrote:

>> What is the rationale for a "canonical N-Triples" document encoding characters "directly and not by UCHAR"? This means that any existing N-Triples document that includes non-ASCII data is by definition not canonical, correct?
>>
>> What is the rationale for disallowing a space after the object of a triple? A much simpler, and more regular rule for serializers wishing to produce "canonical N-Triples" would be that the only use of the WS token should be a single space after every term.
>
> Simplicity of explaining in the current grammar. Was going for if the grammar requires some whitespace require it to be a space, otherwise require no whitespace. I think the rule is reasonably simple. The optionality of whitespace after object is in the original n-triples definition as well.

FWIW, I found the current text regarding whitespace to be confusing, and had to read it several times to understand that it meant one space between s-p, and one between p-o, but none afterwards. E.g. the apparent conflict between "Space between terms (WS+) should be a single space" and "Space after or before terms (WS*) should be empty".

I understand the trailing whitespace is optional in both this and the original N-Triples. I was hoping for insight into why "Canonical N-Triples" shouldn't just say "one space following every term" which I believe to be simpler both in describing the constrained grammar and in implementation.

>> There should be another constraint on "canonical N-Triples" documents indicating when either of the two forms of UCHAR must be used. (Or, better, require *all* n-triples documents, whether canonical or not, to conform to such a constraint as the old RDF Test Cases N-Triples format did.)
>
> I'm sorry, I don't understand this comment.
> Is this addressing the capitalization of HEX? If so that's already mentioned. If it's about \u vs \U that's also addressed... ah, perhaps that's the issue?
> Something along the lines of:
>
> [#x7F-#xFFFF] \uHHHH
> 4 required hexadecimal digits HHHH encoding Unicode character u
> [#x10000-#x10FFFF] \UHHHHHHHH
> 8 required hexadecimal digits HHHHHHHH encoding Unicode character u
>
> for serialization?

Yes, the latter. The original N-Triples did this: you didn't have a choice between \u and \U forms. The codepoint value dictated which escape form had to be used. I remain convinced that the flexibility in the new n-triples is a terrible idea, but if it has to stay in, I think the "canonical N-Triples" definition must include a rule like this which constrains the choice of escape form.

>> == A. N-Triples Internet Media Type, File Extension and Macintosh File Type
>>
>> Why is the new media type for N-Triples "application/n-triples" and not "text/n-triples"? This format is explicitly described as a "plain text format" in the abstract of the document.
>
> Summary of WG discussion on the issue:
>
> * N-triples is less readable than Turtle and more directed to machine processing.
> * text/* would default to ISO-8859-1 encoding, which is not the goal.

It would default to that without the spec saying otherwise, but since the spec *does* say otherwise (in many places, but most relevantly in the "N-Triples Internet Media Type, File Extension and Macintosh File Type" section), I believe that should be enough per RFC 6657.

> * application/* subtypes unknown to an implementation MUST be treated as binary data.
> * Opening text/* in a browser causes it to be displayed, while opening application/* causes it to be downloaded.

If the WG feels this is important, I guess I can understand that.
I've always found text/* to be much easier to deal with as 1) it's trivial to force a link to download in a browser with an extra key-press and 2) it *allows* peeking inside the file in the browser if desired (which is often impossible with application/*).

thanks,
.greg
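
For illustration, the two rules argued for in this message (the code point value dictates whether \u or \U is used, and "one space following every term") can be sketched as below. This is only a sketch under stated assumptions, not text from any spec: the function names are hypothetical, uppercase hex digits are assumed, and the terms passed to the line builder are assumed to be already serialized.

```python
def uchar_escape(codepoint: int) -> str:
    """Pick the escape form from the code point value (hypothetical helper).

    Code points up to U+FFFF get the 4-digit \\u form, everything above
    gets the 8-digit \\U form, so a serializer never has a choice.
    Uppercase hex digits are an assumption of this sketch.
    """
    if not 0 <= codepoint <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (codepoint,))
    if codepoint <= 0xFFFF:
        return "\\u%04X" % codepoint
    return "\\U%08X" % codepoint


def triple_line(subject: str, predicate: str, obj: str) -> str:
    """Emit one triple with exactly one space following every term.

    The three arguments are assumed to be already-serialized terms
    (for example '<http://example/s>'); no other whitespace is added.
    """
    return "".join(term + " " for term in (subject, predicate, obj)) + "."
```

With these helpers, uchar_escape(0xE9) gives \u00E9 and uchar_escape(0x1F600) gives \U0001F600, while triple_line always puts a single space before the closing "." because the object is a term like any other.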
Received on Monday, 15 July 2013 20:28:22 UTC