Omissions, Errors, and Misleading Prose in the N-Triples Specification

Date: Fri, 5 Sep 2003
All section references are to rdf-testcases [1].

1) Section 3.2: "[t]he characters outside the US-ASCII range are made
available by \-escape sequences as follows". However, some of the
characters in the table are *inside* the US-ASCII range, namely #x5C,
#x22, #x0A, #x0D, and #x09.

2) The prose does not clearly state that #x0A, #x0D, and #x09 need to
be escaped; the only indication is that they are excluded from the
character production of section 3.1.

3) #x5C and #x22 (backslash and double quote) are not excluded from
strings by the grammar, and no prose clearly excludes them either. It
is therefore never stated that they must be escaped within literals.
This means that "\x" is a valid N-Triples literal, while "\" and """
are ambiguous and possibly valid.
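
To illustrate, here is a minimal Python sketch of a decoder that
accepts only the escapes listed in the section 3.2 table. The names
unescape_strict, SIMPLE_ESCAPES, and ESCAPE_RE are hypothetical. The
choices to reject unlisted sequences such as "\x", and to accept only
upper-case hex digits, are my assumptions; they are exactly the kind
of decision the specification leaves open.

```python
import re

# Single-character escapes from the section 3.2 table.
SIMPLE_ESCAPES = {"\\": "\\", '"': '"', "n": "\n", "r": "\r", "t": "\t"}

# A backslash followed by a \u or \U escape, by any one character,
# or by nothing at all (a backslash at the very end of the input).
ESCAPE_RE = re.compile(r"\\(u[0-9A-F]{4}|U[0-9A-F]{8}|.)?", re.DOTALL)


def unescape_strict(body):
    """Decode the text found between the quotes of a literal.

    Assumption: any backslash sequence that is not in the table is an
    error. A bare double quote is passed through unchanged, because
    the caller is assumed to have already located the closing quote.
    """
    def replace(match):
        token = match.group(1)
        if token is None:
            raise ValueError("trailing backslash")
        if token[0] in "uU" and len(token) > 1:
            return chr(int(token[1:], 16))
        if token in SIMPLE_ESCAPES:
            return SIMPLE_ESCAPES[token]
        raise ValueError("unknown escape: \\" + token)

    return ESCAPE_RE.sub(replace, body)
```

Under these assumptions unescape_strict(r"a\tb") yields "a", a tab,
and "b", while unescape_strict(r"\x") raises ValueError. A lenient
parser could equally keep "\x" as two literal characters, and the
current text does not say which behaviour is correct.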

4) "#X5C" in the table in section 3.2 should be "#x5C".

5) Since, as stated in section 3.1, the employed "EBNF cannot perform
the counting required by the Primary-subtag and Subtag productions",
perhaps it would be useful to either a) switch to an EBNF that *can*
perform the counting, or b) note the counting in prose, and state
whether conformant N-Triples parsers are required to perform such
counting.
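
As a rough sketch of what such counting could look like in an
implementation, the following Python check applies the RFC 3066
limits (Primary-subtag = 1*8ALPHA, Subtag = 1*8(ALPHA / DIGIT)) with
a bounded regular expression. The names LANGUAGE_TAG_RE and
is_counted_language_tag are hypothetical, and restricting the tag to
lower case is an assumption based on my reading of the N-Triples
language production.

```python
import re

# One to eight letters, then zero or more "-" subtags of one to eight
# letters or digits. Lower case only (assumption, see above).
LANGUAGE_TAG_RE = re.compile(r"[a-z]{1,8}(?:-[a-z0-9]{1,8})*")


def is_counted_language_tag(tag):
    """Return True if every subtag respects the 1-to-8 length limit."""
    return LANGUAGE_TAG_RE.fullmatch(tag) is not None
```

With this check "en" and "en-us" pass, while "abcdefghi" (nine
letters) and "en-" (empty subtag) fail. Whether a conformant parser
must reject the latter two is the open question.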

6) Conformance levels are not clearly specified. Does a conformant
N-Triples parser have to fully check URI syntax, for example? Does it
have to perform Primary-subtag and Subtag counting?

7) It is not clear whether the absoluteURI production in N-Triples
exactly matches (or imports) the absoluteURI production from RFC 2396,
though the RFC is cited.

8) Section 3.3: "[c]haracters above the US-ASCII range are made
available by the \u or \U escapes". I am aware that this has been
raised before, but this section should be removed, and UTF-8 + %HH
encoding or non-US-ASCII characters used for consistency with the
IRI mechanism (being employed in, e.g., XPointer, XInclude, and XML
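
For comparison, the following Python sketch renders the same
non-ASCII text both ways: once with the \u and \U escapes of section
3.3, and once as UTF-8 bytes written with %HH escapes. The function
names ntriples_escape and percent_encode are hypothetical, and both
helpers deliberately leave every US-ASCII character untouched, so
neither is a complete serializer.

```python
def ntriples_escape(text):
    """Rewrite non-ASCII characters using the u/U escapes of section 3.3."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # US-ASCII: left as-is
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)     # four hex digits
        else:
            out.append("\\U%08X" % cp)     # eight hex digits
    return "".join(out)


def percent_encode(text):
    """Encode as UTF-8, then write each non-ASCII byte as %HH."""
    return "".join(
        chr(b) if b < 0x80 else "%%%02X" % b
        for b in text.encode("utf-8")
    )
```

For the string "caf" followed by U+00E9, ntriples_escape gives
caf\u00E9 and percent_encode gives caf%C3%A9. The second form is
already a plain US-ASCII URI reference and needs no N-Triples-specific
unescaping step.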

9) Please indicate whether a charset parameter may, or must not, be
used in conjunction with the text/plain MIME type, since according to
section 3.1 the only allowed encoding is US-ASCII.

Note that many of the comments above are based on implementation
experience in building a Python RDF API that includes N-Triples


[1] http://www.w3.org/TR/rdf-testcases/
- W3C Working Draft 23 January 2003

