Review of N-triples in test cases from Graham Klyne on 2002-11-10 (w3c-rdfcore-wg@w3.org from November 2002)

From: Graham Klyne <GK@NineByNine.org>
Date: Sun, 10 Nov 2002 10:31:43 +0000
To: RDF core WG <w3c-rdfcore-wg@w3.org>
Message-Id: <5.1.0.14.2.20021110101530.00acc0e0@127.0.0.1>
I've just looked through the N-triples section of the test cases draft 
[1].  I think it's fine to publish as WD.  I do have some small comments 
for consideration,

[1] http://www.w3.org/2001/08/rdf-test/#ntriples

...

Section 3:
[[
NOTE: N-Triples is not an user RDF syntax - it is intended for RDF Core WG 
testing purposes and checking RDF applications for conformance with the 
specifications.
]]

I'm not sure what is meant here by "user RDF syntax" -- I could claim the 
same applies to RDF/XML.  Suggest something like:
[[
N-Triples is not recommended for general exchange of RDF data - ...
]]

...

Section 3.1:

[[
language ::= ( character - ( '.' | ws ) )+
and containing any allowed xml:lang content as defined in
http://www.w3.org/TR/REC-xml#sec-lang-tag
]]

The syntax production here is very generalized compared with the RFC3066 
syntax productions (and RFC1766 before it, as cited by the cited 
document).  I suggest either:
(a) don't give any syntax production, just cite the REC-xml section, OR
(b) give a syntax that matches the RFC3066 production, which in ABNF is:
[[
    The language tag is composed of one or more parts: A primary language
    subtag and a (possibly empty) series of subsequent subtags.

    The syntax of this tag in ABNF [RFC 2234] is:

     Language-Tag = Primary-subtag *( "-" Subtag )

     Primary-subtag = 1*8ALPHA

     Subtag = 1*8(ALPHA / DIGIT)

    The productions ALPHA and DIGIT are imported from RFC 2234; they
    denote respectively the characters A to Z in upper or lower case and
    the digits from 0 to 9.  The character "-" is HYPHEN-MINUS (ABNF:
    %x2D).
]]
-- http://www.ietf.org/rfc/rfc3066.txt

(I'm not sure offhand if the XML syntax notation can do the counted 
sequence productions.)

...

Section 3.2:

Given the restricted use for N-triples, per note in section 3, this 
paragraph seems out of place:
[[
It is recommended but not required that the resulting Unicode character 
string be made available to applications in UTF-8 encoding.
]]
I suggest dropping it.

...

Section 3.3:

This read oddly to me:
[[
URI references are sequences of US-ASCII character productions as defined 
in [RFC 2396] or must result in a URI reference after the standard escaping 
procedure is applied. The procedure is applied when passing the URI 
reference to a URI resolver. The standard escaping procedure is described 
in [RFC 2396] using UTF-8 as the character encoding.
]]

Hmmm... when does N-triples call for a URIref to be passed to a resolver?

I'm not sure exactly what is intended here, but I think it was something 
like this:
[[
URI references are sequences of US-ASCII character productions as defined 
in [RFC 2396] for a URI character sequence.  Where the original URIref 
contains characters not allowed in such a sequence, the standard escaping 
procedure described in [RFC 2396] using UTF-8 as the character encoding is 
applied, using UTF-8 as the character encoding.
]]

...

#g


-------------------
Graham Klyne
<GK@NineByNine.org>
Received on Sunday, 10 November 2002 05:30:27 UTC