W3C home > Mailing lists > Public > w3c-rdfcore-wg@w3.org > July 2001

N-triples (1.4)

From: Graham Klyne <GK@NineByNine.org>
Date: Wed, 18 Jul 2001 18:53:33 +0100
Message-Id: <5.1.0.14.2.20010718183345.00a04930@joy.songbird.com>
To: RDF core WG <w3c-rdfcore-wg@w3.org>
Re: http://www.w3.org/2001/sw/RDFCore/ntriples/


1. Is it intended that a name can start with a digit?
   name ::= [A-Za-z0-9]+

As far as I can tell, it's used only in anonNode


2. I think this violates the syntax notation:
   absoluteURI ::= [^ < > space]+

As far as I can tell, http://www.w3.org/TR/REC-xml#sec-notation allows only 
literal characters or #xN character codes within the [...] construct.  I 
suggest:
   absoluteURI ::= ( [^<>] - space )+


3. Outstanding issue 1

N-Triples is a text/plain MIME type format - consider character and 
encoding issues with requirement to be able to express all Unicode chars.

That much is easy, I think.  Use:
   Content-type: text/plain;charset=utf-8


4. Outstanding issue 2

Consider adding \#xHEX escaping to allow N-Triples to encode Unicode 
characters in text/plain.  Or after Python: \uxxxx and \Uxxxxxxxx

I suppose that, for completeness, we must.  Escaping is a topic that can be 
far trickier than it seems it should be for such a "simple" 
purpose.  Following your Python reference, I see they've tightened up the 
Python spec.  I think these are probably the right choices for us:
   \uxxxx      Character with 16-bit hex value xxxx (Unicode only)
   \Uxxxxxxxx  Character with 32-bit hex value xxxxxxxx (Unicode only)

One might also consider allowing:
   \xhh        ASCII character with hex value hh
for 'hh' in the range '00' to '7F' (i.e. where Unicode code points match 
US-ASCII).


5. eoln format

Being a MIME text/plain format, the cr in eoln should not be optional:
   eoln ::= cr? lf
should read:
   eoln ::= cr lf


#g


------------
Graham Klyne
(GK@ACM.ORG)
Received on Wednesday, 18 July 2001 14:07:22 EDT

This archive was generated by hypermail pre-2.1.9 : Wednesday, 3 September 2003 09:38:12 EDT