N-Triples Unicode Ambiguity

Agenda AOB material? Or can this be dealt with on-list?

----- Forwarded message from "Sean B. Palmer" <sean@mysterylights.com> -----

From: "Sean B. Palmer" <sean@mysterylights.com>
Date: Fri, 5 Sep 2003 02:36:13 +0100
To: RDF Comments <www-rdf-comments@w3.org>
Subject: N-Triples Unicode Ambiguity
Message-ID: <01cd01c3734e$183e8660$1054ff3e@z5n9x1>
Resent-From: www-rdf-comments@w3.org
Resent-Date: Thu,  4 Sep 2003 21:40:02 -0400 (EDT)


Section 3.2 of rdf-testcases [1] states:

   \UHHHHHHHH
       8 required hexadecimal digits HHHHHHHH encoding
       character [#x10000-#x10FFFF]

Which implies that any code point over U+10FFFF cannot be represented
in an N-Triples string, unless encoded as a surrogate block. However,
the test.nt N-Triples test file [2] referenced from rdf-testcases
contains the following literal production instances:

   "\U001FFFFF" # resource18
   "\U03FFFFFF" # resource19
   "\U7FFFFFFF" # resource20

Each of these is greater than U+10FFFF. Is rdf-testcases in
error, or test.nt, or neither?
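
The mismatch can be checked mechanically. The following is a minimal
Python sketch, not part of either document: the function name
out_of_range_escapes and the constants are hypothetical, and the range
is simply the one quoted from Section 3.2 above. It ignores escaped
backslashes, so it is suitable only for simple literals like the three
listed.

```python
import re

# Range that the quoted grammar gives for \UHHHHHHHH escapes (assumption:
# taken verbatim from the Section 3.2 text quoted above).
MIN_CODE_POINT = 0x10000
MAX_CODE_POINT = 0x10FFFF

# Matches a backslash, an uppercase U, and exactly 8 hexadecimal digits.
LONG_ESCAPE = re.compile(r"\\U([0-9A-Fa-f]{8})")


def out_of_range_escapes(literal):
    """Return the 8-digit escapes in `literal` that fall outside the quoted range."""
    bad = []
    for match in LONG_ESCAPE.finditer(literal):
        value = int(match.group(1), 16)
        if not MIN_CODE_POINT <= value <= MAX_CODE_POINT:
            bad.append(match.group(0))
    return bad


# The three literals from test.nt, plus the largest in-range value for contrast.
for literal in (r"\U001FFFFF", r"\U03FFFFFF", r"\U7FFFFFFF", r"\U0010FFFF"):
    print(literal, out_of_range_escapes(literal))
```

Under the quoted range, each of the three test.nt literals is reported
as out of range, while \U0010FFFF is not.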

[1] http://www.w3.org/TR/rdf-testcases/#ntrip_strings
- 3.2 Strings. W3C Working Draft 23 January 2003
[2] http://www.w3.org/2000/10/rdf-tests/rdfcore/ntriples/test.nt
- $Id: test.nt,v 1.6 2003/08/03 16:07:09 dbeckett2 Exp $

--
Sean B. Palmer, http://purl.org/net/sbp/
"phenomicity by the bucketful" - http://miscoranda.com/

----- End forwarded message -----

Received on Friday, 5 September 2003 04:35:58 UTC