Re: editorial buglet in n-triples from Dave Beckett on 2003-06-04 (www-rdf-comments@w3.org from April to June 2003)

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Wed, 4 Jun 2003 16:32:11 +0100
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Cc: www-rdf-comments@w3.org
Message-Id: <20030604163211.0f7be80c.dave.beckett@bristol.ac.uk>

On Tue, 27 May 2003 14:40:07 -0400 (EDT)
"Peter F. Patel-Schneider" <pfps@research.bell-labs.com> wrote:

> 
> The N-Triples strings table, as written, translates '\' to '\', because '\'
> is in the US-ASCII range.

I will change
  "characters outside the US-ASCII range are made available by
  \-escape sequences as follows:"
to
  "characters outside the US-ASCII range and some specific characters
   are made available by \-escape sequences as follows:"


> It would probably be better to have an explicit translation for all
> characters.


This would be possible, but the result would be rather longer than the
current table.

Something like:

--------------------------------------------------
Unicode characters        N-Triples encoding
(with code point <em>u</em>)

[#x0-#x8]                 \u<em>HHHH</em>
                          4 required hexadecimal digits <em>HHHH</em>
                          encoding Unicode character <em>u</em>

#x9                       \t

#xA                       \n

[#xB-#xC]                 \u<em>HHHH</em>
                          4 required hexadecimal digits <em>HHHH</em>
                          encoding Unicode character <em>u</em>

#xD                       \r

[#xE-#x1F]                \u<em>HHHH</em>
                          4 required hexadecimal digits <em>HHHH</em>
                          encoding Unicode character <em>u</em>

[#x20-#x21]               the character <em>u</em>

#x22                      \"

[#x23-#x5B]               the character <em>u</em>

#x5C                      \\

[#x5D-#x7E]               the character <em>u</em>

[#x7F-#xFFFF]             \u<em>HHHH</em>
                          4 required hexadecimal digits <em>HHHH</em>
                          encoding Unicode character <em>u</em>

[#x10000-#x10FFFF]        \U<em>HHHHHHHH</em>
                          8 required hexadecimal digits <em>HHHHHHHH</em>
                          encoding Unicode character <em>u</em>

--------------------------------------------------

This remains a 1-to-1 encoding.
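
As a rough illustration of the table, here is a minimal Python sketch of
the encoding and its inverse. The names ntriples_escape and
ntriples_unescape are made up for this example and do not come from any
N-Triples library; the sketch assumes upper-case hexadecimal digits and
does not try to handle lone surrogate code points.

```python
import re

# Short escapes from the table; every other code point is either written
# literally, as \uHHHH, or as \UHHHHHHHH.
_SHORT = {0x9: "\\t", 0xA: "\\n", 0xD: "\\r", 0x22: '\\"', 0x5C: "\\\\"}


def ntriples_escape(text):
    """Encode a Unicode string row by row as in the table (hypothetical helper)."""
    out = []
    for ch in text:
        u = ord(ch)
        if u in _SHORT:
            out.append(_SHORT[u])
        elif 0x20 <= u <= 0x7E:
            out.append(ch)                # printable US-ASCII passes through
        elif u <= 0xFFFF:
            out.append("\\u%04X" % u)     # controls, #x7F and the rest up to #xFFFF
        else:
            out.append("\\U%08X" % u)     # #x10000-#x10FFFF
    return "".join(out)


_UNSHORT = {"t": "\t", "n": "\n", "r": "\r", '"': '"', "\\": "\\"}
_ESCAPE = re.compile(r'\\(?:u([0-9A-F]{4})|U([0-9A-F]{8})|([tnr"\\]))')


def ntriples_unescape(encoded):
    """Inverse of ntriples_escape, used here only to check the round trip."""
    def repl(match):
        if match.group(1):
            return chr(int(match.group(1), 16))
        if match.group(2):
            return chr(int(match.group(2), 16))
        return _UNSHORT[match.group(3)]
    return _ESCAPE.sub(repl, encoded)
```

With these definitions, ntriples_unescape(ntriples_escape(s)) == s should
hold for any string s of Unicode characters, which is the 1-to-1 property
in practice.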

Dave

Received on Wednesday, 4 June 2003 11:34:28 UTC