
Re: validator/htdocs/sgml-lib xml.dcl

From: Martin Duerst <duerst@w3.org>
Date: Tue, 20 Aug 2002 12:10:32 +0900
Message-Id: <4.2.0.58.J.20020820115112.03c838a0@localhost>
To: Terje Bless <link@abyss.w3.org>, www-validator@w3.org

At 20:56 02/08/19 -0400, Terje Bless wrote:

>Update of /sources/public/validator/htdocs/sgml-lib
>In directory rux.w3.org:/temp/tmp/cvs-serv21396
>
>Modified Files:
>       Tag: validator-0_6_0
>         xml.dcl
>Log Message:
>Update XML Declartion to include full UNICODE range.

Hello Terje,

I'm sorry, but I have to disagree with this change.

Here's the diff:
http://dev.w3.org/cvsweb/validator/htdocs/sgml-lib/xml.dcl.diff?r1=1.1.2.1&r2=1.1.2.2

===================================================================
RCS file: /sources/public/validator/htdocs/sgml-lib/xml.dcl,v
retrieving revision 1.1.2.1
retrieving revision 1.1.2.2
diff -u -r1.1.2.1 -r1.1.2.2
--- validator/htdocs/sgml-lib/xml.dcl   2002/07/05 22:26:17     1.1.2.1
+++ validator/htdocs/sgml-lib/xml.dcl   2002/08/20 00:56:16     1.1.2.2
@@ -25,10 +25,13 @@
                 127        1  UNUSED
                 128       32  UNUSED
                 160    55136     160
-             55296     2048  UNUSED  -- surrogates --
+             55296     2048  UNUSED -- surrogates --
               57344     8190   57344
-             65534        2  UNUSED  -- FFFE and FFFF --
+             65534        2  UNUSED -- FFFE and FFFF --
               65536  1048576   65536
+           1114112 14680064 1114112 -- Outside BMP --
+
+

       CAPACITY NONE  -- Capacities are not restricted in XML --
===================================================================

However, the line that designates the 16 planes outside the
BMP allowed by Unicode and addressable in UTF-16 is this one:

               65536  1048576   65536 -- 16 planes outside BMP --
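
The arithmetic behind that entry can be checked with a short sketch. It assumes the usual reading of a character-set entry in an SGML declaration as (first character number, number of characters, base number); the helper name describe_range is hypothetical and is not part of the validator.

```python
PLANE_SIZE = 0x10000  # 65536 code points per Unicode plane


def describe_range(start, count):
    """Return (last code point, number of whole planes) for a declared range."""
    last = start + count - 1
    planes, remainder = divmod(count, PLANE_SIZE)
    if remainder != 0:
        raise ValueError("range is not a whole number of planes")
    return last, planes


# The existing entry: starts at 65536 and declares 1048576 characters.
last, planes = describe_range(65536, 1048576)
print(hex(last), planes)  # 0x10ffff 16
```

So the existing entry already runs up to 10FFFF and accounts for all 16 planes beyond the BMP.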

The line you added

+           1114112 14680064 1114112 -- Outside BMP --

designates some 240 (why 240?) additional planes that neither ISO/IEC
SC2/WG2 (responsible for ISO 10646) nor the Unicode Consortium
plan to use in any way, and that are clearly and strictly not
allowed in XML (see http://www.w3.org/TR/REC-xml#charsets,
production [2]; 10FFFF converted to decimal is 1114111).
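
As a rough check of the above, here is a sketch that tests the added entry against the Char production of XML 1.0. The range table is transcribed from production [2]; the names XML_CHAR_RANGES and is_xml_char are made up for illustration. By this arithmetic the added entry spans 224 whole planes, so the "some 240" above reads as a round estimate.

```python
# Ranges permitted by production [2] (Char) of XML 1.0.
XML_CHAR_RANGES = [
    (0x9, 0x9), (0xA, 0xA), (0xD, 0xD),
    (0x20, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
]


def is_xml_char(cp):
    """True if the code point is allowed by the XML 1.0 Char production."""
    return any(lo <= cp <= hi for lo, hi in XML_CHAR_RANGES)


# The added entry: starts at 1114112 and declares 14680064 characters.
added_start, added_count = 1114112, 14680064
added_last = added_start + added_count - 1

print(hex(added_start), hex(added_last))  # 0x110000 0xf0ffff
print(added_count // 0x10000)             # 224
print(is_xml_char(added_start - 1), is_xml_char(added_start))  # True False
```

The last allowed character is 1114111 (10FFFF); the added entry begins one past it, so none of it falls inside the XML character range.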

If you got that from anywhere, please tell me where you got it
from.

I recommitted the change, but I'm not sure I got it into
the right branch.

Regards,    Martin.
Received on Monday, 19 August 2002 23:30:43 GMT
