Re: [Moderator Action] surrogates in xml

From: Martin J. Duerst <duerst@w3.org>
Date: Tue, 10 Oct 2000 12:50:06 +0900
To: "Yves" <yves@opentag.com>, <www-international@w3.org>
At 00/10/09 23:30 -0400, Yves wrote:
>Hello Martin,
>Thanks, I think I understand better now:
>There is nothing special to do to encode surrogates for XML, we just apply
>the UTF encodings. But *once parsed*, the XML text (or tags) cannot include
>the high or low part of a surrogate as a single 'character'. The XML char
>definition talks about scalar values (UCS as coded character set) not
>encoded ones (encodings of UCS).
>And now I assume it also means we cannot have a surrogate pair coded as 2
>NCRs. For example: <U+D801,U+DC05> would be written "&#x10405;" not

Yes, exactly!
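
To make the arithmetic concrete, here is a minimal Python sketch of how the
pair in your example maps to a single numeric character reference. It is only
an illustration under stated assumptions: the function names
combine_surrogates and to_ncr are hypothetical, and to_ncr checks only that
the value is not a surrogate code point, not the full XML Char production.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into one Unicode scalar value."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: U+{high:04X}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: U+{low:04X}")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def to_ncr(scalar: int) -> str:
    """Format a scalar value as a hexadecimal numeric character reference.

    Assumption: only surrogate code points are rejected here; a real
    serializer would also check the rest of the XML Char production.
    """
    if 0xD800 <= scalar <= 0xDFFF:
        raise ValueError("a surrogate code point cannot be written as an NCR")
    return f"&#x{scalar:X};"


# The pair from the example above: <U+D801, U+DC05>
print(to_ncr(combine_surrogates(0xD801, 0xDC05)))  # &#x10405;
```

Each half of the pair, passed to to_ncr on its own, raises ValueError, which
matches the point that a lone surrogate is not a legal character after
parsing.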

Regards,   Martin.
Received on Monday, 9 October 2000 23:50:39 UTC