Re: [Moderator Action] surrogates in xml

From: Martin J. Duerst
Date: Tue, 10 Oct 2000
Message-Id: <>
To: "Yves" <yves@opentag.com>, <www-international@w3.org>

At 00/10/09 23:30 -0400, Yves wrote:
>Hello Martin,
>Thanks, I think I understand better now:
>There is nothing special to do to encode surrogates for XML, we just apply
>the UTF encodings. But *once parsed*, the XML text (or tags) cannot include
>the high or low part of a surrogate as single 'character'. The XML char
>definition talks about scalar values (UCS as coded character set) not
>encoded ones (encodings of UCS).
>And now I assume it also means we cannot have a surrogate pair coded as 2
>NCRs. For example: <U+D801,U+DC05> would be written "&#x10405;" not

Yes, exactly!
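
A minimal Python sketch of the rule above, for anyone who wants to check the arithmetic. The function names combine_surrogates and to_ncr are made up for this illustration and are not part of any XML library. It assumes the usual UTF-16 ranges (D800-DBFF for the high half, DC00-DFFF for the low half) and simply refuses to emit an NCR for a lone surrogate.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Return the scalar value represented by a UTF-16 surrogate pair."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: U+{high:04X}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: U+{low:04X}")
    # Each half carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def to_ncr(scalar: int) -> str:
    """Format a scalar value as a hexadecimal numeric character reference.

    Surrogate code points are rejected: an NCR names a scalar value,
    so one half of a pair cannot be written on its own.
    """
    if 0xD800 <= scalar <= 0xDFFF:
        raise ValueError(f"surrogate U+{scalar:04X} cannot be written as an NCR")
    if not 0 <= scalar <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    return f"&#x{scalar:X};"


# The example from the quoted message: <U+D801, U+DC05>
scalar = combine_surrogates(0xD801, 0xDC05)
print(to_ncr(scalar))                       # &#x10405;

# The encodings handle the character on their own terms:
print(chr(scalar).encode("utf-8").hex())      # f0909085
print(chr(scalar).encode("utf-16-be").hex())  # d801dc05
```

Calling to_ncr(0xD801) raises ValueError in this sketch, which mirrors the point that the pair must not be written as two separate NCRs.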

Regards,   Martin.
