- From: Martin J. Duerst <duerst@w3.org>
- Date: Tue, 10 Oct 2000 12:50:06 +0900
- To: "Yves" <yves@opentag.com>, <www-international@w3.org>
At 00/10/09 23:30 -0400, Yves wrote:
>Hello Martin,
>
>Thanks, I think I understand better now:
>
>There is nothing special to do to encode surrogates for XML, we just apply
>the UTF encodings. But *once parsed*, the XML text (or tags) cannot include
>the high or low part of a surrogate as single 'character'. The XML char
>definition talks about scalar values (UCS as coded character set) not
>encoded ones (encodings of UCS).
>
>And now I assume it also means we cannot have a surrogate pair coded as 2
>NCRs. For example: <U+D801,U+DC05> would be written "&#x10405;" not
>"&#xD801;&#xDC05;"?

Yes, exactly!

Regards, Martin.
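
For illustration only, here is a minimal Python sketch of the arithmetic behind the example above. It is not part of the original exchange. The function names are hypothetical; the surrogate ranges follow the UTF-16 definition and the character ranges follow the Char production in XML 1.0. It combines the pair <U+D801,U+DC05> into one scalar value, formats that value as a single hexadecimal NCR, and checks that a lone surrogate is not a legal XML character.

```python
def surrogate_pair_to_scalar(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one Unicode scalar value."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: U+{high:04X}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: U+{low:04X}")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def is_xml_char(code_point: int) -> bool:
    """Check a code point against the XML 1.0 Char production (surrogates excluded)."""
    return (
        code_point in (0x9, 0xA, 0xD)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def to_ncr(scalar: int) -> str:
    """Format a scalar value as a hexadecimal numeric character reference."""
    if not is_xml_char(scalar):
        raise ValueError(f"U+{scalar:04X} is not a legal XML character")
    return f"&#x{scalar:X};"


scalar = surrogate_pair_to_scalar(0xD801, 0xDC05)
print(to_ncr(scalar))        # &#x10405;
print(is_xml_char(0xD801))   # False: a lone surrogate cannot be written as an NCR
```

Calling to_ncr(0xD801) in this sketch raises ValueError, which mirrors the point confirmed above: the pair has to be written as one NCR for the scalar value, never as two NCRs for the surrogate halves.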
Received on Monday, 9 October 2000 23:50:39 UTC