- From: Martin Duerst <duerst@w3.org>
- Date: Sun, 19 Oct 2003 10:26:19 -0400
- To: "Ashok Malhotra" <ashokma@microsoft.com>, <w3c-i18n-ig@w3.org>, <w3c-xml-query-wg@w3.org>
- Cc: <public-qt-comments@w3.org>, msm@w3.org
Some additions:

At 16:09 03/10/17 -0400, Martin Duerst wrote:
>At 08:28 03/10/16 -0700, Ashok Malhotra wrote:
>>The XML Schema WG asked that the references to Unicode be consistent. See
>><http://lists.w3.org/Archives/Public/public-qt-comments/2003Aug/0003.html>
>
>I have looked through this document. I have not found any such request.
>(I have copied Michael Sperberg-McQueen, maybe he can help)
>The closest I have found is
>http://www.w3.org/XML/Group/2003/07/xmlschema-fo-comments.html#d0e327
>This refers to the fact that Unicode 2.0 and Unicode 3.0 do not
>clearly outlaw encoding of non-BMP characters in six bytes (using
>two surrogate codepoints).
>
>If the intent of the XML Schema WG was to request that XML Query
>should in any way tolerate such encoding, then this would not
>be appropriate. XML 1.0 references RFC 2279 (see
>http://www.w3.org/TR/REC-xml#sec-external-ent), which is already
>clear that non-BMP characters have to be encoded as four bytes.
>This would also be a bad idea given that there are known security
>problems connected with overlong UTF-8 byte sequences,

Please note the erratum http://www.w3.org/XML/xml-V10-2e-errata#E27,
which also makes clear that XML doesn't allow overlong/irregular
encodings.

Regards, Martin.
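
For illustration only, the difference between the four-byte form and the six-byte (two-surrogate) form can be sketched in Python. This is a minimal sketch, not part of any specification: the helper name is_strict_utf8 and the sample character U+10000 are assumptions chosen for the example, and the check relies on the strict behavior of Python 3's built-in UTF-8 decoder, which rejects both surrogate-encoded and overlong sequences.

```python
def is_strict_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python 3's strict UTF-8 decoder.

    Hypothetical helper for illustration; the name is not from any spec.
    """
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# U+10000 is the first non-BMP character (sample value chosen for the example).
four_byte = "\U00010000".encode("utf-8")  # the regular four-byte form

# The same character written as two surrogate code points (U+D800, U+DC00),
# each encoded in three bytes: the six-byte form discussed above.
six_byte = b"\xed\xa0\x80\xed\xb0\x80"

# An overlong two-byte encoding of "/" (U+002F), which should be one byte.
overlong_slash = b"\xc0\xaf"

for label, data in [
    ("four-byte", four_byte),
    ("six-byte surrogate pair", six_byte),
    ("overlong slash", overlong_slash),
]:
    print(label, data.hex(), is_strict_utf8(data))
```

Under these assumptions, only the four-byte sequence is accepted; a decoder that also accepted the other two would let the same character appear in several byte forms, which is the source of the security problems mentioned above.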
Received on Sunday, 19 October 2003 10:42:30 UTC