- From: François Yergeau <francois@yergeau.com>
- Date: Thu, 24 Jun 2004 12:04:09 -0400
- To: Tim Bray <tbray@textuality.com>
- Cc: www-i18n-comments@w3.org
Hi Tim,

I was charged with contacting you about one of your comments on the Character Model. This particular comment (our number LC031) was about section 4.6 Character Escaping. You wrote:

=====================
Third EXAMPLE

This is incorrect. Within CDATA sections, &#xd801; is perfectly legal and just encodes a string of 8 ASCII characters. Outside of CDATA sections "&#xd801;" is illegal, but that's an XML thing, not a CDATA section thing.
=====================

The example in question reads:

=====================
EXAMPLE: XML defines 'CDATA sections' which allow escaping the syntax-significance of all characters between the CDATA section delimiters. CDATA sections do not allow the expression of unrepresentable characters and in fact prevent their expression using numeric character references.
=====================

We were not sure how to interpret your comment, since 'unrepresentable character' in the example doesn't refer to things like high surrogates, but to point 2 of the list just above the examples:

"2. expressing characters not representable in the character encoding chosen for an instance of the language, or".

A high surrogate such as #xd801 is not a character at all, so it cannot be what 'unrepresentable character' refers to. Instead, an unrepresentable character would be, for instance, a Chinese ideograph when the chosen encoding does not contain Chinese ideographs in its repertoire.

We tentatively decided to accept your comment as a request for clarification, since it would seem that it came from a misunderstanding, requiring Charmod to be clearer in this area.

I'm writing today to ask you whether we correctly interpreted your comment, or if there's something else we should take into account.

Regards,

-- 
François Yergeau
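
For readers of the archive, the distinction discussed above can be checked with a small sketch. It is not part of Charmod or of the original comment; it assumes Python's standard xml.etree.ElementTree parser (which uses expat), ISO-8859-1 as the "chosen encoding", and U+4E2D as an arbitrary example of a Chinese ideograph. The element name r and the helper surrogate_reference_is_rejected are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: U+4E2D stands in for "a Chinese ideograph"; ISO-8859-1 cannot encode it.
ideograph = "\u4e2d"

# Escape the unrepresentable character as a numeric character reference.
ncr = ideograph.encode("iso-8859-1", errors="xmlcharrefreplace")

prolog = b'<?xml version="1.0" encoding="iso-8859-1"?>'

# Outside a CDATA section the reference is expanded back to the ideograph.
outside = ET.fromstring(prolog + b"<r>" + ncr + b"</r>").text

# Inside a CDATA section the same bytes are plain text, so the ideograph
# cannot be expressed this way.
inside = ET.fromstring(prolog + b"<r><![CDATA[" + ncr + b"]]></r>").text


def surrogate_reference_is_rejected():
    """Return True if a reference to the surrogate code point D801 fails to parse."""
    try:
        ET.fromstring("<r>&#xd801;</r>")
    except ET.ParseError:
        return True
    return False


# Inside CDATA the surrogate "reference" is just eight ASCII characters.
surrogate_in_cdata = ET.fromstring("<r><![CDATA[&#xd801;]]></r>").text
```

Under these assumptions, outside holds the ideograph itself while inside holds the literal reference text, which is the sense in which CDATA sections prevent expressing unrepresentable characters. The surrogate case behaves differently: it is rejected outside CDATA because it does not name a character, independent of the document encoding.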
Received on Thursday, 24 June 2004 12:04:11 UTC