- From: Anne van Kesteren <annevk@opera.com>
- Date: Thu, 05 Feb 2009 10:14:10 +0100
- To: "Robert J Burns" <rob@robburns.com>
- Cc: "Aryeh Gregor" <Simetrical+w3c@gmail.com>, public-i18n-core@w3.org, jonathan@jfkew.plus.com, "W3C Style List" <www-style@w3.org>
On Wed, 04 Feb 2009 22:07:59 +0100, Robert J Burns <rob@robburns.com> wrote:
> [...] If you meant that XML is Unicode normalization agnostic in that it
> doesn't care (or know?) whether two canonically equivalent strings are a
> match then there I disagree with that. Unicode is fairly clear that two
> canonically equivalent strings are equivalent even if their code points
> differ.

That's what I mean. There are many different comparison algorithms. Unicode definitely does not make it non-conforming to compare two strings codepoint for codepoint. I'm not sure why you think it does.

>> The XML grammar is expressed in Unicode codepoints so comparison also
>> happens on that level.
>
> However Unicode has a SHOULD requirement that two canonically equivalent
> but codepoint differing strings match. Unicode's Chapter 3 (C6 norm)
> says:
>
>> A process shall not assume that the interpretations of two
>> canonical-equivalent character sequences are distinct.

I suggest reading all of C6. Martin Dürst already pointed out long ago that this does not always apply:

http://lists.w3.org/Archives/Public/www-style/2009Feb/0020.html

-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>
Received on Thursday, 5 February 2009 09:15:10 UTC
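
The two kinds of comparison discussed in the message above can be illustrated with a small sketch. It is not taken from the thread: the function names and the sample strings are assumptions chosen for illustration, and NFC is only one of several normalization forms that could be used for the second comparison.

```python
import unicodedata


def codepoint_equal(a: str, b: str) -> bool:
    """Compare two strings codepoint for codepoint (no normalization)."""
    return a == b


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC.

    Canonically equivalent strings have identical NFC forms, so this
    treats them as a match even when their code points differ.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Hypothetical sample data: the same name spelled two ways.
precomposed = "D\u00FCrst"   # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "Du\u0308rst"   # "u" followed by U+0308 COMBINING DIAERESIS

print(codepoint_equal(precomposed, decomposed))
print(canonically_equal(precomposed, decomposed))
```

The first function corresponds to comparison at the level of the XML grammar as described above; the second is the kind of matching a normalization-aware process might perform instead.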