- From: Robert J Burns <rob@robburns.com>
- Date: Wed, 4 Feb 2009 16:10:16 -0600
- To: "Anne van Kesteren" <annevk@opera.com>
- Cc: "Aryeh Gregor" <Simetrical+w3c@gmail.com>, public-i18n-core@w3.org, jonathan@jfkew.plus.com, "W3C Style List" <www-style@w3.org>
Hi Brad,

Brad Kemper <brad.kemper@gmail.com> wrote:

> Sent from my iPhone
>
> On Feb 4, 2009, at 1:07 PM, Robert J Burns <rob@robburns.com> wrote:
>
> > However Unicode has a SHOULD requirement that two canonically
> > equivalent but codepoint differing strings match. Unicode's Chapter
> > 3 (C6 norm) says:
> >
> >> A process shall not assume that the interpretations of two
> >> canonical-equivalent character sequences are distinct.
>
> Your interpretation adds something that your quoted text does not
> include. The quoted text does not include "but code point differing".
> It seems quite clear (at least when read in isolation from the rest of
> the spec) that its simply saying that two canonical-equivalent
> character sequences MAY not be distinct. If they are are not code
> point differing then they wouldn't be distinct. Otherwise they would
> be.

Certainly, there is something missing from the criterion there. However, your interpretation doesn't fill that in (I understand you're on an iPhone, but I still need to point that out). In other words, without the "adds something" in my interpretation, something else needs to be added to make sense of the Unicode C6 conformance norm.

I don't think we're interpreting this norm as saying only that two canonically equivalent character sequences that are also code point equivalent are not distinct. If that's all the criterion says, then why even mention canonical equivalence? The Unicode standard would simply say that "UAs must treat the equivalence of any character sequences the same as the code point equivalence for the underlying code point sequences". There would be no need to mention canonical equivalence. In fact, there would be no reason to even introduce the concept of canonical equivalence in the Unicode standard. An interpretation such as the one you're proposing strains credibility for me.
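
To make the distinction concrete, here is a minimal Python sketch of the two kinds of comparison under discussion. It is only an illustration: the helper names `code_point_equal` and `canonically_equivalent` are hypothetical, and comparing NFD forms is just one common way to test canonical equivalence, not something C6 itself prescribes.

```python
import unicodedata

# Two spellings of "é": one precomposed code point versus
# a base letter followed by a combining acute accent.
composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def code_point_equal(a: str, b: str) -> bool:
    """Hypothetical helper: plain code point by code point comparison."""
    return a == b


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare the canonical decompositions (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    print(code_point_equal(composed, decomposed))
    print(canonically_equivalent(composed, decomposed))
```

The two strings differ as code point sequences yet are canonically equivalent, which is the case where the two readings of C6 come apart.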

Granted, there may be interpretations other than the ones you and I have offered, and I welcome hearing those, but that's not really a credible one.

Take care,
Rob
Received on Wednesday, 4 February 2009 22:11:01 UTC