- From: Aryeh Gregor <Simetrical+w3c@gmail.com>
- Date: Tue, 3 Feb 2009 09:53:35 -0500
- To: Robert J Burns <rob@robburns.com>
- Cc: public-i18n-core@w3.org, jonathan@jfkew.plus.com, W3C Style List <www-style@w3.org>
On Tue, Feb 3, 2009 at 5:04 AM, Robert J Burns <rob@robburns.com> wrote:
> The problem with this is that there would have to be a prior agreement so
> that a Unicode processing application could count on everything received
> already as NFC and that's simply not the case. If a Unicode UA is incapable
> of processing NFD (which also implies it cannot process NFC characters that
> are combining characters) then it would be up to that application to convert
> internally to something it could handle (just what conversion it would do, I
> don't know).

Who's talking about a Unicode UA being unable to process NFD? The question on the table seems to be whether UAs should normalize all input to NFC when they parse it. This would permit them to process NFC, NFD, or any other normalized or non-normalized input. They would then probably end up sending responses like form data in NFC even if they received the original input in NFD. If the server prefers to use NFD internally, it's up to the server to then convert back to NFD on its end.

We aren't really talking about transmission formats here, AFAICT, or at least that wasn't the original question. The question is whether it's acceptable for browsers to internally normalize all input somehow (to NFC, NFD, whatever) as soon as it's received, so that they can ensure that they make correct comparisons according to the Unicode standard. This is relevant to CSS because it seems to be the best way of ensuring that CSS comparisons aren't normalization-sensitive. I'm not clear on what exactly the objections are to that, other than possibly violating the XML standard (it would be surprising to me if that did violate XML).

The only practical objection I can see is that some sites might be broken and not do normalization themselves.
You could have something like: a user registers with a name in NFD (or entirely unnormalized) in a non-normalizing browser -> site saves it to the database -> same user tries to log in later in a normalizing browser -> login fails because the site thinks the names are different. I don't know whether this would be a problem in practice.
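
To make the two points above concrete, here is a rough sketch in Python using the standard `unicodedata` module. It is only an illustration, not how any browser or site actually works: `normalize_on_parse`, `NaiveSite`, and the sample name are all hypothetical, and the "site" is just a set held in memory.

```python
import unicodedata


def normalize_on_parse(text: str, form: str = "NFC") -> str:
    """Hypothetical stand-in for a UA normalizing input as soon as it is received."""
    return unicodedata.normalize(form, text)


# The same visible name in two encodings (sample data, chosen for illustration).
nfc_name = "Andr\u00e9"    # precomposed U+00E9
nfd_name = "Andre\u0301"   # "e" followed by combining acute U+0301

# A raw code point comparison is normalization-sensitive.
raw_equal = nfc_name == nfd_name

# Comparing after normalizing both sides is not.
normalized_equal = normalize_on_parse(nfc_name) == normalize_on_parse(nfd_name)


class NaiveSite:
    """Hypothetical site that stores and compares names exactly as received."""

    def __init__(self) -> None:
        self.users: set[str] = set()

    def register(self, name: str) -> None:
        self.users.add(name)

    def login(self, name: str) -> bool:
        return name in self.users


site = NaiveSite()

# A non-normalizing browser submits the name as typed, in NFD.
site.register(nfd_name)

# Later, a normalizing browser sends the same name, now converted to NFC.
login_ok = site.login(normalize_on_parse(nfd_name))

# If the site normalized on its own end (to whatever form it prefers),
# the comparison would no longer depend on which browser was used.
site_side_ok = normalize_on_parse(nfd_name, "NFD") in {
    normalize_on_parse(u, "NFD") for u in site.users
}

print(raw_equal, normalized_equal, login_ok, site_side_ok)
```

The third value is the failure case described above: the site never normalized, so the NFC name sent at login does not match the NFD name it stored.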
Received on Tuesday, 3 February 2009 14:54:10 UTC