RE: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization

The question here is one of interpretation. Anne points out that, at least theoretically, it is possible to create XML document schemas that define two semantically identical names that are encoded using different code point sequences. This, of course, is an Extremely Bad Idea, since, among other things, such a document might not survive transcoding to another character encoding or other forms of processing. Although Anne pointed to XML 1.1, XML 1.0 Fifth Edition in fact includes the same recommendations:

  http://www.w3.org/TR/xml/#sec-suggested-names
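
To make the problem concrete, here is a minimal sketch (in Python, purely for illustration) of two names that a reader would consider identical but that differ at the code point level. The name "café" and the helper same_name are hypothetical examples and are not taken from any spec or from this thread; NFC is used only as one possible normalization form.

```python
import unicodedata

# Two spellings of the same hypothetical name "café".
precomposed = "caf\u00e9"   # ends in U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # ends in "e" + U+0301 COMBINING ACUTE ACCENT


def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC (illustrative helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain code point comparison treats the two spellings as different names.
raw_equal = precomposed == decomposed

# Comparing the NFC forms treats them as the same name.
nfc_equal = same_name(precomposed, decomposed)
```

A schema that relies on raw_equal being false is exactly the kind of document that may not survive a transcoder or editor that normalizes text along the way.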


The real question is: which feature is more important to preserve? The non-normalizability of XML names (which is deprecated anyway)?

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: Boris Zbarsky [mailto:bzbarsky@MIT.EDU]
> Sent: Monday, February 02, 2009 11:27 AM
> To: Mark Davis
> Cc: Phillips, Addison; public-i18n-core@w3.org; www-style@w3.org
> Subject: Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization
> 
> Mark Davis wrote:
> > When you are tokenizing, and then doing comparison, the simplest
> > approach is to normalize when creating the tokens.
> 
> At least one post to this thread has stated that doing that
> (parse-time normalization) is not acceptable.
> 
> Clearly either I'm missing something, or there is significant
> disagreement on what the acceptable behaviors are here...
> 
> -Boris

Received on Monday, 2 February 2009 21:05:26 UTC