RE: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization from Richard Ishida on 2009-01-30 (public-i18n-core@w3.org from January to March 2009)

From: Richard Ishida <ishida@w3.org>
Date: Fri, 30 Jan 2009 15:34:35 -0000
To: "'Anne van Kesteren'" <annevk@opera.com>, <public-i18n-core@w3.org>, <www-style@w3.org>
Message-ID: <005201c982f0$41f61ce0$c5e256a0$@org>

> It seems that having authoring tools that just output e.g. NFC would do
> the trick for those users.

Unfortunately it's not so simple. You can't predict or constrain which editing tools people use, and since languages like HTML and CSS are designed to be readable source that can be edited in simple text editors such as Notepad, you can't rely on the editor to do the right thing. A further complication is that a keyboard on a Mac (I'm talking *keyboard* here, not editor) naturally and consistently tends to produce a sequence of characters that a keyboard on a Windows PC will not.

In other cases, people may use a legacy encoding that doesn't support the same precomposed or decomposed character sequences that Unicode NFC produces (such as Windows CP1258 for Vietnamese), but the text should still be considered canonically equivalent once transcoded to Unicode for use in the user agent.
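
To make the point concrete, here is a minimal sketch, assuming Python's standard unicodedata module and its built-in cp1258 codec. The helper name canonically_equal and the sample character (Vietnamese e with circumflex and acute) are illustrative choices of mine, not taken from the tests mentioned in this thread. It shows two spellings of the same letter that fail a raw comparison, as a class name and a selector would in a browser that doesn't normalise, but match once both sides are normalised to NFC.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The same letter written two ways.
precomposed = "\u1ebf"       # U+1EBF, a single precomposed code point
decomposed = "\u00ea\u0301"  # U+00EA (e with circumflex) + U+0301 (combining acute)

# A code-point-by-code-point comparison treats them as different strings.
raw_match = precomposed == decomposed

# Normalizing both sides first makes them compare equal.
normalized_match = canonically_equal(precomposed, decomposed)

# CP1258 can carry the base letter plus a combining tone mark...
legacy_bytes = decomposed.encode("cp1258")
from_legacy = legacy_bytes.decode("cp1258")

# ...but has no slot for the fully precomposed character.
try:
    precomposed.encode("cp1258")
    precomposed_encodable = True
except UnicodeEncodeError:
    precomposed_encodable = False

print(raw_match, normalized_match, from_legacy == decomposed, precomposed_encodable)
```

So text transcoded from CP1258 arrives in a form that is not NFC, without the author or the editor having done anything wrong.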

RI


============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/



> -----Original Message-----
> From: Anne van Kesteren [mailto:annevk@opera.com]
> Sent: 30 January 2009 15:02
> To: Richard Ishida; public-i18n-core@w3.org; www-style@w3.org
> Subject: Re: [CSS21][css3-namespace][css3-page][css3-selectors][css3-content] Unicode Normalization
> 
> On Fri, 30 Jan 2009 15:54:45 +0100, Richard Ishida <ishida@w3.org> wrote:
> >> 1) Do browsers normalize currently?
> >
> > The emails below point to some tests and results that show that the
> > major browsers on XP currently don't normalise class names and selectors
> > before comparing.
> 
> Ok, that's what I expected.
> 
> 
> >> 2) Assuming they do not, who has complained?
> >
> > I don't think we can build the Web purely on the basis of what people
> > have complained about in the past.  The Web needs to be built in the
> > most accessible way possible, and we need to, where we can, anticipate
> > future issues.  In this case, these issues are likely to touch most on
> > developing parts of the world who probably haven't yet found their voice
> > to a large extent.  But I suspect they will, and I think we have a
> > responsibility to bear them in mind so that the Web can be used more
> > universally, at the same feature and usability levels.  Consider the
> > hoops we are asking them to jump through as alluded to in the mail
> > pointed to below.  Wouldn't you complain?
> 
> It seems that having authoring tools that just output e.g. NFC would do
> the trick for those users. Letting Unicode Normalization affect IDs, class
> names, and HTML parsing does not seem like a good idea at all. Especially
> not retroactively as that could mean that duplicate IDs arise in documents
> that do not have them per today's rules. As far as I can tell XML does not
> do this either.
> 
> 
> --
> Anne van Kesteren
> <http://annevankesteren.nl/>
> <http://www.opera.com/>

Received on Friday, 30 January 2009 15:34:38 UTC