RE: [selectors-api] Selectors API I18N Review...

I think Daniel was trying to illustrate the character sequence rather than saying that there is a bug in the renderer. On a Mac, he sees an e-acute. It is encoded as <0065 0301> rather than as <00E9>, at least in the filesystem. Either is acceptable in a Unicode text file: they are canonically equivalent, after all.
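To make the two encodings concrete, here is a minimal sketch using Python's standard unicodedata module. The variable names and the code_points helper are mine, for illustration only; nothing here reflects how any particular filesystem or browser is implemented.

```python
import unicodedata

# Two encodings of the same e-acute (names are illustrative only).
decomposed = "e\u0301"    # <0065 0301>: base letter + combining acute accent
precomposed = "\u00e9"    # <00E9>: single precomposed code point


def code_points(text):
    """Return the code points of text as space-separated hex values."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


# A plain code point comparison treats the two strings as different...
raw_equal = decomposed == precomposed

# ...but each normalization form maps both to one representation.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed
nfd_equal = unicodedata.normalize("NFD", precomposed) == decomposed

print(code_points(decomposed), "|", code_points(precomposed))
print(raw_equal, nfc_equal, nfd_equal)
```

The raw comparison fails while both normalized comparisons succeed, which is what canonical equivalence means in practice.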

The W3C has long recommended (or at least tried to recommend via CharMod-Norm) the use of NFC for interchange on the Web. Even absent any changes to Selectors, that recommendation is unlikely to go away. What we're discussing is whether specifications sensitive to normalization should deal with the fact that many systems (Macs are not alone in this) do not use NFC at all times or for all languages. 
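As a rough sketch of what "dealing with it" could look like, using Daniel's class name from the message quoted below: the function class_matches and its normalize flag are hypothetical, invented for this illustration. They are not part of Selectors, the Selectors API, or any browser.

```python
import unicodedata


def class_matches(selector_class, element_class, normalize=False):
    """Hypothetical matcher: compare two class names, optionally after NFC."""
    if normalize:
        selector_class = unicodedata.normalize("NFC", selector_class)
        element_class = unicodedata.normalize("NFC", element_class)
    return selector_class == element_class


# The stylesheet uses the precomposed form; the document uses the
# decomposed form, as might happen when the two are authored on
# different systems.
stylesheet_class = "barr\u00e9"    # ends in <00E9>
document_class = "barre\u0301"     # ends in <0065 0301>

print(class_matches(stylesheet_class, document_class))
print(class_matches(stylesheet_class, document_class, normalize=True))
```

Without normalization the names do not match and the style would not apply; with both sides normalized to NFC they do.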

Addison

Addison Phillips
Globalization Architect -- Lab126
Chair -- W3C Internationalization WG

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: public-i18n-core-request@w3.org [mailto:public-i18n-core-request@w3.org] On Behalf Of Thomas Phinney
> Sent: Tuesday, February 10, 2009 1:19 PM
> To: Mark Davis
> Cc: Daniel Glazman; Richard Ishida; Anne van Kesteren; Henri
> Sivonen; public-i18n-core@w3.org; www-style@w3.org; Peter Edberg;
> Deborah Goldsmith
> Subject: Re: [selectors-api] Selectors API I18N Review...
> 
> 
> What Daniel describes is likely the result of viewing NFD text, if the
> combination of viewing app, font and system-level support does not
> support composing that text by making the accent move onto the base
> letter (a.k.a. "mark attachment"). In other words, conversion to NFD
> can in some situations "break" the visual expression of text, and that
> is more likely with NFD conversion than with, say, NFC.
> 
> Broadly speaking, I would not suggest promoting NFD as a
> general-purpose recommended normalization form, for these sorts of
> reasons.
> 
> Without knowing any details (OS version, application + version, font +
> format + version), it is difficult to say much more than that. (I'll
> note in passing that Apple has been improving their support for mark
> attachment at the system level, adding OpenType support in addition to
> their previous AAT support. I gather there were still some issues in
> 10.5 that may have been improved in 10.6.)
> 
> Cheers,
> 
> T
> 
> 
> On Tue, Feb 10, 2009 at 12:52 PM, Mark Davis <mark.davis@icu-project.org> wrote:
> > You appear to be mistaken. The definition of Unicode Normalization does not
> > depend on the platform.
> >
> > Now, what you may mean is that the implementation of Unicode Normalization
> > on the Mac is different than on other programs. I would be quite surprised
> > to hear that (cc'ing also some Apple folks who would know).
> >
> > Alternately, you could mean that the Mac uses different Unicode
> > Normalization than you expect, since there are 4 different forms of Unicode
> > Normalization (NFC, NFD, NFKC, NFKD). I believe that Apple is using NFD in
> > the file system internally, but don't know whether or how this is apparent
> > to users. The Mac also does support NFC, however, which is the preferred
> > form for interchange.
> >
> > Mark
> >
> >
> > On Tue, Feb 10, 2009 at 12:11, Daniel Glazman
> > <daniel.glazman@disruptive-innovations.com> wrote:
> >>
> >> Richard Ishida wrote:
> >>>
> >>> Hi Anne,
> >>>
> >>> No, this is a theoretical outcome if some browsers did start normalizing.
> >>> I don't know of any that do at the moment - though I haven't exactly
> >>> scoured the whole list of UAs at this point.
> >>
> >>
> >> Sorry to interrupt this discussion, I originally missed it because it
> >> ended up in my spam folder for some strange reason.
> >>
> >> I am using a Mac. On Mac, Unicode normalization gives me e' for
> >> &eacute; while most other systems will give me é.
> >>
> >> So if I start authoring on my Mac, for instance, a document and have
> >> to use a corporate stylesheet made on a PC, both instances using
> >> class "barré", do I take the risk of having my styles not applied?
> >> If the answer is yes, it's unacceptable, even if the input method or
> >> the OS is guilty. The user should not have to worry about that and must
> >> be provided with a workable solution. A solution that chokes on acute
> >> e in French is not workable, even in the name of purity of the solution.
> >>
> >> </Daniel>
> >>
> >>
> >
> >

Received on Tuesday, 10 February 2009 21:28:28 UTC