Re: Unicode caseless matching details [I18N-ACTION-198] from John Daggett on 2013-04-17 (www-style@w3.org from April 2013)

From: John Daggett <jdaggett@mozilla.com>
Date: Tue, 16 Apr 2013 22:15:53 -0700 (PDT)
To: Addison Phillips <addison@lab126.com>
Cc: www-international@w3.org, "CSS WWW Style (www-style@w3.org)" <www-style@w3.org>
Message-ID: <51086352.7882085.1366175753587.JavaMail.root@mozilla.com>

Addison Phillips wrote:

> Basically, the thinking here was that, since font systems are
> somewhat diverse and fonts themselves use different encoded
> sequences, capitalizations, and other variations, this is a case in
> which both Unicode normalization and Unicode case folding are
> practical and justified. We would therefore recommend that you
> require Unicode NFC normalization and Unicode C+F case folding when
> comparing font names for selection. We think this is a special case
> because it is isolated and should have no side-effects on other
> parts of the Web, such as Selectors. It merely ensures that a given
> style sheet has the greatest likelihood of matching the intended
> font names as represented in the underlying system.

So normalization is now a requirement and only in the case of fonts?!?

I feel pretty strongly that this is *not* the right decision here. In
particular, the algorithm for matching font family names should not be
any different from the caseless matching used in other places on the
web platform.

Font family names are matched in two separate and distinct ways.  They
are matched against platform font family names where the name is taken
from data in the font.  They are also matched against family names
defined in @font-face rules, where the author controls both the
definition and use of the name and the name data contained in font
data is ignored.

In the case of font family name matching we're stuck supporting some
form of caseless matching because that's the way CSS was originally
defined: "arial" must match "Arial" or the web will break. In
practice, platform font *family* names generally aren't localized;
localization is mostly done in CJK fonts, where the scripts used are
caseless.  Font family names in @font-face rules are entirely part of
CSS and as such should follow the conventions of a general, web-wide
"Unicode caseless matching" algorithm.
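
To make the difference concrete, here is a rough Python sketch of the
two behaviours under discussion: plain caseless matching versus
caseless matching with NFC normalization added. The function names and
the sample family names are made up for illustration, and applying NFC
both before and after folding is just one possible reading of the
proposal, not anything taken from a spec.

```python
import unicodedata


def caseless_match(a: str, b: str) -> bool:
    """Plain caseless match: compare the full case foldings.

    str.casefold() applies Unicode full case folding (the C+F mappings).
    No normalization is performed.
    """
    return a.casefold() == b.casefold()


def normalized_caseless_match(a: str, b: str) -> bool:
    """Caseless match with NFC normalization (hypothetical variant).

    Assumption: normalize to NFC, case fold, then normalize again,
    since folding can produce a non-normalized string.
    """
    def key(s: str) -> str:
        folded = unicodedata.normalize("NFC", s).casefold()
        return unicodedata.normalize("NFC", folded)

    return key(a) == key(b)


# Made-up family names: one precomposed, one using a combining accent.
precomposed = "Caf\u00e9 Sans"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301 sans"    # 'e' followed by U+0301 COMBINING ACUTE

print(caseless_match("arial", "Arial"))
print(caseless_match(precomposed, decomposed))
print(normalized_caseless_match(precomposed, decomposed))
```

Both functions treat "arial" and "Arial" as equal. They differ only
for names whose code point sequences differ but are canonically
equivalent, which is exactly the case a fonts-only normalization
requirement would add.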

I think the CSS3 Fonts spec should only define something that's
aligned with what Unicode caseless matching on the web should be; I
don't think it should require a special-case, fonts-only matching
algorithm. Ideally, the Fonts spec could simply point to an algorithm
that defines "Unicode caseless matching for web content", but at
present that doesn't exist.

Regards,

John Daggett
CSS3 Fonts editor

Received on Wednesday, 17 April 2013 05:16:47 UTC