Re: Unicode Normalization from Benjamin Blanco on 2009-02-05 (public-i18n-core@w3.org from January to March 2009)

From: Benjamin Blanco <benjo316@gmail.com>
Date: Thu, 5 Feb 2009 08:02:10 -0600
To: Robert J Burns <rob@robburns.com>
Cc: Anne van Kesteren <annevk@opera.com>, Aryeh Gregor <Simetrical+w3c@gmail.com>, public-i18n-core@w3.org, jonathan@jfkew.plus.com, W3C Style List <www-style@w3.org>
Message-ID: <421e3c790902050602r32416884k607519d5ab434e1c@mail.gmail.com>

On Thu, Feb 5, 2009 at 1:06 AM, Robert J Burns <rob@robburns.com> wrote:

> Hi Benjamin,
> On Feb 4, 2009, at 9:17 PM, Benjamin wrote:
>
> Also, I can see a difference between the characters: the two brackets at
> the top and the one on the bottom left are duller, while the other three are
> sharper. This difference is apparent in both the browser and the text
> editor (not sure if it matters, though).
>
>
> I would say that is a bug in your font. Fonts, by using separate glyphs for
> canonically equivalent characters, contribute to the confusion authors face
> when creating content. The glyph distinctions lead authors to treat the
> characters as semantically distinct (which shouldn't happen). Fonts play an
> important role in this (on par with input systems) since the fonts control
> the glyphs used. For example if a font uses the same glyphs for "½" as the
> font maker uses for the compatibility equivalent sequence "1⁄2", this helps
> with Unicode authoring. It is remarkable how few font makers take the minimal
> amount of time necessary to do this. This is a similar problem to font/glyph
> issues outlined earlier by Andrew Cunningham with various African and
> Eastern languages.
>
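
To make the two kinds of equivalence mentioned above concrete, here is a minimal sketch using Python's standard unicodedata module. The choice of Python and the accented-letter example are assumptions made for illustration; only the "½" case comes from the message itself.

```python
import unicodedata

# Canonical equivalence: precomposed U+00E9 vs. "e" + U+0301 COMBINING ACUTE ACCENT.
precomposed = "\u00e9"
decomposed = "e\u0301"

# The raw code point sequences differ...
raw_equal = precomposed == decomposed
# ...but both normalize to the same NFC string.
nfc_equal = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)

# Compatibility equivalence: U+00BD VULGAR FRACTION ONE HALF vs.
# "1" + U+2044 FRACTION SLASH + "2".
half = "\u00bd"
half_sequence = "1\u20442"
nfc_half = unicodedata.normalize("NFC", half)    # canonical form leaves it alone
nfkc_half = unicodedata.normalize("NFKC", half)  # compatibility form expands it
```

Under these assumptions, canonical normalization (NFC) unifies the two spellings of the accented letter, while the fraction is only unified with its three-character sequence under compatibility normalization (NFKC).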

I've tried several different fonts, and they all render the glyphs
differently, despite canonical equivalence. I don't have any Microsoft or
Apple fonts (as far as I know), but I wonder if they handle this the same
way. Regardless, should I try to contact the authors of the various fonts
and point them to this list (or this thread, at least)? I'm sure they could
easily change the fonts to comply with the specification. The updated fonts
would probably be distributed automatically to anyone using a Linux
distribution, and I doubt the change would cause a mass outcry.

>
> My feeling is that these are the types of things we should NOT be expecting
> authors to deal with. The font makers should spend the extra time to
> understand these issues and design their fonts accordingly. The input system
> software developers should do the same. The parsers for XML, HTML, CSS,
> JavaScript and so on should normalize strings in such a way that authors
> never have to think about these issues with Unicode.
>
> Take care,
> Rob
>
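
As a rough sketch of what "normalize so authors never have to think about it" could look like, the snippet below compares two strings only after NFC normalization. The helper name canonically_equal and the sample class name are hypothetical, and this is not how any particular XML, HTML, or CSS parser is known to behave.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The same name typed two ways: precomposed letter vs. base letter + combining mark.
in_markup = "caf\u00e9"
in_stylesheet = "cafe\u0301"

# A plain code point comparison treats them as different names.
matches_raw = in_markup == in_stylesheet
# Normalizing first makes the match independent of how the author typed it.
matches_normalized = canonically_equal(in_markup, in_stylesheet)
```

A parser that compared names this way would match the two spellings without the author ever noticing the difference in input.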

The thing is, font makers are authors, too. Some may be in big corporations,
and others in their mother's basement. Who's to tell them about the Unicode
standard? I definitely think if they do know, they should try to understand,
etc. If you wanted a different rendering of a certain glyph, I assume you
would get a different font rather than use another glyph in the same font;
it would make sense to me, anyway.

Maybe if we had a way to contact all font authors, like this mailing list
(but for fonts), we could avoid widespread issues such as this. As it is
now, in order to fix the problem, I (or we) would have to contact many
authors separately. While it may not make sense for most other uses, I think
a central mailing list would be extremely useful for certain things, like
this. But I suppose this isn't really the best place to bring up this issue.

Received on Thursday, 5 February 2009 14:02:46 UTC