Re: numerals for microformats

On Jul 10, 2007, at 10:37 PM, Robert Burns wrote:

> This post led to confusion on the IRC, so let me clarify. Since the  
> draft currently supports several internationalized character  
> variants for the denominator punctuation character, my suggestion  
> was for the draft to also support the character variants of the  
> (ASCII  region) Indic-Arabic numerals (I think there are around  
> 23-29 sets of them depending on how you count them).  These all  
> share the same properties with the ASCII encoded numerals, but are  
> expected to display a glyph more appropriate  for the resident  
> script (why Unicode did this I have no idea).

OK, I have some thoughts on why Unicode did this, but I'm not sure
we're going to fix it with our recommendations. I imagine this was
done because font support and text layout support are not yet
sufficient, or sufficiently widely deployed, to handle glyph
substitution in language-specific or script-specific contexts. That
is, for scripts that use the same Indic-Arabic digits with different
glyphs, Unicode couldn't count on adequate software support in the
near term, so the characters are disunified (i.e., the characters
for 0 through 9 each appear 23 times in the BMP).
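
To make the "same properties" point concrete, here is a minimal sketch,
assuming Python and its standard unicodedata module (nothing the draft
itself specifies). The helper names digit_info and to_ascii_digits are
made up for illustration. It checks that the digit three from four
scripts carries the same general category (Nd) and the same decimal
value, and shows one possible way a parser could fold such digits to
ASCII.

```python
import unicodedata

# The digit three in ASCII, Arabic-Indic, Devanagari, and Thai.
THREES = ["\u0033", "\u0663", "\u0969", "\u0E53"]


def digit_info(ch):
    """Return (general category, decimal value) for a single character."""
    return unicodedata.category(ch), unicodedata.decimal(ch)


def to_ascii_digits(text):
    """Replace every Nd character with its ASCII digit; keep the rest as-is.

    Hypothetical helper: one way a parser might normalize numeral variants.
    """
    return "".join(
        str(unicodedata.decimal(ch)) if unicodedata.category(ch) == "Nd" else ch
        for ch in text
    )


# Every variant reports the same category and the same value.
infos = [digit_info(ch) for ch in THREES]

# Arabic-Indic digits with an ASCII hyphen, folded to ASCII digits.
folded = to_ascii_digits("\u0662\u0660\u0660\u0667-\u0660\u0667")
```

Only the glyph differs between the variants; the character properties a
parser would rely on are identical.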

Perhaps we want HTML to raise the bar here and expect better glyph  
layout and better OpenType or GX font support. Somehow that seems  
more ambitious than simply adding the support for these numeral  
character variants. I don't know enough about this to even know if  
OpenType has glyph substitution capabilities for the context of a  
script. I haven't heard of it, if it does.

The other general category numbers (Nl and No) are a different story.
Those characters have different semantics. The Suzhou (huāmǎ or
hangzhou) numerals, for example, are used in a different way than the
digits.
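
As a rough illustration of that difference, again assuming Python's
standard unicodedata module (the describe helper is a made-up name),
characters in Nl and No carry a numeric value but no decimal digit
value, so they cannot be treated as positional digits:

```python
import unicodedata


def describe(ch):
    """Return (general category, numeric value, decimal value or None)."""
    return (
        unicodedata.category(ch),
        unicodedata.numeric(ch),
        unicodedata.decimal(ch, None),  # None when no decimal digit value exists
    )


hangzhou_one = describe("\u3021")  # HANGZHOU NUMERAL ONE (Nl)
roman_twelve = describe("\u216B")  # ROMAN NUMERAL TWELVE (Nl)
one_half = describe("\u00BD")      # VULGAR FRACTION ONE HALF (No)
```

A parser that accepts only characters with a decimal value would reject
all three, which seems like the right behavior for a date or number
field.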

Take care,

Received on Wednesday, 11 July 2007 05:23:14 UTC