RE: big5 and big5-hkscs

From: Shawn Steele <Shawn.Steele@microsoft.com>
Date: Wed, 28 Mar 2012 17:14:46 +0000
To: Anne van Kesteren <annevk@opera.com>
CC: "www-archive@w3.org" <www-archive@w3.org>
Message-ID: <E14011F8737B524BB564B05FF748464A5B1B6036@TK5EX14MBXC139.redmond.corp.microsoft.com>
Ah, I didn't realize you were talking about the HKSCS code points, even though you clearly had HKSCS in the subject :)  Brain cramp. 

I'm not sure whether I have a mapping table from the PUA HKSCS code points to the real Unicode HKSCS code points. I'll see what I can find out.

-Shawn

-----Original Message-----
From: Anne van Kesteren [mailto:annevk@opera.com] 
Sent: Wednesday, March 28, 2012 10:12 AM
To: Shawn Steele
Cc: www-archive@w3.org
Subject: Re: big5 and big5-hkscs

On Wed, 28 Mar 2012 18:36:19 +0200, Shawn Steele <Shawn.Steele@microsoft.com> wrote:
> PUA == "Private Use Area", so people can show whatever glyphs they 
> want for whatever PUA code point they want.  It's more like per-font 
> or something than per-locale.  Different documents could use different 
> fonts to show different things.
>
> We map those to the Unicode PUA; there's no better Unicode code point.

Per
http://www.microsoft.com/download/en/details.aspx?DisplayLang=en&id=12080

that seems untrue. "Legacy Unicode-encoded HKSCS documents and data records must be converted to Unicode 4.1 in order to remove the PUA code points. Secondly, most Big5-encoded HKSCS documents and data records will need to be converted to Unicode 4.1 to work properly on Windows and to take advantage of new characters provided in HKSCS-2004."
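
To make "remove the PUA code points" concrete, here is a minimal sketch of how one might detect Private Use Area code points in already-decoded text. The ranges are the three Private Use ranges defined by Unicode. The helper names (`is_private_use`, `find_private_use`) and the sample string are hypothetical, chosen only for illustration. The sketch does not attempt the actual PUA-to-Unicode-4.1 HKSCS mapping, which would need the conversion table discussed in this thread.

```python
import unicodedata

# Private Use ranges defined by Unicode:
# the BMP Private Use Area plus the two supplementary private use planes.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(ch):
    """Return True if the single character ch is a private-use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def find_private_use(text):
    """Return (index, 'U+XXXX') for every private-use code point in text."""
    return [
        (i, "U+{:04X}".format(ord(ch)))
        for i, ch in enumerate(text)
        if is_private_use(ch)
    ]


# Hypothetical sample: an ASCII letter, a BMP PUA code point,
# a CJK ideograph, and a plane-15 private-use code point.
sample = "A\ue000\u4e2d\U000f0000"

for index, label in find_private_use(sample):
    # The general category "Co" is Unicode's own marker for private use,
    # so it serves as a cross-check on the hard-coded ranges.
    assert unicodedata.category(sample[index]) == "Co"
    print(index, label)
```

Run over a document decoded from a legacy HKSCS encoding, a check like this would only flag where PUA code points remain. What each one should become still depends on the mapping table.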


> FWIW: We have a mechanism where we allow "EUDC" characters to be 
> mapped.  The net result is that people can cause a specific font, of 
> their own creation, to be used as the fallback for the system for 
> those unknown PUA characters.  For a web site, that'd mean that if 
> they wanted to use the PUA, they'd either have to use a common 
> convention, or provide a font.  In either case I'd strongly recommend 
> that the web site developer use Unicode because, particularly in these 
> edge cases, the differences between implementations make it really hard 
> to be cross-platform.

Yes, but since big5-hkscs has code points in the same place as those PUA code points and Microsoft has shipped custom glyph mapping before for HKSCS (now claimed to be integrated in Windows), it does actually matter for other players how the default setup works in Windows for Hong Kong and Taiwan.


--
Anne van Kesteren
http://annevankesteren.nl/


Received on Wednesday, 28 March 2012 17:15:21 GMT
