Re: I18N issues for Widgets Spec [Was: Re: [Widgets] ASCII File names - request for comments]

<Feedback from one of the PC experts at IBM, Ken Borgendale
(kwb@us.ibm.com)>

It seems to me that the problem here is that MacOS has a non-conforming
implementation of zip.  My first suggestion would be to fix that problem.

On the other hand, the UTF-8 encoding carries a large amount of
redundancy, so if you only need to distinguish between Cp437 and UTF-8
you can determine the encoding correctly in almost all cases.  Any valid
UTF-8 sequence that is not 7-bit ASCII contains at least two adjacent
bytes > 0x7F: a lead byte > 0xBF followed by one or more continuation
bytes in the range 0x80-0xBF.  The simple rule would be: if the string is
valid UTF-8, process it as UTF-8, otherwise as Cp437.
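
The rule above could be sketched roughly as follows in Python.  This is
only an illustration, not anything taken from the Widgets spec: the
function name decode_zip_name is made up, and it assumes the caller
already has the raw file-name bytes from the zip entry.

```python
def decode_zip_name(raw: bytes) -> str:
    """Decode a zip entry name: UTF-8 if it validates, otherwise Cp437.

    Hypothetical helper for illustration; `raw` is assumed to be the
    undecoded file-name field of a zip entry.
    """
    try:
        # Strict decoding rejects any malformed UTF-8 sequence.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Python's cp437 codec maps every byte value, so this cannot fail.
        return raw.decode("cp437")
```

Pure ASCII names decode the same either way.  The "almost all cases"
caveat still applies: a Cp437 name whose high bytes happen to form
well-formed UTF-8 would be read as UTF-8.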
========

Best regards, Uma
V.S. UMAmaheswaran, Ph.D.
Globalization Centre of Competency, IBM Toronto Lab
A2/SZ8, 8200 Warden Avenue, Markham, ON, Canada, L6G1C7; +1 905 413 3474;
Fax:905 413 4682; TieLine 313-3474; email: umavs@ca.ibm.com

Received on Monday, 3 December 2007 21:15:23 UTC