
RE: Why is UTF8 not being taken up in Asia Pacific for Public Websites?

From: Addison Phillips [wM] <aphillips@webmethods.com>
Date: Fri, 16 May 2003 09:18:47 -0400
To: "LUNDER,BEN (HP-Australia,ex3) (by way of Martin Duerst <duerst@w3.org>)" <ben.lunder@hp.com>, <www-international@w3.org>
Message-ID: <PNEHIBAMBMLHDMJDDFLHKEJGGKAA.aphillips@webmethods.com>

Hi Ben,

Historically, browser support has been the leading reason for the lack
of Unicode uptake. Although you say that the major browsers have had support
for a long time, it really wasn't until Netscape 6 that UNIX users had a
browser that did UTF-8, and the half-life of browsers means that there are
still plenty of version 4.x browsers in use. If only 5% of your audience
can't view your page, well, that's not bad, right? But 5% of an audience of
1M is 50,000 users....

There are other reasons that Asian sites haven't switched to Unicode
presentation.

Authoring tools, databases, existing content, and so on are all generally in
a specific locally preferred encoding. Switching these to UTF-8 is a larger
project. Given that the legacy encodings are familiar to users in those
countries and meet their needs at least as well as UTF-8, there is no
incentive to switch.

The other reason is that many sites actually HAVE switched to Unicode...
internally. A lot of dynamic page technologies (like ASP or JSP) either use
Unicode internally or allow it during page composition, and only convert to
a legacy encoding on delivery of the page to the browser.
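
As a rough illustration of that "Unicode inside, legacy encoding on the
wire" pattern, here is a minimal Python sketch. It is not how ASP or JSP
implement it; the function names, the market codes, and the mapping table
are hypothetical, and only the encoding names come from this thread. The
page is composed as a Unicode string and bytes are produced only at the
delivery step.

```python
# Hypothetical mapping from target market to the legacy encodings
# mentioned in this thread; the market codes are an assumption.
LEGACY_ENCODINGS = {
    "cn": "gb2312",
    "tw": "big5",
    "jp": "shift_jis",
    "kr": "euc_kr",
}


def render_page(title: str, body: str) -> str:
    """Compose the page as a Unicode string; no byte encoding is chosen yet."""
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><p>{body}</p></body></html>"
    )


def deliver(page: str, market: str) -> tuple[str, bytes]:
    """Encode at the output boundary and report the charset actually used.

    Unknown markets fall back to UTF-8 (an assumption of this sketch).
    """
    charset = LEGACY_ENCODINGS.get(market, "utf-8")
    # Characters the legacy encoding cannot represent are emitted as
    # numeric character references instead of raising an error.
    payload = page.encode(charset, errors="xmlcharrefreplace")
    header = f"Content-Type: text/html; charset={charset}"
    return header, payload


if __name__ == "__main__":
    header, payload = deliver(render_page("テスト", "こんにちは"), "jp")
    print(header)
    print(len(payload), "bytes")
```

The design point is that the choice of wire encoding lives in one place
(deliver), so moving a market to UTF-8 later means changing a table entry
rather than touching the page composition code.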

All that said, it depends on your application and who your audience is.
webMethods' web administration tools (in all languages) are all delivered as
UTF-8; few users have noticed and none have complained. Our customers are
more concerned about the ability to deliver legacy-encoded pages to their
customers or trading partners, although this is generally due either to the
(in my experience unquantified) feeling that their customers would prefer
that, or to simple unfamiliarity with using Unicode.

There are cases where the legacy encoding is preferred for a specific
reason. Knowing what that reason is can be important. In other words, this
is becoming (but isn't quite yet) purely a cultural preference rather than a
technological issue.

Best Regards,

Addison

Addison P. Phillips
Director, Globalization Architecture
webMethods, Inc.

+1 408.962.5487 (phone)  +1 408.210.3569 (mobile)
-------------------------------------------------
Internationalization is an architecture.
It is not a feature.

Chair, W3C-I18N-WG Web Services Task Force
To participate see http://www.w3.org/International/ws

> -----Original Message-----
> From: www-international-request@w3.org
> [mailto:www-international-request@w3.org]On Behalf Of LUNDER,BEN
> (HP-Australia,ex3) (by way of Martin Duerst <duerst@w3.org>)
> Sent: Friday, May 16, 2003 6:43 AM
> To: www-international@w3.org
> Subject: Why is UTF8 not being taken up in Asia Pacific for Public
> Websites?
>
> Hello all,
>
> I was doing a little research into the possibility of presenting some of
> our North Asian sites in UTF8 and was wondering why UTF8 is not widely
> used in China, Korea, Japan and Taiwan for encoding of webpages.
>
> Almost all the commercial websites I have found, with the exception of a
> few, use the following encodings:
> China = gb2312
> Korea = euc_kr
> Japan = shift_jis
> Taiwan = big5
>
> Here is an example of some of my findings for usage of encodings on Public
> Websites.
> -------------------------------------------
> |Org.     |China |Taiwan| Japan     |Korea |
> -------------------------------------------
> |Epson    |gb2312| Big5 |Shift_JIS  |Euc_kr|
> |Samsung  |gb2312| Big5 |Shift_JIS  |Euc_kr|
> |Canon    |gb2312| Big5 |Shift_JIS  |Euc_kr|
> |Sony     |gb2312| Big5 |Shift_JIS  |Euc_kr|
> |IBM      |gb2312| Big5 |Shift_JIS  |Euc_kr|
> |Dell     |gb2312| Big5 |Shift_JIS  |Euc_kr|
> |Kyocera  |gb2312| Big5 |Shift_JIS  |Euc_kr|
> |Lexmark  |gb2312| Big5 |Shift_JIS  |Euc_kr|
> |Sun      |gb2312| Big5 |ISO-2022-jp|Euc_kr|
> |Toshiba  |gb2312| Big5 |Shift_JIS  |Euc_kr|
> |google   |utf-8 | utf-8|utf-8      |utf-8 |*
> -------------------------------------------
> *Google does not have a China website, but you can search for simplified
> Chinese pages
>
>
> Do you know why utf8 has not been taken up more quickly?
>
> I suspect it is because of difficulties with configuring older browsers
> to support UTF8, but I have not found specific recent evidence to support
> this suspicion.  To the contrary, major browsers have been officially
> supporting UTF8 for some time.
>
> I also suspect that there may be hard-to-find information about known
> issues with browsers when pages are encoded as UTF8 and the language is
> Traditional Chinese, Simplified Chinese, Japanese or Korean.  I have found
> some information on this mailing list relating to problems with Japanese
> and UTF-8 but not other languages.  If UTF8 supports all the characters in
> the gb2312, euc_kr, shift_jis and Big5 encoding schemes, then why is UTF8
> not being used more generally throughout Asia?
>
> If you know of a source(s) of information which answers my query I would
> be most appreciative.
>
> Kind regards,
>
> Benjamin Lunder
> Hewlett-Packard
Received on Friday, 16 May 2003 09:19:01 GMT
