Re: Help wanted: questions about Big5 and Big5-HKSCS

From: Yuan Chao <yuanchao@gmail.com>
Date: Fri, 13 Apr 2012 12:24:08 +0800
Message-ID: <CAADKi7kJeMY3H9-PE71PxXo0vDPhzM7pEKEpUmjk_jPhsBcW0g@mail.gmail.com>
To: Philip Jägenstedt <philipj@opera.com>, Kang-Hao Lu <kennyluck@w3.org>
Cc: Timothy Chien <timdream@gmail.com>, "public-html-ig-zh@w3.org" <public-html-ig-zh@w3.org>, Øistein E. Andersen <liszt@coq.no>, Anne van Kesteren <annevk@opera.com>
On Thu, Apr 12, 2012 at 7:20 PM, Philip Jägenstedt <philipj@opera.com> wrote:

>>> Regarding the overlap between Big5-HKSCS and Big5-UAO, we used the
>>> dotnetdotcom.org data to find pages that might be affected:
> I've added the original list of Big5 URLs from <dotnetdotcom.org > to
> contains 140 .hk URLs and 133 .tw URLs. In other words, the sample does not
> appear to be unfairly biased to HK, it was just that the pages using the
> conflicting byte sequences were mostly HK-related.
Sorry I didn't make that clear: what I meant is that the examples
with conflicting mappings are mostly HK-centric.
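To make "conflicting byte sequences" a bit more concrete, here is a rough Python sketch of how one might flag candidates in a page. The helper name and the lead-byte range are my own assumptions: I take 0xA1-0xF9 as the standard Big5 lead bytes and treat everything else as extension area, which ignores the extension rows inside 0xC6-0xC8. It only locates candidates; it does not say which variant the page actually intended.

```python
# Hypothetical helper, not taken from any browser or from the survey itself.
# Assumption: lead bytes 0xA1..0xF9 are "standard" Big5; other lead bytes
# (0x81..0xA0, 0xFA..0xFE) fall in areas that variants assign differently.
ASSUMED_STANDARD_LEAD = range(0xA1, 0xFA)  # 0xA1..0xF9 inclusive


def find_extension_sequences(data: bytes) -> list:
    """Return (offset, two-byte sequence) pairs with a non-standard lead byte."""
    hits = []
    i = 0
    while i < len(data):
        lead = data[i]
        # ASCII, 0x80, 0xFF, or a dangling final byte: treat as a single byte.
        if lead < 0x81 or lead == 0xFF or i + 1 >= len(data):
            i += 1
            continue
        if lead not in ASSUMED_STANDARD_LEAD:
            hits.append((i, data[i:i + 2]))
        i += 2  # consume lead byte and trail byte together
    return hits
```

For example, `find_extension_sequences(page_bytes)` on the raw bytes of a page labelled "big5" gives the offsets worth checking by hand.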

Let me summarize the argument:
based on data collected from dotnetdotcom.org, you found that
big5-hkscs gives the best mapping to Unicode. (Microsoft maps all of
the user extension area of big5 to the Private Use Area, PUA, of
Unicode.) So you suggest merging big5-hkscs into big5, since IE
supports big5 only.

But what I tried to point out is that big5-hkscs uses the reserved
extension area of big5, which may conflict with other big5 variants
such as big5-uao. The difference you pointed out between big5 and
big5-hkscs in Firefox is exactly this: Firefox adopted big5-uao,
which covers most of the big5 variants used in Taiwan.
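A small Python sketch of the comparison being discussed: decode the same bytes with several tables and see where they disagree or land in the PUA. The function names are mine, and the codec list is limited to what Python ships ("big5", "big5hkscs", "cp950"); as far as I know there is no Big5-UAO codec in the standard library, so UAO is not covered here.

```python
# Codecs available in Python's standard library; Big5-UAO is not among them
# (to my knowledge), so this only compares plain Big5, HKSCS and cp950.
CODECS = ("big5", "big5hkscs", "cp950")


def decode_variants(data: bytes) -> dict:
    """Decode the same bytes with each codec; None marks a decoding failure."""
    results = {}
    for name in CODECS:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


def has_private_use(text: str) -> bool:
    """True if any character lies in the BMP Private Use Area (U+E000..U+F8FF)."""
    return any(0xE000 <= ord(ch) <= 0xF8FF for ch in text)
```

Bytes in the standard area should decode identically under all three; bytes in the extension area are where the results start to differ, fail, or end up in the PUA.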

Also, as I understand it (sorry, I use Linux and I don't know the HK
dialect at all), HK people used to have to apply a patch to read and
write HK dialect characters under Windows (including a conversion
table, an input method and the extended font 細明體_HKSCS). Please
refer to the Wikipedia entry:
http://zh.wikipedia.org/wiki/%E9%A6%99%E6%B8%AF%E5%A2%9E%E8%A3%9C%E5%AD%97%E7%AC%A6%E9%9B%86

With the patch available here:
http://www.ogcio.gov.hk/tc/business/tech_promotion/ccli/download_area/font_and_software.htm
HK people's "big5" under Windows is effectively "big5-hkscs",
although it is still named "big5". So they won't have problems with
IE, which supports "big5" only (it's a patched "big5").

According to Wikipedia, the HK government has stopped supporting
big5-hkscs; all exchanged documents should use ISO 10646 (Unicode).
From Microsoft:
http://www.microsoft.com/download/en/details.aspx?DisplayLang=en&id=12080
there will be no support starting with Windows Vista, either.

In other words, most of the documents that need big5-hkscs are
already historical relics.

> Bigger and more random dataset would be awesome, if someone could/produce
> them.
Indeed.

> Can any browser other than Firefox display Big5-UAO content? Does it
> require installing a special font?
If one installs a (community-made) big5-uao patch for Windows, one
can view that content in all software that uses the system
conversion table under Windows. But Firefox uses its own conversion
table, so the big5-uao merge is needed. AFAIK, those big5 variants are
covered by the 細明體 "MingLiU" system font under Windows, so no special
font is needed. This is not the case for the HK extension, as those
glyphs were not yet in Unicode at that time.

> Merging Big5 and Big5-HKSCS is not a goal in itself, but we must decide what
> mapping <meta charset="big5"> should use. Is there any mapping that would
> fix more pages than the one I've proposed?
For HK-related content, you either select "big5-hkscs" as the
encoding in Firefox/Chrome/Opera, or install the patch from the HK
government to view it in IE. I don't see the need to merge big5 and
big5-hkscs. Also, according to your survey, big5-hkscs works best for
HK-related content. To me, having two major big5 variants, big5-uao
and big5-hkscs, is the best solution.
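A minimal sketch of what "two variants, chosen by label" could look like, as opposed to one merged table. Everything here is illustrative: the table and function names are made up, and since Python has no Big5-UAO codec that I know of, its plain "big5" codec merely stands in for the UAO side.

```python
# Assumption: each declared label keeps its own conversion table.
# Plain "big5" is only a stand-in for Big5-UAO, which Python does not ship
# (to my knowledge); "big5hkscs" is Python's Big5-HKSCS codec.
LABEL_TO_CODEC = {
    "big5": "big5",
    "big5-hkscs": "big5hkscs",
}


def decode_declared(data: bytes, label: str) -> str:
    """Decode page bytes using the table selected by the declared charset label."""
    key = label.strip().lower()
    if key not in LABEL_TO_CODEC:
        raise LookupError("unsupported label in this sketch: %r" % label)
    # Undecodable sequences become U+FFFD instead of raising.
    return data.decode(LABEL_TO_CODEC[key], errors="replace")
```

With this arrangement a page that declares "big5-hkscs" gets the HK table, and a page that declares "big5" never has its extension area reinterpreted as HKSCS.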

>>> Also, the standard is aimed mainly at browsers, so it will not
>>> directly affect uses of Big5 outside the Web.
>> All our discussions are surely for web pages and browsers.
> Right, I assumed that "Telnet BBS" was not exposed to browsers, but perhaps
> I guess that was in reference to ptt.cc?

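> Is there a bit more information here for 阿菲 to refer to? What affects
> me more personally is the Web archive of the PTT C_Chat board, for example:
> But a Web archive is just a Web archive; honestly, it would not matter
> if it did not exist (though I am still far more likely to run into this
> archive than into any Hong Kong site). Does anyone have links to other
> sites that use 'big5-uao'?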

There are many college BBS systems that still use big5 instead of
Unicode and provide a web interface, not just ptt.cc. It's just that I
personally haven't touched them for a long time.

> Also, does anyone know what proportion of people in Taiwan have
> Unicode 補完 (UAO) installed?

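> 2. Under Windows, why install this package rather than Unicode 補完?
> Would Firefox's "big5" still be the best in that case?
> 3. So under Windows, which group is larger now: people who have installed
> the Big5-HKSCS package, or people who have installed Unicode 補完?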
I really don't know. The PieTTY and PCMan BBS software also have UAO
built in. Can I count everyone using Firefox and PCMan?
http://forum.moztw.org/viewtopic.php?f=11&t=30982

-- 
Best regards,
Yuan Chao
Received on Friday, 13 April 2012 04:24:57 UTC
