RE: Solutions to unify middle dot usage in Traditional Chinese

Addison,

Using U+FF0E as a middle dot is a Very Bad Idea™, because it is a full-width full stop (period) that happens to be centered according to Traditional Chinese conventions.

U+30FB is also not good due to its heavy Japanese connections.
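
A minimal sketch of why U+FF0E behaves as a period rather than a middle dot, assuming Python's standard `unicodedata` module: compatibility normalization (NFKC) folds U+FF0E into the ASCII full stop, while the other candidates in this thread are left alone. The helper name `nfkc` is only illustrative.

```python
import unicodedata

def nfkc(ch: str) -> str:
    """Return the NFKC (compatibility-composed) form of a string."""
    return unicodedata.normalize("NFKC", ch)

# U+FF0E carries a <wide> compatibility mapping to U+002E FULL STOP,
# so any pipeline that applies NFKC turns it into an ordinary period.
folded = nfkc("\uff0e")
print(f"U+FF0E -> U+{ord(folded):04X} {unicodedata.name(folded)}")

# The other candidates have no compatibility mapping and survive NFKC.
for ch in ("\u00b7", "\u2027", "\u30fb"):
    print(f"U+{ord(ch):04X} unchanged: {nfkc(ch) == ch}")
```

The name of U+30FB, KATAKANA MIDDLE DOT, is what ties it to Japanese text.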

Regards...

-- Ken

-----Original Message-----
From: Phillips, Addison [addison@lab126.com]
Received: Wednesday, 10 Dec 2014, 9:57
To: Bobby Tung [bobbytung@wanderer.tw]; public-zhreq@w3.org [public-zhreq@w3.org]
CC: CJK discussion [public-i18n-cjk@w3.org]; 中文HTML5同樂會ML [public-html-ig-zh@w3.org]; Ken Lunde [lunde@adobe.com]
Subject: RE: Solutions to unify middle dot usage in Traditional Chinese

Hi Bobby,

I would think that U+30FB would never be appropriate as a middle dot in Chinese (even though it is sometimes used, possibly because legacy fonts display it in a more graceful manner than the Latin or full width dots).

I tend to agree about using U+00B7 as the middle dot, with Traditional Chinese text presenting it as full width in most contexts. What you don't mention is whether Simplified Chinese prefers U+00B7 to be halfwidth. Certainly most Latin script fonts will have U+00B7 as proportional (and thus not full width). That complicates the layout of undifferentiated Hanzi text, because it is not clear which presentation of the middle dot to use.

What is the allergy to using FF0E?

Addison

From: Bobby Tung [mailto:bobbytung@wanderer.tw]
Sent: Wednesday, December 10, 2014 6:23 AM
To: public-zhreq@w3.org
Cc: CJK discussion; 中文HTML5同樂會ML; Ken Lunde
Subject: Solutions to unify middle dot usage in Traditional Chinese

Hello,

I have found a problem with middle dot usage in Traditional Chinese.

--Usage

The middle dot has three uses in Traditional Chinese, listed below:

1. Separating the parts of a transliterated Latin name written in Hanzi.

2. As a decimal point in Hanzi, e.g. 三‧一四

3. Separating book, chapter, and title, e.g. 詩經‧魏風‧碩鼠

In Traditional Chinese, the middle dot should be full-width: a filled round dot centered in its cell.

--Codepoint

Several codepoints are commonly used for the middle dot in Traditional Chinese:

·       U+00B7          MIDDLE DOT
‧       U+2027          HYPHENATION POINT
・      U+30FB          KATAKANA MIDDLE DOT
．      U+FF0E          FULLWIDTH FULL STOP
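
The four candidates above can be compared by their Unicode properties. The following is a small sketch, assuming Python's standard `unicodedata` module; it prints each character's name and its East Asian Width class as defined in UAX #11 (the report cited as [2]). The names `CANDIDATES` and `describe` are only illustrative.

```python
import unicodedata

# The four codepoints listed in the table above.
CANDIDATES = ["\u00b7", "\u2027", "\u30fb", "\uff0e"]

def describe(ch: str) -> tuple[str, str, str]:
    """Return (codepoint, Unicode name, East Asian Width class) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.east_asian_width(ch),
    )

rows = [describe(ch) for ch in CANDIDATES]
for codepoint, name, width in rows:
    print(f"{codepoint}  {width:<2} {name}")
```

U+00B7 and U+2027 report `A` (ambiguous), meaning their width depends on context and font, while U+30FB reports `W` (wide) and U+FF0E reports `F` (fullwidth).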

In Simplified Chinese usage, the middle dot is U+00B7.

In the Big5 code table [1], U+00B7 comes from A150 and U+2027 from A145.

But I think the definition of U+00B7 is more suitable for the middle dot than that of U+2027 or U+FF0E.

--Solutions

Considering interoperability and the codepoint definitions, I have two proposals.

1. Use U+00B7 as the general middle dot; if authors want it to be full-width, use U+30FB. But most Chinese fonts do not have that glyph, so it will certainly fall back to a Japanese font. [2]

2. Use U+00B7 as the general middle dot, and in the Traditional Chinese subset, make the glyph full-width.
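
As a rough illustration of "use U+00B7 as the general middle dot", here is a sketch of a text filter that rewrites the other three candidates to U+00B7, but only when the dot sits directly between two Han characters. The function name `unify_middle_dot`, the set of look-alikes, and the restriction to the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) are all assumptions made for this example; they are not part of either proposal.

```python
import re

MIDDLE_DOT = "\u00b7"

# Assumption: the look-alike dots are the other three codepoints from this thread.
LOOKALIKES = "\u2027\u30fb\uff0e"

# Assumption: "Han character" means the basic CJK Unified Ideographs block only.
HAN = "\u4e00-\u9fff"

# Match a look-alike dot only when a Han character is on both sides.
_PATTERN = re.compile(f"(?<=[{HAN}])[{LOOKALIKES}](?=[{HAN}])")

def unify_middle_dot(text: str) -> str:
    """Replace look-alike dots between Han characters with U+00B7."""
    return _PATTERN.sub(MIDDLE_DOT, text)

sample = "\u4e09\u2027\u4e00\u56db"  # three, dot, one, four
print(unify_middle_dot(sample) == "\u4e09\u00b7\u4e00\u56db")
```

Dots next to kana or Latin letters are left untouched, so Japanese text that legitimately uses U+30FB is not rewritten. Whether the resulting U+00B7 renders full-width is still up to the font, which is the point of proposal 2.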




[1]: http://www.khngai.com/chinese/charmap/tblbig.php?page=0

[2]: http://www.unicode.org/reports/tr11/



WANDERER Digital Publishing Inc.
Bobby Tung @bobtung
Mobile: +886-975068558
bobbytung@wanderer.tw
http://wanderer.tw

Received on Wednesday, 10 December 2014 18:27:48 UTC