- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Mon, 27 Aug 2012 20:48:38 +0200
- To: Koji Ishii <kojiishi@gluesoft.co.jp>
- Cc: 'Glenn Adams' <glenn@skynav.com>, W3C Style <www-style@w3.org>, "public-i18n-cjk@w3.org" <public-i18n-cjk@w3.org>
Koji Ishii, Mon, 27 Aug 2012 08:13:43 -0400:

> If you have suggested wording, I can run it by fantasai to put into
> the spec.

W.r.t. the list of strictness recommendations, a clarity problem occurs because e.g. "Japanese" sometimes refers to the script but at other times refers to the "content language". The data is there, but it would be "nice" if it were *very* clear when the break behavior is linked to the character and when it is linked to knowledge about the language.

To solve this problem, rather than proposing better wording, I would propose using a table rather than a list. For example, with a table you could "tag" whether the forbidden line breaks are related to

1. script/character alone (e.g. before Japanese small kana)
2. combination of 'common character' and Japanese and/or Chinese content language (applies to e.g. hyphen ‐ U+2010)
3. combination of 'CJK codepoint' and Japanese and/or Chinese content language (applies e.g. to FULLWIDTH TILDE ~ U+301C)

Part of the current unclarity is linked to the use of the term "CJK". This term does not seem to be defined anywhere. My understanding is that fullwidth characters fall under the CJK umbrella, and I suspect that this is also the case for the spec text. At the same time, the paragraph beneath the list of recommendations fails to specify that even fullwidth characters need to be declared to be in Japanese (or Chinese) before the distinctions in the recommendations apply. (See below.)

The paragraph beneath the list of recommendations tries to summarize the situation, but in my view it uses a few unfortunate formulations:

]]
In the recommended list above, no distinction is made among the levels of strictness in non-CJK text: only CJK codepoints are affected, unless the text is marked as Chinese or Japanese, in which case some additional common codepoints are affected. However a future level of CSS may add behaviors affecting non-CJK text.
[[

Problems:

* 'CJK' is undefined, and in particular the distinction between 'CJK codepoints' and 'common codepoints' is not defined. I suspect that the text treats as CJK codepoints some codepoints that I see as common codepoints (e.g. the hyphen).
* W.r.t. 'only CJK codepoints are affected': is Korean affected? (That question may reveal my unfamiliarity with CJK, sorry …)
* The sentence that includes the phrase 'unless … marked as Chinese or Japanese, in which case some additional common codepoints are affected' disguises the fact that some 'CJK' characters, such as FULLWIDTH TILDE, have to be known to be of Japanese/Chinese content language before the recommendations apply.
* The phrase "marked as" should perhaps be replaced by "is known to be of content language", to be congruent with the rest?

But maybe, if you offer a clear table as suggested above, you can make the explanatory paragraph much shorter?!

> [1] http://www.w3.org/TR/css3-selectors/#lang-pseudo
> [2] http://dev.w3.org/csswg/css3-text/#content-language
>
> Regards,
> Koji

--
Leif Halvard Silli
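
The three-way tagging proposed in the message could be sketched as a small lookup table in code. This is only an illustration, not spec text: the names `Trigger`, `BreakRule`, `RULES` and `rule_applies` are hypothetical, the example code points are the ones cited in the message plus one small kana (U+3041) picked as a representative, and the assumption that only `ja` and `zh` count as the relevant content languages follows the wording of the message (whether Korean is affected is left open there). Strictness levels are deliberately not modelled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Trigger(Enum):
    """What has to be known before a break restriction applies (hypothetical names)."""
    CHARACTER_ALONE = 1        # category 1: script/character alone
    COMMON_PLUS_LANGUAGE = 2   # category 2: common character + ja/zh content language
    CJK_PLUS_LANGUAGE = 3      # category 3: 'CJK codepoint' + ja/zh content language


@dataclass(frozen=True)
class BreakRule:
    codepoint: int
    label: str
    trigger: Trigger


# One illustrative row per category; not a complete or normative list.
RULES = [
    BreakRule(0x3041, "small kana (HIRAGANA LETTER SMALL A)", Trigger.CHARACTER_ALONE),
    BreakRule(0x2010, "HYPHEN", Trigger.COMMON_PLUS_LANGUAGE),
    BreakRule(0x301C, "U+301C, as cited in the message", Trigger.CJK_PLUS_LANGUAGE),
]

# Assumption: only Japanese and Chinese content languages enable categories 2 and 3.
LANGUAGES = {"ja", "zh"}


def rule_applies(rule: BreakRule, content_language: Optional[str]) -> bool:
    """Return True if the rule is in effect for text with the given content language."""
    if rule.trigger is Trigger.CHARACTER_ALONE:
        return True  # linked to the character itself, language not needed
    if content_language is None:
        return False  # language unknown, so language-dependent rules stay off
    primary = content_language.lower().split("-")[0]  # "ja-JP" -> "ja"
    return primary in LANGUAGES
```

With a table of this shape, categories 2 and 3 behave identically at lookup time; the separate tags only record whether the character is regarded as 'common' or 'CJK', which is exactly the distinction the message asks the spec to define.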
Received on Monday, 27 August 2012 18:49:16 UTC