Re: [css-text] Universal Compromise Default Justification from Koji Ishii on 2014-07-26 (public-i18n-cjk@w3.org from July to September 2014)

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Sat, 26 Jul 2014 19:46:46 +0000
To: fantasai <fantasai.lists@inkedblade.net>
CC: "www-style@w3.org" <www-style@w3.org>, CJK discussion <public-i18n-cjk@w3.org>, WWW International <www-international@w3.org>
Message-ID: <A35E7D9F-3915-43FF-9760-5F3CC8424D64@gluesoft.co.jp>

As you already said yourself, finding a good value for AmountX is hard. Actually, I think it’s not possible.

I’m not very familiar with cluster scripts, but even within block scripts, max(0.5em, character’s own advance width) is good for Chinese/Japanese but is too small for Korean. If we make it larger, it’ll be good for Korean, but bad for Chinese and Japanese.

I know you do not like the idea of handling Hangul and ideographs differently, but given that no browser today expands between Hangul characters (except IE when inter-ideograph is applied), I don’t think we should change this behavior.

Expansion between ideographs is a different story. WebKit/Blink expands between them. Gecko doesn’t (it does only when lang=ja|zh), and IE doesn’t (it does only when inter-ideograph is applied). If we change the default to not expand, documents that used to work in WebKit/Blink will break. If we change it to expand, documents in Gecko and IE will break. But the breakage from expanding is good for Chinese/Japanese, and is bad for the 20% of Korean documents that mix Hangul and ideographs, and for English etc. From these facts, it looks to me that expanding is the right thing to do, but I’m afraid that I’m biased by the fact that I’m Japanese. Please correct me if I look biased.

If I understand correctly, fantasai’s proposal is to save everyone, but to ask everyone to accept results worse than today. I think we should make sure the majority is no worse off than today, and accept that the rest will be worse off. Which “worse” is worse, I’m not sure, but given that the rest can get the same or better results by adding a lang tag or appropriate text-justify values, I think keeping the majority working is more important.

Last time we talked, I didn’t have these data (such as which browser does what, and that 20% of Korean documents mix Hangul and ideographs). Do these data affect your thoughts?

/koji

On Jul 24, 2014, at 1:12 AM, fantasai <fantasai.lists@inkedblade.net> wrote:

> One of the tasks we have in CSS3 Text is to provide an example
> of a justification algorithm that is simple and i18n-aware and
> can be applied to untagged text. (We recommend using more
> specialized knowledge when the language is known.)
> 
> Here is a proposal for such an algorithm. It does not give ideal
> results, but it should be acceptable in the most common cases.
> 
>  0. Contract word separators / trimmable punctuation
>     if possible and adequate, within limits. (Optional)
>  1. Expand word separators up to AmountX.
>  2. If there's still more space to distribute,
>     expand word separators together with inter-character
>     spacing of block scripts, up to an additional AmountX.
>  3. If there's still more space to distribute,
>     expand word separators and tsek marks together with
>     inter-character spacing of block and cluster scripts.
> 
> block scripts = Han, Hangul, Kana, Yi, etc.
> cluster scripts = Thai, Lao, etc.
> AmountX = some possibilities listed below:
>  a. min(0.25em, 100% width of U+0020) + character's own advance width
>  b. max(0.5em, character’s own advance width)
> 
> For most space-separated scripts, this will be like inter-word.
> 
> For Korean, this will be mostly like inter-word, but in short
> lines with few spaces, it may cause some expansion between
> characters.
> 
> For Japanese and Chinese, this will be mostly like inter-character
> justification, but in lines with a few spaces (as can occur when
> Latin phrases are used inline), the spaces might become wider than
> is ideal, depending on the limits chosen. (JLREQ maxes spaces out
> at 0.5em total.)
> 
> For cluster scripts like Thai, this will result in wider spaces
> before inter-cluster justification kicks in, which afaik is
> appropriate for such scripts.
> 
> I think this will produce acceptable, though not optimal, results
> for all the scripts I know of.
> 
> Comments welcome.
> 
> ~fantasai
> 
>
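
For readers following the thread, steps 1 to 3 of the quoted proposal and the two AmountX candidates can be sketched roughly as below. This is only an illustration under assumptions that the proposal does not state: all widths are in em, AmountX is treated as a cap on the extra space added at each opportunity, space within a step is shared equally, and optional step 0 (contraction) is left out. The names amount_x_a, amount_x_b, and distribute, and the labels "word", "tsek", "block", and "cluster", are hypothetical.

```python
def amount_x_a(advance, space_width):
    """Candidate (a): min(0.25em, 100% width of U+0020) + own advance width."""
    return min(0.25, space_width) + advance


def amount_x_b(advance):
    """Candidate (b): max(0.5em, character's own advance width)."""
    return max(0.5, advance)


def distribute(kinds, extra, amount_x):
    """Spread `extra` em over the justification opportunities in `kinds`.

    kinds    -- one label per opportunity: "word", "tsek", "block", "cluster"
    amount_x -- assumed per-opportunity cap for steps 1 and 2
    Returns (added, remaining): the space added at each opportunity and
    whatever could not be placed (non-zero only if nothing was eligible).
    """
    added = [0.0] * len(kinds)
    stages = [
        ({"word"}, amount_x),                           # step 1
        ({"word", "block"}, amount_x),                  # step 2: additional AmountX
        ({"word", "tsek", "block", "cluster"}, None),   # step 3: no cap
    ]
    remaining = extra
    for eligible, cap in stages:
        idx = [i for i, kind in enumerate(kinds) if kind in eligible]
        if not idx or remaining <= 0:
            continue
        # Equal share per eligible opportunity, limited by the cap if any.
        share = remaining / len(idx)
        if cap is not None:
            share = min(share, cap)
        for i in idx:
            added[i] += share
        remaining -= share * len(idx)
    return added, remaining


if __name__ == "__main__":
    # A mostly ideographic line with one space and 1em left to fill.
    print(distribute(["block", "word", "block"], 1.0, amount_x_b(1.0)))
```

With a cap of 0.5em instead, the same line would first widen the space by 0.5em and then share the remaining 0.5em equally among all three opportunities, which is the "spaces might become wider than is ideal" case described in the proposal.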

Received on Saturday, 26 July 2014 19:47:23 UTC