Re: Question on Text Justification of Korean

Please allow me to add a bit more, from a different perspective, on why we would like to understand "how bad" this is, even though we know it is obviously really bad.

First of all, I would like to re-emphasize that this applies only when the lang tag is missing. Authors can fix issues by just adding lang="ko". Also, this only applies with "text-align: justify", and we don't know how often that is used in Korean text; i.e., whether it is used more often than in C/J or not. I'm guessing the usage is closer to Latin because words are longer in Korean, but I don't have data.

When the lang tag is missing, we can still do our best for script-specific characters such as Kana or Hangul, but Han is unified, and since the lang tag is missing, we need to choose one behaviour for it: C/J, K, or somewhere in-between. If we choose C/J or K, the chosen one does not regress, but the other might. If we choose the compromise algorithm fantasai is proposing, everyone will regress, though the level of regression differs.
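
To make the choice for Han concrete, here is a rough sketch, not any implementation's actual code, that counts justification opportunities in untagged text under the C/J-style and K-style choices. The function names, the simplified code point ranges, and the rule that a gap expands when either neighbour belongs to an expanding script are all assumptions made for illustration.

```python
def script_of(ch):
    """Very rough script classification by code point range (illustrative only)."""
    cp = ord(ch)
    if ch == " ":
        return "space"
    if 0xAC00 <= cp <= 0xD7A3:   # Hangul syllables
        return "hangul"
    if 0x4E00 <= cp <= 0x9FFF:   # CJK unified ideographs (basic block only)
        return "han"
    if 0x3040 <= cp <= 0x30FF:   # Hiragana and Katakana
        return "kana"
    return "other"


def count_opportunities(text, han_expands):
    """Return (space opportunities, inter-character opportunities).

    Kana always expands and Hangul never does, since those scripts identify
    themselves. han_expands is the one choice left open for untagged text:
    True models the C/J-style behaviour, False the K-style behaviour.
    """
    expanding = {"kana"} | ({"han"} if han_expands else set())
    scripts = [script_of(ch) for ch in text]
    spaces = inter_char = 0
    for left, right in zip(scripts, scripts[1:]):
        if left == "space":
            spaces += 1          # the space itself is widened
        elif right != "space" and (left in expanding or right in expanding):
            inter_char += 1      # gap next to an expanding character
    return spaces, inter_char


sample = "서울特別市는 한반도"
print(count_opportunities(sample, han_expands=True))   # (1, 4)
print(count_opportunities(sample, han_expands=False))  # (1, 0)
```

With this sample, the C/J-style choice gives four inter-character opportunities clustered around the three Han characters while the Hangul on either side stays tight, which is the mixed Hangul-Han case under discussion.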

fantasai believes that bad justification of mixed Hangul-Han text is critically important to fix, critical enough to choose the in-between option, which sacrifices everyone a little. I believe that it should be decided based on the number of documents/users/authors that regress, as long as documents remain readable, because I would want to fix my documents if they were broken, no matter "how badly" they were broken. We're in the middle of this discussion to figure out the right way to go.
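
For reference, the in-between (2-tier) approach described in the quoted message below could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, amounts are in em, the leftover after the space limit is assumed to go only to the gaps between CJK characters, and spaces are assumed to absorb everything when there are no such gaps. The actual limit and distribution are exactly what is still being discussed.

```python
def two_tier_justify(extra, n_spaces, n_cjk_gaps, space_limit):
    """Split extra line width between spaces and CJK inter-character gaps.

    extra       -- total width to distribute on the line (em)
    n_spaces    -- number of word spaces on the line
    n_cjk_gaps  -- number of gaps between CJK characters (Han, Kana, Hangul)
    space_limit -- maximum extra width each space takes in the first tier (em)

    Returns (extra per space, extra per CJK gap).
    """
    if n_spaces == 0 and n_cjk_gaps == 0:
        return 0.0, 0.0                      # nothing can expand
    if n_cjk_gaps == 0:
        return extra / n_spaces, 0.0         # assumption: spaces absorb it all
    # Tier 1: expand at spaces, but only up to the limit.
    per_space = min(extra / n_spaces, space_limit) if n_spaces else 0.0
    # Tier 2: whatever is left is spread among all CJK gaps.
    remainder = extra - per_space * n_spaces
    return per_space, remainder / n_cjk_gaps


print(two_tier_justify(2.0, 4, 10, 1.0))  # (0.5, 0.0): spaces alone are enough
print(two_tier_justify(6.0, 4, 10, 1.0))  # (1.0, 0.2): limit hit, CJK gaps expand
```

A low space_limit behaves more like the C/J preference and a high one more like the K preference, so the regression each side sees depends mostly on that one number.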

/koji

> On Oct 24, 2014, at 08:43, fantasai <fantasai.lists@inkedblade.net> wrote:
> 
> On 10/23/2014 05:50 PM, Jungshik SHIN (신정식) wrote:
>> 
>> Could you explain why treating Hangul and Han identically for the
>> justification hurts the justification quality of Hangul-only
>> documents (and Chinese and Japanese documents) ?
> 
> Okay, I will try to explain. :)
> 
> The constraint of the situation is that we do not know the primary
> language or writing system because the document is untagged. Given
> this, we must come up with a justification system that is adequate
> for all systems.
> 
> In order to adequately handle Japanese and Chinese, we must allow
> expansion between Han and Kana characters.
> 
> In order to adequately handle most other writing systems, we must
> allow expansion at spaces.
> 
> Korean is kind of a combination of both cases.
> 
> At least one implementation has decided to handle this situation by
> expanding at spaces, Han, and Kana, but not Hangul. For Hangul-only
> documents, this will expand only at spaces, and for Chinese/Japanese
> documents, this will expand among all characters. For these documents,
> everyone is happy. But for mixed Han + Hangul documents, this solution
> has the behavior we are discussing. [1]
> 
> An alternative solution we are considering is 2-tier justification:
> expand primarily at spaces, up to a limit, and then beyond that
> expand among all CJK. This will mean that in some cases where there
> is too much space to be absorbed by spaces, Korean will also expand
> somewhat between all characters. [2] It also means that for Chinese
> and Japanese, spaces will become wider than they prefer (since they
> prefer very little expansion at spaces). [3]
> 
> [1] http://dev.w3.org/csswg/css-text-3/justify?cjk=0&&splitHangul=splitHangul&&text=%ED%8A%B9%EB%B3%84%EC%8B%9C%28%EC%84%9C%EC%9A%B8%E7%89%B9%E5%88%A5%E5%B8%82%29%EB%8A%94%20%ED%95%9C%EB%B0%98%EB%8F%84
> [2] http://dev.w3.org/csswg/css-text-3/justify?cjk=1&&text=%EC%8B%9C%28%EC%84%9C%EC%9A%B8%E7%89%B9%E5%88%A5%E5%B8%82%29%EB%8A%94%20%ED%95%9C%EB%B0%98%EB%8F%84
> [3] http://dev.w3.org/csswg/css-text-3/justify?cjk=1&&text=%E3%81%93%E3%81%AEWii%20U%E3%81%8C%E5%A4%A7%E5%A5%BD%E3%81%8D%E3%81%A7%E3%81%99
> 
> The exact tradeoffs will vary by the limit chosen. The ideal limit
> for spaces in Japanese is around 0.5em-[width of space]; for Korean
> it's much higher, though I'm not sure it's actually infinite. :)
> 
> The algorithm here is simplistic and UAs are allowed to do better,
> but we were asked to come up with an example of something that
> works as a starting point.
> 
> I hope that's understandable.
> 
>> BTW, fantasai's question and examples given do not match each other, I'm afraid.
> 
> Ah, sorry, I should have said "expand around Han", not "expand
> between Hangul and Han". I lost that edit somewhere along the way.
> 
> ~fantasai

Received on Friday, 24 October 2014 06:10:04 UTC