Re: [css-text] Justifying Korean text

On 07/07/2014 07:58 PM, Koji Ishii wrote:
>
> The challenge in this case is that, you will not be able to justify
> type #1 [ideographic] documents, because text-justify does not have
> a value to expand between ideographic characters. If you want to
> solve this, you have the following options:
>
> 1. Mark such documents as lang=“zh” (Chinese.) I’m not sure how right
>    or wrong this is to you; are ancient documents considered as Chinese,
>    or are they ancient Korean? I’m guessing this is wrong, but just
>    wanted to ask. I’m sorry if this is really a bad, impolite question,
>    I hope you understand that I’m just trying to list up all technically
>    possible options here.
> 2. Propose that the CSS WG revive the “inter-ideograph” value, so that you
>    can mark as lang=“ko” and optionally expand between ideographic characters.
> 3. Make “expand between ideographic and Hangul characters” the default, and
>    always use “inter-word” for type #2/#3 documents. This gives you a
>    choice, but at a cost: you have to mark all type #2/#3 documents as
>    “inter-word”. I’m guessing the value is not worth the cost here?
> 4. Such documents are rare, and justifying them is rarer still, close to
>    zero, so we don’t need to fix this specific case (please consider Q2/Q3
>    above.)
>
> Q5. Which option looks right to you, or anything else?

I think that if there are no spaces (or too few spaces) in a line,
inter-ideographic spacing should be allowed unless text-justify is
explicitly set to inter-word. That would take care of this case.
So, I suggest this as Option 5 for Q5.
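
As a rough illustration of Option 5 (a sketch under stated assumptions, not
spec text or any browser's implementation): the function below returns the
positions in a line where extra space may be added. The names
`is_ideograph`, `expansion_points`, and the `min_spaces` threshold are
hypothetical, and the single code point range used for ideographs is a
deliberate simplification.

```python
def is_ideograph(ch):
    """Rough test for Han ideographs (assumption: the basic CJK block only)."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


def expansion_points(line, text_justify="auto", min_spaces=1):
    """Return indices i such that extra space may be added after line[i].

    min_spaces is a hypothetical threshold for "too few spaces".
    """
    spaces = [i for i, ch in enumerate(line) if ch == " "]
    # Explicit inter-word, or enough spaces: expand at spaces only.
    if text_justify == "inter-word" or len(spaces) >= min_spaces:
        return spaces
    # Too few spaces: also allow gaps between adjacent ideographs.
    gaps = [i for i in range(len(line) - 1)
            if is_ideograph(line[i]) and is_ideograph(line[i + 1])]
    return sorted(set(spaces) | set(gaps))
```

With these assumptions, a line of four ideographs and no spaces gets three
inter-ideograph opportunities, while the same line with text-justify set
explicitly to inter-word gets none, and a Korean line that already contains
a space is expanded at that space only.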

> Next. This is a harder one: when the language is not specified. I suspect a
> large number of existing documents do not have lang, so this might
> affect backward compatibility more than Q5 does. I have to say that,
> in this case, there’s no single right solution because all existing
> browsers behave differently; we need to come up with some compromise,
> good-enough behavior.
>
> In this case, Chinese and Japanese documents want to expand between
> ideographic characters, while Korean type #2 documents do not, so
> there’s a conflict. I don’t know how to properly resolve this conflict,
> [...]
>
> Q6. What do you think about this?

This is imho the difficult question. :)

> I’m guessing we should take Chinese and Japanese documents because they
> use justification more often, and the use of ideographic characters in
> Korea is not the primary use, but this is my personal opinion. Others
> might think differently, and answers to Q2/Q3 may also affect this.

I don't think it is true, at least for print documents, that Korean uses
justification less often. I took photos of a sample of books in a bookshop
in Seoul and, except for some poems, all of it was justified.

> Next. Let’s assume we took Chinese and Japanese (expand between
> ideographic characters) in Q6. In this case:
>
> Q7. Do you want a) to expand between Hangul because Hangul and
> ideographic should behave the same way for type #2 documents, or
> b) not to expand between Hangul because doing so helps type #3
> documents, even if it’s strange for type #2 documents?
>
> Note that all browsers today do not expand between Hangul, even when
> they expand between ideographic characters. I have no idea how strange
> this behavior is to you, especially when thinking about type #2 documents.
> In case you’re interested in seeing my investigation result of existing
> browser behaviors, here it is[4]. It’s primarily my own memo, quite
> terse and maybe hard to understand though.

I think that Hangul and ideographic characters should be treated
the same for justification when mixed within a line. If you are
unsure of this, please look at any Japanese documents that contain
Hangul. I am sure you will see that they do not behave differently.
I expect the same is true in Korean documents.

The problem we are having here is the need for a compromise (Q6).
It seems some browsers chose to treat Hangul as non-expandable
in order to force Korean text to behave as inter-word, like Latin.
But this gives bad results for mixed-script text. It is a heuristic
solution to Q6, but treating the two differently is not correct:
in Korean, it can create the appearance of spaces where
there should be none (because there may be gaps between adjacent
Hangul and Hanja), and in Japanese and Chinese it creates uneven
justification (because Hangul phrases will be set solid in lines
where Kanji and Kana are spaced apart).
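
To make the mixed-script problem concrete, here is a small sketch. It
assumes a simplified model in which an expansion opportunity follows every
expandable character; real browsers may place opportunities differently.
The function names and the code point ranges are hypothetical
simplifications.

```python
def is_han(ch):
    """Rough test for Han ideographs (assumption: basic CJK block only)."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


def is_hangul(ch):
    """Rough test for precomposed Hangul syllables."""
    return 0xAC00 <= ord(ch) <= 0xD7A3


def gaps_after(line, hangul_expandable):
    """Return indices i where space may be inserted after line[i]."""
    def expandable(ch):
        return is_han(ch) or (hangul_expandable and is_hangul(ch))
    # The last character never gets a trailing opportunity.
    return [i for i in range(len(line) - 1) if expandable(line[i])]
```

Under this model, with Hangul treated as non-expandable, the line "大韓민국"
gets a gap between the two Hanja and another between the Hanja and the
Hangul, but none between the two Hangul syllables, so the Hangul looks like
a separate word. Treating both scripts the same spaces the line evenly.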

Another suggestion for a compromise is c) expand spaces as the first
priority and inter-CJK gaps as the second priority. Depending on what
weights/limits are chosen between the priorities, this could result
in wider spaces than is wanted in Japanese documents with embedded
Latin, or more inter-Hangul spacing than is wanted in particularly
short lines of Korean, but for average documents it might be OK.
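
One way suggestion c) could work, as a sketch only: spaces absorb the slack
first, up to a limit, and whatever remains is spread over the inter-CJK
gaps. The function name `distribute` and the `max_per_space` limit are
assumptions made for illustration; no particular weights or limits have
been proposed.

```python
def distribute(extra, n_spaces, n_cjk_gaps, max_per_space):
    """Split `extra` units of slack: spaces first, inter-CJK gaps second.

    Returns (added_per_space, added_per_cjk_gap).
    """
    if n_spaces == 0 and n_cjk_gaps == 0:
        return (0.0, 0.0)  # nothing to expand
    if n_cjk_gaps == 0:
        # No second priority available: spaces absorb everything,
        # even past the limit.
        return (extra / n_spaces, 0.0)
    if n_spaces == 0:
        return (0.0, extra / n_cjk_gaps)
    # First priority: widen spaces, but not beyond the limit.
    per_space = min(extra / n_spaces, max_per_space)
    # Second priority: spread what is left over the inter-CJK gaps.
    remaining = extra - per_space * n_spaces
    return (per_space, remaining / n_cjk_gaps)
```

For example, with 12 units of slack, 3 spaces, 6 inter-CJK gaps, and a limit
of 2 units per space, each space grows by 2 and each gap by 1. With only 3
units of slack, the spaces take it all and the CJK text stays solid. A low
limit pushes more expansion into Hangul in short lines; a high limit gives
wide spaces around embedded Latin.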

~fantasai

Received on Wednesday, 9 July 2014 15:35:32 UTC