RE: [css-text] Justifying Korean text

Hi Koji,
Thanks a lot for your questions about Korean use cases!
Unfortunately I am too busy these days, so I will get back to you later, probably by next week.

Kr, Wonsuk.

> -----Original Message-----
> From: Koji Ishii [mailto:kojiishi@gluesoft.co.jp]
> Sent: Tuesday, July 08, 2014 11:59 AM
> To: public-html-ig-ko@w3.org; CJK discussion
> Subject: [css-text] Justifying Korean text
> 
> Hello/안녕하세요
> 
> Could someone please help us discuss what’s right for justifying
> Korean text? This is a rather long e-mail; sorry for not being able to
> keep it short.
> 
> Here’s some background. Last year, the CSS WG discussed the text-justify
> property[1] and made a few resolutions. The full resolutions are here[2],
> but in summary:
> 1. Make justification behavior as automatic to the content language[3]
> as possible, and remove as many behavior-specific values as possible.
> 2. With that, the “inter-ideograph” value (to expand between ideographic
> characters) was removed, but the “inter-word” value (not to expand
> between ideographic characters) is still in.
> 
> In this context, I’m having difficulty coming up with what’s good for
> Korean text.
> 
> In my understanding, there are 3 types of Korean documents:
> 
> 1. Ideographic only, ancient documents (may sometimes contain some
> Hangul characters).
> 2. Mostly Hangul, with a few to some ideographic characters per
> paragraph or page.
> 3. All Hangul, no ideographic characters.
> 
> Q1. Is this understanding correct, or am I missing any other types?
> 
> I do not have a good sense of how common each type of document is, so
> here’s the next question.
> 
> Q2. Can you give us the ratio of each type of document on the web? I
> mean, ratios such as “0:40:60”. Any statistics would be great, but your
> own estimate is also helpful; if 10 people respond with their own
> ratios, that’s a sort of statistic, I suppose.
> Q3. Is the ratio for papers/books/e-books different from the ratio for
> web documents? How about TV/movie captions, signage, or anywhere
> else the web platform is used?
> 
> Next, let’s think about the case where the author sets lang=“ko” on the
> document (and text-align:justify, of course). This case is easier
> because we can focus on what’s right for Korean. In this case, in my
> understanding, you want to expand only at spaces, correct? No existing
> browser expands between Hangul characters, and I suppose this is the
> correct behavior. However, Chrome/Safari expand between ideographic
> characters; I’m guessing this is not the expected behavior for type #2
> documents and you want to fix this.
> 
> Q4. Is the assumption above correct?
> 
> The challenge in this case is that you will not be able to justify type
> #1 documents, because text-justify does not have a value to expand
> between ideographic characters. If you want to solve this, you have the
> following options:
> 
> 1. Mark such documents as lang=“zh” (Chinese). I’m not sure how right or
> wrong this is to you; are ancient documents considered Chinese, or
> are they ancient Korean? I’m guessing this is wrong, but just wanted to
> ask. I’m sorry if this is a rude question; I hope you understand that
> I’m just trying to list all technically possible options here.
> 2. Propose to the CSS WG to revive the “inter-ideograph” value, so that
> you can mark as lang=“ko” and optionally expand between ideographic
> characters.
> 3. Make “expand between ideographic and Hangul characters” the default,
> and always use “inter-word” for type #2/#3 documents. This gives you a
> choice, but at a cost: you have to mark all type #2/#3 documents as
> “inter-word”. I’m guessing the cost is not worth the value here?
> 4. Such documents are rare, and justifying them is rarer still, close
> to zero, so we don’t need to fix this specific case (please consider
> Q2/Q3 above).
> 
> Q5. Which option looks right to you, or is there anything else?
> 
> Next, a harder one: when the language is not specified. I suspect a
> large number of existing documents do not have lang, so this might
> affect backward compatibility more than Q5 does. I have to say that, in
> this case, there’s no single right solution because all existing
> browsers behave differently; we need to come up with some compromise,
> a good-enough behavior.
> 
> In this case, Chinese and Japanese documents want to expand between
> ideographic characters, while Korean type #2 documents do not, so
> there’s a conflict. I don’t know how to properly resolve this conflict.
> I’m guessing we should favor Chinese and Japanese documents because they
> use justification more often, and ideographic characters are not in
> primary use in Korea, but this is my personal opinion. Others
> might think differently, and answers to Q2/Q3 may also affect this.
> 
> Q6. What do you think about this?
> 
> Next, let’s assume we chose the Chinese and Japanese behavior (expand
> between ideographic characters) in Q6. In this case:
> 
> Q7. Do you want a) to expand between Hangul because Hangul and
> ideographic characters should behave the same way for type #2
> documents, or b) not to expand between Hangul because doing so helps
> type #3 documents, even if it’s strange for type #2 documents?
> 
> Note that no browser today expands between Hangul, even when it
> expands between ideographic characters. I have no idea how strange
> this behavior is to you, especially when thinking of type #2 documents.
> In case you’re interested in seeing the results of my investigation of
> existing browser behaviors, here they are[4]. It’s primarily my own
> memo, so it is quite terse and maybe hard to understand.
> 
> Lastly, this is not a question, but if you create justified Korean HTML
> documents today, I recommend that you add 1) lang=“ko” and
> 2) text-justify:inter-word. It’s hard to predict the future, but
> from what I can tell at this moment, this is considered the best
> practice to protect your documents in the future.
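> 
> As a minimal sketch of that recommendation (the class name "justified",
> the title, and the sample sentence are only illustrative assumptions,
> and how text-justify is honored may differ between browsers):
> 
> ```html
> <!DOCTYPE html>
> <html lang="ko">
> <head>
> <meta charset="utf-8">
> <title>Justified Korean text (example)</title>
> <style>
>   /* "justified" is a hypothetical class name used only for this example */
>   .justified {
>     text-align: justify;      /* justify the lines of the paragraph */
>     text-justify: inter-word; /* expand only at word separators (spaces) */
>   }
> </style>
> </head>
> <body>
> <!-- Sample type #2 text: mostly Hangul with a few ideographic characters -->
> <p class="justified">한글과 漢字가 섞인 문장입니다.</p>
> </body>
> </html>
> ```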
> 
> If you can answer only some of the questions, it’s still helpful. Thank
> you for reading this long e-mail; I look forward to hearing from you.
> 
> [1] http://dev.w3.org/csswg/css-text/#text-justify-property
> [2] http://lists.w3.org/Archives/Public/www-style/2013Feb/0474.html
> [3] http://dev.w3.org/csswg/css-text/#content-language
> [4] http://1drv.ms/1r3iYme
> 
> /koji

Received on Wednesday, 16 July 2014 07:46:42 UTC