
RE: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Thu, 10 Mar 2011 02:18:50 -0500
To: Jungshik SHIN (신정식) <jshin1987@gmail.com>
CC: Asmus Freytag <asmusf@ix.netcom.com>, "public-html-ig-ko@w3.org" <public-html-ig-ko@w3.org>, "public-i18n-cjk@w3.org" <public-i18n-cjk@w3.org>
Message-ID: <A592E245B36A8949BDB0A302B375FB4E0AB201DE71@MAILR001.mail.lan>
Thank you again, Jungshik.

As I wrote in my previous mail, I read UAX #11 as saying that Neutral characters should also be rotated sideways. I think that is clear from the statement and Figure 2, and that interpretation works well for other scripts. If you don’t agree, then making UAX #11 clearer on this point may be a separate issue.

Just to be clear, my personal vote, which is only a guess, is option 2 (these code points should be “W”), so I’m glad you agree. I raised the other options only because Asmus said that option 2 is the least desirable. That said, we don’t know the right answer yet, other than that option 1 is wrong.

You’re probably right that it’s better to discuss this further at unicode.org, and if you can raise the issue there, it would be greatly appreciated.

Thank you again for your support of the CSS specs.


Regards,
Koji

From: Jungshik SHIN (신정식) [mailto:jshin1987@gmail.com]
Sent: Wednesday, March 09, 2011 10:56 PM
To: Koji Ishii
Cc: Asmus Freytag; public-html-ig-ko@w3.org; public-i18n-cjk@w3.org
Subject: Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width


On Wed, Mar 9, 2011 at 10:32 PM, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:
Asmus, thank you for the response.

>> 1. Unicode EAW is correct; these code points should be rotated sideways in vertical text flow.
>> 2. Unicode EAW is incorrect; these code points should be "W", not "N".
>> 3. There are reasons to make these code points as "N", so EAW is correct,
>>    but "Narrow are rotated sideways" is incorrect.
>
> Whatever the outcome, option 2 seems least desirable, because of the way EAW is defined.
Can you please shed some light on why you think so?

Soonbo Han told me that 1 is wrong. I also verified that with MS Word; it displays U+1161 upright. I tried to enter:
U+1100 HANGUL CHOSEONG KIYEOK
U+1161 HANGUL JUNGSEONG A
and then switched the document to vertical text flow. To enter these two code points into MS Word, type "1100" (without the double quotes), Alt+x, "1161", and Alt+x.
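
For readers without MS Word, the two code points from this test can be inspected with a short Python sketch. It uses only the standard unicodedata module; the helper name describe is made up for illustration, and the sketch says nothing about how any layout engine orients the characters. It only shows that the pair is a conjoining sequence that composes into one precomposed syllable under NFC.

```python
import unicodedata

# The two conjoining jamo entered in the MS Word test above.
CHOSEONG_KIYEOK = "\u1100"
JUNGSEONG_A = "\u1161"


def describe(text):
    """Return (code point, character name) pairs for each character.

    Hypothetical helper for this sketch, not part of any spec.
    """
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


sequence = CHOSEONG_KIYEOK + JUNGSEONG_A
# NFC composes a leading consonant plus vowel into one Hangul syllable.
composed = unicodedata.normalize("NFC", sequence)

for code_point, name in describe(sequence):
    print(code_point, name)
print("NFC:", describe(composed))
```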

Given this, it looks like we can exclude the first two options, but I'm curious to know the reasons for excluding option 2.

According to the "4 Definitions" section of UAX #11 East Asian Width[1], "N" is defined as "Neutral (Not East Asian): All other characters. Neutral characters do not occur in legacy East Asian character sets", which looks wrong for this case. If you think it's not "W" for whatever reason, it should be "Na" (Narrow) instead.

Why do you think they should be "Na"? Then, if you follow the UAX #11 recommendation, they should be rotated when they stand alone in a vertical layout. That's *not* a desired behavior.



Thanks to Soonbo Han, I now understand that these code points are conjoining jamo. I also checked UAX #29 Unicode Text Segmentation[2], which classifies U+1161 as V (Vowel), while U+314F forms a grapheme cluster by itself.
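
The difference between the conjoining jamo and the compatibility jamo also shows up in the East Asian Width data. The sketch below is only an illustration: eaw_table is a made-up helper, and the values come from whatever Unicode version is bundled with the Python interpreter in use, which may not match the 2011 data file discussed in this thread.

```python
import unicodedata


def eaw_table(code_points):
    """Map each code point to its East Asian Width property value.

    Hypothetical helper; values reflect the Unicode data shipped with
    the running Python, not necessarily the version discussed here.
    """
    return {
        f"U+{cp:04X}": unicodedata.east_asian_width(chr(cp))
        for cp in code_points
    }


# U+1100 HANGUL CHOSEONG KIYEOK and U+1161 HANGUL JUNGSEONG A (conjoining),
# U+314F HANGUL LETTER A (compatibility jamo), U+AC00 HANGUL SYLLABLE GA.
for code_point, width in eaw_table([0x1100, 0x1161, 0x314F, 0xAC00]).items():
    print(code_point, width)
```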

So I would add another option (as Soonbo Han sent by replying to the original mail):

4. It's always 2nd or 3rd code point of a grapheme cluster, and 1st code point (CHOSEONG) is "W", so the UA should assume the grapheme cluster is "W" by disregarding the EAW of 2nd/3rd code point.

Well, 'neutral' characters are not required to be rotated by UAX #11. It's not explicitly mentioned in the section on display, but if 'neutral' is treated the same as 'ambiguous', their behavior is context dependent, which covers what you tried to do with option #4 and more. Note that Unicode defines a Korean syllable as L+V+T*. V and T (vowels and final consonants) can be the 4th or later components of a Korean syllable.

Moreover, #4 does not cover cases where Korean vowels and final consonants are by themselves in a vertical layout.
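
Option 4 and the gap noted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the jamo ranges are transcribed by hand from the Hangul syllable type ranges in recent Unicode versions (including the extended jamo blocks), classify is a made-up helper, and the pattern follows the L+V+T* shape mentioned above rather than the full UAX #29 segmentation rules.

```python
import re

# Conjoining jamo ranges by Hangul syllable type, written out by hand
# for this sketch (an assumption, not read from the Unicode data files).
L = "\u1100-\u115F\uA960-\uA97C"  # leading consonants (CHOSEONG)
V = "\u1160-\u11A7\uD7B0-\uD7C6"  # vowels (JUNGSEONG)
T = "\u11A8-\u11FF\uD7CB-\uD7FB"  # trailing consonants (JONGSEONG)

SYLLABLE = re.compile(f"[{L}]+[{V}]+[{T}]*")
STRAY_JAMO = re.compile(f"[{V}{T}]")


def classify(cluster):
    """Classify one cluster under option 4 (hypothetical helper)."""
    if SYLLABLE.fullmatch(cluster):
        # Led by a Wide CHOSEONG, so the whole cluster is treated as Wide.
        return "upright"
    if STRAY_JAMO.match(cluster):
        # Standalone vowel or final consonant: option 4 says nothing here.
        return "unresolved"
    return "other"


print(classify("\u1100\u1161"))              # L V
print(classify("\u1100\u1100\u1161\u11A8"))  # L L V T
print(classify("\u1161"))                    # V by itself
```

The "unresolved" branch is exactly the case that option 4 leaves open and that would need either a Wide assignment or a special case in UAX #11.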

If Asmus does not like changing the EA assignment of Korean medial vowels and final consonants from Neutral to Wide, it seems that UAX #11 section 5 (a subsection on display) has to be changed to special-case Korean vowels and final consonants.

However, we still have to resolve the inconsistency in the EA assignment of Korean vowels and final consonants that I wrote about in the previous email. Either all of them have to be Neutral or all of them have to be Wide.

Jungshik



This is a possible interpretation of EAW, but it's not clearly written in the spec, so I might ask the Unicode folks to add this description if this is the correct way to go. Even so, an author could theoretically type U+1161 alone. MS Word displays it upright in vertical text flow, so this option may not be as good as what MS Word does.

One more possible option is:

5. U+1161 is there for historical reasons or for backward compatibility and nobody uses it in the real world, so nobody cares whether it's "W" or "N" or anything else.

I don't like this option very much, but I can live with it if this is what Korean users want.


[1] http://www.unicode.org/reports/tr11/#Definitions

[2] http://www.unicode.org/reports/tr29/


-----Original Message-----
From: Asmus Freytag [mailto:asmusf@ix.netcom.com]
Sent: Wednesday, March 09, 2011 3:54 PM
To: Koji Ishii
Cc: public-html-ig-ko@w3.org; public-i18n-cjk@w3.org
Subject: Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width

On 3/9/2011 3:09 PM, Koji Ishii wrote:
> Hello,
>
> Would you mind helping me resolve a question in the CSS3 Writing Modes spec?
>
> I'm trying to figure out which characters are displayed upright and which are rotated sideways in vertical text flow. I understand vertical text flow isn't very important for Hangul, but I hope you understand I want to write the correct spec in case you need it.
>
> The current idea is described in the spec[1], in the paragraphs after Figure 10. The basic idea is to use a combination of font information, the Unicode Script Property[2], and Unicode East Asian Width[3].
>
> EAW (Unicode East Asian Width) defines character orientation like this in its Recommendation section[4]:
> * Wide characters ... are not rotated (and therefore rendered upright) when appearing in vertical text runs.
> * Narrow characters ... are rotated sideways, when appearing in vertical text.
>
> If I look into the data file[5], most Hangul characters are W(ide), so they are rendered upright in vertical text flow according to the Unicode definitions. I suppose this is what you expect.
>
> However, many of HANGUL JONGSEONG are marked as N and therefore they must be rotated sideways in vertical text flow if we follow this rule.
>
> 115F;W # HANGUL CHOSEONG FILLER
> 1160;N # HANGUL JUNGSEONG FILLER
> 1161;N # HANGUL JUNGSEONG A
> 1162;N # HANGUL JUNGSEONG AE
> 1163;N # HANGUL JUNGSEONG YA
> ...
>
> I'm guessing this is NOT what you expect. Can anyone on this mailing list help me resolve this situation? The possible answers I can think of are:
>


The characters in question are conjoining Jamos. They are supposed to form into syllables, which themselves are rendered upright in vertical writing.

The question is whether anyone ever renders these things as themselves, that is, when not combined into syllables, and whether in that case they are upright when (if ever) they are vertical.

Whatever the outcome, option 2 seems least desirable, because of the way EAW is defined.


> 1. Unicode EAW is correct; these code points should be rotated sideways in vertical text flow.
> 2. Unicode EAW is incorrect; these code points should be "W", not "N".
> 3. There are reasons to make these code points as "N", so EAW is correct, but "Narrow are rotated sideways" is incorrect.
>
> Which one is it, or is it something else? I asked Soonbo Han from LG about this at CSSWG. He thinks the answer is not 1, but he wasn't sure whether it's 2, 3, or something else.
>
> Your support is greatly appreciated.
>
>
> Regards,
> Koji
>
> [1] http://dev.w3.org/csswg/css3-writing-modes/#text-orientation
> [2] http://unicode.org/reports/tr24/
> [3] http://unicode.org/reports/tr11/
> [4] http://unicode.org/reports/tr11/#Recommendations
> [5] http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt

>
>

Received on Thursday, 10 March 2011 08:02:12 GMT
