RE: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Thu, 10 Mar 2011 01:32:53 -0500
To: Asmus Freytag <asmusf@ix.netcom.com>
CC: "public-html-ig-ko@w3.org" <public-html-ig-ko@w3.org>, "public-i18n-cjk@w3.org" <public-i18n-cjk@w3.org>
Message-ID: <A592E245B36A8949BDB0A302B375FB4E0AB201DE6F@MAILR001.mail.lan>

Asmus, thank you for the response.

>> 1. Unicode EAW is correct; these code points should be rotated sideways in vertical text flow.
>> 2. Unicode EAW is incorrect; these code points should be "W", not "N".
>> 3. There are reasons to make these code points as "N", so EAW is correct,
>>    but "Narrow are rotated sideways" is incorrect.
>
> Whatever the outcome, option 2 seems least desirable, because of the way EAW is defined.

Could you please shed some light on why you think so?

Soonbo Han told me that option 1 is wrong. I also verified this with MS Word, which displays U+1161 upright. I entered:
U+1100 HANGUL CHOSEONG KIYEOK
U+1161 HANGUL JUNGSEONG A
and then switched the document to vertical text flow. To enter these two code points in MS Word, type "1100" (without the double quotes), Alt+x, "1161", and Alt+x.
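
The property values of these two code points can also be checked outside MS Word. Below is a minimal sketch in Python, assuming the standard unicodedata module; its answers depend on the Unicode version bundled with the interpreter, which may differ from the data file discussed in this thread. The variable names are only for illustration.

```python
import unicodedata

# The two code points entered into MS Word in the test above.
code_points = [0x1100, 0x1161]

# East Asian Width as reported by the Unicode data bundled with Python.
widths = {cp: unicodedata.east_asian_width(chr(cp)) for cp in code_points}
for cp, width in widths.items():
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: EAW={width}")

# NFC composes the conjoining sequence into one precomposed syllable.
syllable = unicodedata.normalize("NFC", "".join(chr(cp) for cp in code_points))
print(f"NFC -> U+{ord(syllable):04X}, EAW={unicodedata.east_asian_width(syllable)}")
```

The last line shows that the composed syllable can carry a different EAW value from its second code point, which is the mismatch being discussed here.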

Given this, it looks like we can exclude the first two options, but I'm curious to know the reasons for excluding option 2.

According to the "4 Definitions" section of UAX #11 East Asian Width[1], "N" is defined as "Neutral (Not East Asian): All other characters. Neutral characters do not occur in legacy East Asian character sets", which looks wrong in this case. If you think it should not be "W" for whatever reason, it should be "Na" (Narrow) instead.

Thanks to Soonbo Han, I now understand that these code points are conjoining jamo. I also verified in UAX #29 Unicode Text Segmentation[2] that U+1161 is V (Vowel), while U+314F forms a grapheme cluster by itself.
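
The difference between the two code points can be sketched as follows. This is only an illustration: jamo_class is a hypothetical helper, and the hard-coded ranges are my assumption of how the Hangul Jamo block (U+1100..U+11FF) divides into the L, V and T classes of UAX #29; the extended jamo blocks and precomposed syllables are left out.

```python
def jamo_class(ch: str) -> str:
    """Classify one character as conjoining jamo L, V, T, or 'Other'.

    Hypothetical helper; covers only the Hangul Jamo block U+1100..U+11FF.
    """
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F:
        return "L"      # leading consonants (CHOSEONG)
    if 0x1160 <= cp <= 0x11A7:
        return "V"      # vowels (JUNGSEONG)
    if 0x11A8 <= cp <= 0x11FF:
        return "T"      # trailing consonants (JONGSEONG)
    return "Other"      # e.g. compatibility jamo such as U+314F


for ch in ("\u1100", "\u1161", "\u314F"):
    print(f"U+{ord(ch):04X}: {jamo_class(ch)}")
```

U+1161 falls in the V range and so joins a preceding L, whereas U+314F is outside the conjoining ranges and stands alone.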

So I would add another option (as Soonbo Han sent by replying to the original mail):

4. These code points are always the 2nd or 3rd code point of a grapheme cluster, and the 1st code point (CHOSEONG) is "W", so the UA should treat the whole grapheme cluster as "W", disregarding the EAW of the 2nd/3rd code point.

This is a possible interpretation of EAW, but it is not clearly written in the spec, so I might ask the Unicode folks to add this description if this is the correct way to go. Even so, an author could theoretically type U+1161 alone. MS Word displays it upright in vertical text flow, so this option may not be as good as what MS Word does.
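
To make option 4 concrete, here is a rough sketch in Python. It is not what any UA or spec does: hangul_class, hangul_clusters and orientation are hypothetical names, the clustering only approximates the Hangul rules of UAX #29 (GB6 to GB8) and ignores everything else, and the "W or F means upright" rule is my simplification of the EAW recommendation.

```python
import unicodedata


def hangul_class(ch):
    """Approximate Hangul syllable class of one character (assumed ranges)."""
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F:
        return "L"
    if 0x1160 <= cp <= 0x11A7:
        return "V"
    if 0x11A8 <= cp <= 0x11FF:
        return "T"
    if 0xAC00 <= cp <= 0xD7A3:
        # Precomposed syllables: every 28th one has no trailing consonant.
        return "LV" if (cp - 0xAC00) % 28 == 0 else "LVT"
    return "Other"


# For each class, the classes that may follow it without a cluster break.
NO_BREAK = {
    "L": {"L", "V", "LV", "LVT"},
    "LV": {"V", "T"},
    "V": {"V", "T"},
    "LVT": {"T"},
    "T": {"T"},
}


def hangul_clusters(text):
    """Split text into clusters, joining only conjoining Hangul sequences."""
    clusters = []
    for ch in text:
        prev = hangul_class(clusters[-1][-1]) if clusters else None
        if prev is not None and hangul_class(ch) in NO_BREAK.get(prev, ()):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def orientation(cluster):
    """Option 4: decide by the EAW of the cluster's first code point only."""
    width = unicodedata.east_asian_width(cluster[0])
    return "upright" if width in ("W", "F") else "sideways"


for cluster in hangul_clusters("\u1100\u1161\u11A8") + ["\u1161"]:
    codes = " ".join(f"U+{ord(c):04X}" for c in cluster)
    print(codes, "->", orientation(cluster))
```

The final iteration covers the case mentioned above: a lone U+1161 is its own cluster, so this rule falls back to that code point's own EAW value.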

One more possible option is:

5. U+1161 is there for historical reasons or for backward compatibility, and nobody uses it in the real world, so nobody cares whether it's "W" or "N" or anything else.

I don't like this option very much, but I can live with it if this is what Korean users want.


[1] http://www.unicode.org/reports/tr11/#Definitions
[2] http://www.unicode.org/reports/tr29/


-----Original Message-----
From: Asmus Freytag [mailto:asmusf@ix.netcom.com] 
Sent: Wednesday, March 09, 2011 3:54 PM
To: Koji Ishii
Cc: public-html-ig-ko@w3.org; public-i18n-cjk@w3.org
Subject: Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width

On 3/9/2011 3:09 PM, Koji Ishii wrote:
> Hello,
>
> Would you mind helping me resolve a question in the CSS3 Writing Modes spec?
>
> I'm trying to figure out which characters are displayed upright and which are rotated sideways in vertical text flow. I understand vertical text flow isn't very important for Hangul, but I hope you understand I want to write the correct spec in case you need it.
>
> The current idea is written in the spec[1], in the paragraphs after Figure 10. The basic idea is to use a combination of font information, the Unicode Script Property[2], and Unicode East Asian Width[3].
>
> EAW (Unicode East Asian Width) defines character orientation like this in its Recommendation section[4]:
> * Wide characters ... are not rotated (and therefore rendered upright) when appearing in vertical text runs.
> * Narrow characters ... are rotated sideways, when appearing in vertical text.
>
> If I look into the data file[5], most Hangul characters are W(ide), so they are rendered upright in vertical text flow according to the Unicode definitions. I suppose this is what you expect.
>
> However, many HANGUL JUNGSEONG and JONGSEONG code points are marked as N, and therefore they must be rotated sideways in vertical text flow if we follow this rule.
>
> 115F;W # HANGUL CHOSEONG FILLER
> 1160;N # HANGUL JUNGSEONG FILLER
> 1161;N # HANGUL JUNGSEONG A
> 1162;N # HANGUL JUNGSEONG AE
> 1163;N # HANGUL JUNGSEONG YA
> ...
>
> I'm guessing this is NOT what you expect. Can anyone on this ML help me resolve this situation? The possible answers I can think of are:
>


The characters in question are conjoining Jamos. They are supposed to form into syllables, which themselves are rendered upright in vertical writing.

The question is whether anyone ever renders these things as themselves, that is, when not combined into syllables, and whether in that case they are upright when (if ever) they are set vertically.

Whatever the outcome, option 2 seems least desirable, because of the way EAW is defined.


> 1. Unicode EAW is correct; these code points should be rotated sideways in vertical text flow.
> 2. Unicode EAW is incorrect; these code points should be "W", not "N".
> 3. There are reasons to make these code points as "N", so EAW is correct, but "Narrow are rotated sideways" is incorrect.
>
> Which one is it, or is it something else? I asked Soonbo Han from LG at the CSSWG; he thinks the answer is not 1, but he wasn't sure whether it is 2, 3, or something else.
>
> Your support is greatly appreciated.
>
>
> Regards,
> Koji
>
> [1] http://dev.w3.org/csswg/css3-writing-modes/#text-orientation
> [2] http://unicode.org/reports/tr24/
> [3] http://unicode.org/reports/tr11/
> [4] http://unicode.org/reports/tr11/#Recommendations
> [5] http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt
>
>

Received on Thursday, 10 March 2011 06:37:55 GMT
