Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width

From: 신정식 <jshin1987@gmail.com>
Date: Wed, 9 Mar 2011 22:44:59 -0800
To: Soonbo Han <soonbo.han@lge.com>
Cc: public-html-ig-ko@w3.org, public-i18n-cjk@w3.org
On Wed, Mar 9, 2011 at 5:24 PM, Soonbo Han <soonbo.han@lge.com> wrote:

> Hello,
> Although I already told this to Koji in person, I'm sending this email for
> KIG members to encourage further discussion or comments.
> In my understanding, the codes for CHOSEONG, JUNGSEONG, and JONGSEONG are
> for conjoining to form a letter so that they are not used to display by
> themselves. Each Jamo (Korean alphabets - each consonant or vowel) can be
> displayed by using other code categorized as LETTER. For example, there is a
> LETTER A for displaying JUNGSEONG A.
> Since LETTERs are W(ide), I think this issue is not a problem anymore to
> display JUNGSEONG(vowel) itself in vertical mode. However, I'm not an expert
> in this area, any comment from KIG members will be welcomed.

Well, the number of medial vowels in the Korean conjoining letter block
(U+1161 .. U+11A7 and U+D7B0 .. U+D7C6) is much larger than the number of
vowels in the Korean compatibility letter block (U+31xx), so many conjoining
vowels have no compatibility counterpart. I'm afraid what you wrote does not
hold in general.

Anyway, I can't think of any good reason to rotate some Korean conjoining
letters while not rotating others when they stand by themselves in a
vertical layout.

I've just had a close look at EastAsianWidth.txt and I found something more
interesting / inconsistent.

U+1100..U+115F (leading consonants and leading consonant fillers) : W
U+A960..U+A97C (leading consonants) : W

U+1160..U+11A2 (medial vowel filler and medial vowels) : N
U+11A3..U+11A7 (medial vowels) : W : Unicode 5.2
U+D7B0..U+D7C6 (medial vowels) : W : Unicode 5.2

U+11A8..U+11F5 (final consonants) : N
U+11F6..U+11F9 (final consonants) : W : Unicode 1.1
U+11FA..U+11FF (final consonants) : W : Unicode 5.2
U+D7CB..U+D7FB (final consonants) : W

As summarized above, some medial vowels are N while other medial vowels are
W. The same can be said of final consonants.
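
For anyone who wants to reproduce a summary like the one above, here is a rough Python sketch. It assumes the standard library's unicodedata module, whose data follows whatever Unicode version ships with the interpreter, so the runs it prints may well differ from the Unicode 5.2 values quoted in this mail. The helper name width_runs is made up for illustration.

```python
import unicodedata
from itertools import groupby


def width_runs(first, last):
    """Group code points first..last (inclusive) into runs that share
    one East_Asian_Width value. Returns (run_start, run_end, width) tuples."""
    runs = []
    code_points = range(first, last + 1)
    for width, group in groupby(
        code_points, key=lambda cp: unicodedata.east_asian_width(chr(cp))
    ):
        members = list(group)
        runs.append((members[0], members[-1], width))
    return runs


if __name__ == "__main__":
    # The result depends on the Unicode version bundled with this Python.
    print("Unicode data version:", unicodedata.unidata_version)
    for first, last in [(0x1100, 0x11FF), (0xA960, 0xA97C), (0xD7B0, 0xD7FB)]:
        for lo, hi, width in width_runs(first, last):
            print(f"U+{lo:04X}..U+{hi:04X} : {width}")
```

Note that unassigned code points inside a range are reported with the module's default value, so they can show up as separate runs.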

Initially, I thought the inconsistency was introduced when a new set of
consonants and vowels was added in Unicode 5.2. However, that does not seem
to be the case either, because U+11F6 .. U+11F9 (encoded in 1.1) have 'W'
while U+11A8 .. U+11F5 (also encoded in 1.1) have 'N'.

If all the leading consonants were 'W' while all the medial vowels and final
consonants were 'N', it would at least be consistent.

However, as shown above, that's not the case, either. So, I'm even more
confused as to why they're different. Is this a conscious decision (based on
the shape of each character) or just an oversight?

Hmm, the file is automatically generated out of the Unicode character
database. Now, I'm getting more curious as to what character properties are
different between U+11A2 and U+11A3 (both medial vowels, the former with N
and the latter with W).
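
A quick way to look for such differences is sketched below in Python. It only compares the handful of properties that the standard unicodedata module exposes (it does not expose the Age property, as far as I know, so the version in which a character was encoded still has to be looked up in the UCD files). The helper names describe and differing_properties are hypothetical.

```python
import unicodedata


def describe(ch):
    """Collect a few character properties available in the stdlib."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "bidirectional": unicodedata.bidirectional(ch),
        "combining": unicodedata.combining(ch),
        "east_asian_width": unicodedata.east_asian_width(ch),
    }


def differing_properties(a, b):
    """Return {property: (value_for_a, value_for_b)} for properties that differ."""
    props_a, props_b = describe(a), describe(b)
    return {
        key: (props_a[key], props_b[key])
        for key in props_a
        if props_a[key] != props_b[key]
    }


if __name__ == "__main__":
    # U+11A2 and U+11A3 are the two adjacent medial vowels discussed above.
    for key, values in differing_properties("\u11A2", "\u11A3").items():
        print(key, values)
```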

I'll raise this issue (inconsistency) using the feedback form at
unicode.org so that the UTC can take a look at it. (Well, Asmus is right
here in the thread :-))

Having written the above, I went on to read UAX #11 (East Asian Width) and
realized that 'N' in EastAsianWidth.txt does not mean 'Narrow' but
'Neutral'. That means that the recommendation in UAX #11 about rotation
in a vertical layout does not apply to Korean medial vowels and final
consonants with 'N' (neutral).

That is, even if we follow the recommendation in UAX #11, there's no reason
to rotate them (vowels and final consonants when they stand by themselves in
a vertical layout).
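
To keep the two apart ('N' is Neutral, 'Na' is Narrow), a small Python sketch that spells out the short codes may help. The long names follow the property values listed in UAX #11; the lookup table and the function name width_class are just illustrative, and the values returned depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Short East_Asian_Width codes and their long names as given in UAX #11.
EAW_LONG_NAMES = {
    "F": "Fullwidth",
    "H": "Halfwidth",
    "W": "Wide",
    "Na": "Narrow",
    "A": "Ambiguous",
    "N": "Neutral",  # not "Narrow"
}


def width_class(ch):
    """Return the long East_Asian_Width name for a single character."""
    return EAW_LONG_NAMES[unicodedata.east_asian_width(ch)]


if __name__ == "__main__":
    for ch in ["A", "\uFF21", "\u1100", "\u1161", "\u11A8"]:
        print(f"U+{ord(ch):04X} : {width_class(ch)}")
```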


