Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width

On Wed, Mar 9, 2011 at 5:24 PM, Soonbo Han <soonbo.han@lge.com> wrote:

> Hello,
>
> Although I already told this to Koji in person, I'm sending this email for
> KIG members to encourage further discussion or comments.
>
> In my understanding, the codes for CHOSEONG, JUNGSEONG, and JONGSEONG are
> for conjoining to form a letter so that they are not used to display by
> themselves. Each Jamo (Korean alphabets - each consonant or vowel) can be
> displayed by using other code categorized as LETTER. For example, there is a
> LETTER A for displaying JUNGSEONG A.
>
> 1161; N # HANGUL JUNGSEONG A
> 314F;W # HANGUL LETTER A
>
> Since LETTERs are W(ide), I think this issue is not a problem anymore to
> display JUNGSEONG(vowel) itself in vertical mode. However, I'm not an expert
> in this area, any comment from KIG members will be welcomed.
>

Well, the number of medial vowels in the Korean conjoining letter block
(U+1161 .. U+11A7 and U+D7B0 .. U+D7C6) is much larger than the number of
vowels in the Korean compatibility letter block (U+31xx). So I'm afraid
what you wrote does not quite hold up.
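
As a rough way to check that count, here is a minimal Python sketch. It
assumes Python's standard unicodedata module, whose data reflects whatever
Unicode version the interpreter ships with (not necessarily 5.2). The
function names are made up for illustration. A compatibility letter is
treated as a vowel when its <compat> decomposition points at a conjoining
medial vowel.

```python
import unicodedata

# Ranges of conjoining medial vowels as cited in this message (filler U+1160 excluded).
MEDIAL_RANGES = [(0x1161, 0x11A7), (0xD7B0, 0xD7C6)]


def is_medial_vowel(cp):
    """True if cp falls inside one of the conjoining medial vowel ranges."""
    return any(lo <= cp <= hi for lo, hi in MEDIAL_RANGES)


def conjoining_medial_vowels():
    """Assigned code points in those ranges that are named HANGUL JUNGSEONG."""
    return [
        cp
        for lo, hi in MEDIAL_RANGES
        for cp in range(lo, hi + 1)
        if unicodedata.name(chr(cp), "").startswith("HANGUL JUNGSEONG")
    ]


def compatibility_vowels():
    """Compatibility letters (U+3131..U+318E) that decompose to a medial vowel."""
    result = []
    for cp in range(0x3131, 0x318F):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Expected shape: ['<compat>', '1161'] for U+314F HANGUL LETTER A.
        if len(parts) == 2 and parts[0] == "<compat>":
            if is_medial_vowel(int(parts[1], 16)):
                result.append(cp)
    return result


if __name__ == "__main__":
    print("conjoining medial vowels:", len(conjoining_medial_vowels()))
    print("compatibility vowels:    ", len(compatibility_vowels()))
```

Every conjoining vowel without a compatibility counterpart is one that
cannot be shown by switching to a LETTER code point.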

Anyway, I can't think of any good reason to rotate some Korean conjoining
letters while not rotating others when they stand by themselves in a
vertical layout.

I've just had a close look at EastAsianWidth.txt and I found something more
interesting / inconsistent.

U+1100..U+115F (leading consonants and leading consonant fillers) : W
U+A960..U+A97C (leading consonants) : W

U+1160..U+11A2 (medial vowel filler and medial vowels) : N
U+11A3..U+11A7 (medial vowels) : W : Unicode 5.2
U+D7B0..U+D7C6 (medial vowels) : W : Unicode 5.2

U+11A8..U+11F5 (final consonants) : N
U+11F6..U+11F9 (final consonants) : W : Unicode 1.1
U+11FA..U+11FF (final consonants) : W : Unicode 5.2
U+D7CB..U+D7FB (final consonants) : W

As summarized above, some medial vowels are N while other medial vowels are
W. The same can be said of final consonants.
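
A summary like this can be regenerated mechanically. The sketch below is
only an illustration: it assumes Python's unicodedata module, which reports
East_Asian_Width for the Unicode version bundled with the interpreter, so
its output may differ from the 5.2 values listed here. The helper names are
hypothetical.

```python
import unicodedata


def width_runs(start, end):
    """Group start..end (inclusive) into runs sharing one East_Asian_Width value.

    Returns a list of (first, last, eaw) tuples in code point order.
    Unassigned code points are included with whatever value unicodedata reports.
    """
    runs = []
    for cp in range(start, end + 1):
        eaw = unicodedata.east_asian_width(chr(cp))
        if runs and runs[-1][2] == eaw:
            # Same width as the previous code point: extend the current run.
            runs[-1] = (runs[-1][0], cp, eaw)
        else:
            runs.append((cp, cp, eaw))
    return runs


def format_runs(runs):
    """Render runs in the 'U+XXXX..U+YYYY : W' style used in this message."""
    return [f"U+{first:04X}..U+{last:04X} : {eaw}" for first, last, eaw in runs]


if __name__ == "__main__":
    # The three blocks discussed in this thread.
    for lo, hi in [(0x1100, 0x11FF), (0xA960, 0xA97C), (0xD7B0, 0xD7FB)]:
        for line in format_runs(width_runs(lo, hi)):
            print(line)
```

Running it against different Unicode versions would show whether the N/W
boundaries moved over time.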

Initially, I thought the inconsistency was introduced when a new set of
consonants and vowels was added in Unicode 5.2. However, that does not seem
to be the case either, because U+11F6 .. U+11F9 (encoded in 1.1) have 'W'
while U+11A8 .. U+11F5 (also encoded in 1.1) have 'N'.

If all the leading consonants were 'W' while all the medial vowels and final
consonants were 'N', it would at least be consistent.

However, as shown above, that's not the case, either. So, I'm even more
confused as to why they're different. Is this a conscious decision (based on
the shape of each character) or just an oversight?

Hmm, the file is automatically generated out of the Unicode character
database. Now, I'm getting more curious as to what character properties are
different between U+11A2 and U+11A3 (both medial vowels, the former with N
and the latter with W).

I'll raise this issue (inconsistency) using the feedback form at
unicode.org so that the UTC can take a look at it. (Well, Asmus is right
here in the thread :-))

Having written the above, I went on to read UAX #11 (East Asian Width) and
realized that 'N' in EastAsianWidth.txt does not mean 'Narrow' but
'Neutral'. That means that the recommendation in UAX #11 about rotation
in a vertical layout does not apply to Korean medial vowels and final
consonants with 'N' (neutral).

That is, even if we follow the recommendation in UAX #11, there's no reason
to rotate them (vowels and final consonants when they stand by themselves in
a vertical layout).
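
To make the distinction concrete, here is a small Python sketch of that
reading of the recommendation. It is a simplification under stated
assumptions, not what UAX #11 or the CSS3 Writing Modes draft normatively
says: F is grouped with W and H with Na, and both N (neutral) and
A (ambiguous) are left undecided. The function name is made up, and the
widths come from Python's unicodedata, which may not match Unicode 5.2.

```python
import unicodedata


def uax11_orientation(ch):
    """Return 'upright', 'sideways', or None for a single character.

    None means this sketch applies no rule: the character is
    neutral (N) or ambiguous (A) rather than wide or narrow.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        # Wide characters are not rotated in vertical text runs.
        return "upright"
    if eaw in ("Na", "H"):
        # Narrow characters are rotated sideways in vertical text.
        return "sideways"
    return None


if __name__ == "__main__":
    for ch in ["\u314F", "\u1161", "A"]:
        name = unicodedata.name(ch)
        print(f"U+{ord(ch):04X} {name}: {uax11_orientation(ch)}")
```

Under this reading, a jamo marked 'N' simply gets no answer from
East Asian Width, so its orientation has to be decided by some other rule.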

Jungshik

>
> Regards,
> Soonbo Han
>
> -----Original Message-----
> From:
> Sent: None
> To: Koji Ishii
> Cc: public-html-ig-ko@w3.org; public-i18n-cjk@w3.org
> Subject: Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian
> Width
>
> On 3/9/2011 3:09 PM, Koji Ishii wrote:
> > Hello,
> >
> > Would you mind helping me resolve a question in the CSS3 Writing Modes
> spec?
> >
> > I'm trying to figure out which characters are displayed upright and which
> are rotated sideways in vertical text flow. I understand vertical text flow
> isn't very important for Hangul, but I hope you understand I want to write
> the correct spec in case you need it.
> >
> > Current idea is written in the spec[1], paragraphs after Figure 10. The
> basic idea is to use a combination of font information, Unicode Script
> Property[2], and Unicode East Asian Width[3].
> >
> > EAW (Unicode East Asian Width) defines character orientation like this in
> its Recommendation section[4]:
> > * Wide characters ... are not rotated (and therefore rendered upright)
> when appearing in vertical text runs.
> > * Narrow characters ... are rotated sideways, when appearing in vertical
> text.
> >
> > If I look into the data file[5], most Hangul characters are W(ide), so
> they are rendered upright in vertical text flow according to the Unicode
> definitions. I suppose this is what you expect.
> >
> > However, many of HANGUL JONGSEONG are marked as N and therefore they must
> be rotated sideways in vertical text flow if we follow this rule.
> >
> > 115F;W # HANGUL CHOSEONG FILLER
> > 1160;N # HANGUL JUNGSEONG FILLER
> > 1161;N # HANGUL JUNGSEONG A
> > 1162;N # HANGUL JUNGSEONG AE
> > 1163;N # HANGUL JUNGSEONG YA
> > ...
> >
> > I'm guessing this is NOT what you expect. Can anyone in this ML help me
> to resolve this situation? Possible answers I'm guessing are:
> >
>
>
> The characters in question are conjoining Jamos. They are supposed to
> form into syllables, which themselves are rendered upright in vertical
> writing.
>
> The question is whether anyone ever renders these things as themselves,
> that is when not combined into syllables and whether in that case they
> are upright when (if ever) they are vertical.
>
> Whatever the outcome, option 2 seems least desirable, because of the way
> EAW is defined.
>
>
> > 1. Unicode EAW is correct; these code points should be rotated sideways
> in vertical text flow.
> > 2. Unicode EAW is incorrect; these code points should be "W", not "N".
> > 3. There are reasons to make these code points as "N", so EAW is correct,
> but "Narrow are rotated sideways" is incorrect.
> >
> > Which one is it, or is it something else? I asked Soonbo Han from LG
> about this at CSSWG; he thinks the answer is not 1, but he wasn't sure
> whether it's 2, 3, or something else.
> >
> > Your support is greatly appreciated.
> >
> >
> > Regards,
> > Koji
> >
> > [1] http://dev.w3.org/csswg/css3-writing-modes/#text-orientation
> > [2] http://unicode.org/reports/tr24/
> > [3] http://unicode.org/reports/tr11/
> > [4] http://unicode.org/reports/tr11/#Recommendations
> > [5] http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt
> >
> >
>
>
>
>
>

Received on Thursday, 10 March 2011 06:45:33 UTC