Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width

On Wed, Mar 9, 2011 at 10:32 PM, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:

> Asmus, thank you for the response.
>
> >> 1. Unicode EAW is correct; these code points should be rotated sideways
> >>    in vertical text flow.
> >> 2. Unicode EAW is incorrect; these code points should be "W", not "N".
> >> 3. There are reasons to make these code points "N", so EAW is correct,
> >>    but "Narrow are rotated sideways" is incorrect.
> >
> > Whatever the outcome, option 2 seems least desirable, because of the way
> > EAW is defined.
>
> Can you please shed some light on why you think so?
>
> Soonbo Han told me that 1 is wrong. I also verified that with MS Word; it
> displays U+1161 upright. I tried to enter:
> U+1100 HANGUL CHOSEONG KIYEOK
> U+1161 HANGUL JUNGSEONG A
> and then switched the document to vertical text flow. To enter these two code
> points into MS Word, type "1100" (without double quotes), Alt+x, "1161", and
> Alt+x.
>
> Given this, it looks like we can exclude the first option, but I'm curious
> to know the reasons for excluding option 2.
>
> According to "4 Definitions" section of UAX #11 East Asian Width[1], "N" is
> defined as "Neutral (Not East Asian): All other characters. Neutral
> characters do not occur in legacy East Asian character sets" which looks to
> be wrong for this case. If you think it's not "W" for whatever reasons, it
> should be "Na" (Narrow) instead.
>

Why do you think they should be "Na"? If they were, then following the
UAX #11 recommendation they would be rotated when they stand alone in a
vertical layout. That's *not* the desired behavior.
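
To make the concern concrete, here is a minimal sketch that applies the two
quoted UAX #11 recommendations literally, one code point at a time. The
function name and the "unspecified" label are mine, not from any spec, and
the result depends on the Unicode data bundled with your Python build.

```python
import unicodedata

def naive_orientation(ch: str) -> str:
    """Literal per-code-point reading of the UAX #11 recommendation.

    Hypothetical helper: only W and Na are mapped, because those are
    the two categories the quoted recommendation talks about.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw == "W":
        return "upright"
    if eaw == "Na":
        return "sideways"
    return "unspecified"  # N, A, F, H: not decided by this sketch

# U+1100 choseong, U+1161 jungseong, U+314F compatibility jamo
for cp in (0x1100, 0x1161, 0x314F):
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch)}: "
          f"{unicodedata.east_asian_width(ch)} -> {naive_orientation(ch)}")
```

If the vowels were reassigned to "Na", this literal reading would turn a
standalone U+1161 sideways, which is exactly the behavior to avoid.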



>
> Thanks to Soonbo Han, I now understand these code points are conjoining
> jamo. I also verified this with UAX #29 Unicode Text Segmentation[2]:
> U+1161 is V (Vowel), while U+314F forms a grapheme cluster by itself.
>
> So I would add another option (as Soonbo Han sent by replying to the
> original mail):
>
> 4. It's always the 2nd or 3rd code point of a grapheme cluster, and the 1st
> code point (CHOSEONG) is "W", so the UA should assume the grapheme cluster is
> "W", disregarding the EAW of the 2nd/3rd code point.
>

Well, 'neutral' characters are not required to be rotated by UAX #11. It's
not explicitly mentioned in the section on display, but if 'neutral' is
treated the same as 'ambiguous', their behavior is context dependent, which
covers what you tried to do with option #4 and more. Note that a Korean
syllable is defined as L+V+T* by Unicode, so V and T (vowels and final
consonants) can be the 4th or later components of a Korean syllable.

Moreover, #4 does not cover cases where Korean vowels and final consonants
are by themselves in a vertical layout.
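
A rough sketch of what option #4 amounts to, and where it stops helping.
Assumptions on my part: the jamo classes are simplified to the U+1100 block
only (no Extended-A/B, no precomposed syllables), L+V+T* is read as a regular
expression, and `option4` and the "unresolved" label are hypothetical names.

```python
import re
import unicodedata

# Simplified jamo classes (assumption: U+1100..U+11FF only).
L = "\u1100-\u115F"  # leading consonants (choseong)
V = "\u1160-\u11A7"  # vowels (jungseong)
T = "\u11A8-\u11FF"  # final consonants (jongseong)

SYLLABLE = re.compile(f"[{L}]+[{V}]+[{T}]*")  # L+V+T*
JAMO = re.compile(f"[{L}{V}{T}]")

def orientation_of(cluster: str) -> str:
    """Option #4: let the first code point's EAW decide for the cluster."""
    first_is_wide = unicodedata.east_asian_width(cluster[0]) == "W"
    return "upright" if first_is_wide else "unresolved"

def option4(text: str) -> list[tuple[str, str]]:
    """Return (cluster, orientation) pairs for the jamo found in text."""
    result = []
    pos = 0
    while pos < len(text):
        m = SYLLABLE.match(text, pos)
        if m:
            result.append((m.group(), orientation_of(m.group())))
            pos = m.end()
            continue
        ch = text[pos]
        if JAMO.match(ch):
            # A jamo outside any L+V+T* sequence is its own cluster.
            result.append((ch, orientation_of(ch)))
        pos += 1  # non-jamo characters are skipped in this sketch
    return result
```

A full syllable such as U+1100 U+1161 U+11A8 resolves through its leading
consonant, but a lone U+1161 falls back on its own EAW and stays unresolved.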

If Asmus does not like changing the EA assignment of Korean medial vowels
and final consonants from Neutral to Wide, it seems that UAX #11 section 5
(the subsection on display) has to be changed to special-case Korean
vowels and final consonants.

However, we still have to resolve the inconsistency in the EA assignment of
Korean vowels and final consonants that I wrote about in the previous email.
Either all of them have to be Neutral or all of them have to be Wide.

Jungshik



>
> This is a possible interpretation of EAW, but it's not clearly written in
> the spec, so I might try to ask Unicode folks to add this description if
> this is the correct way to go. But even so, theoretically an author could type
> U+1161 alone. MS Word displays it upright in vertical text flow, so this
> option may not be as good as what MS Word does.
>
> One more possible option is:
>
> 5. U+1161 is there for historical reasons or for backward compatibility and
> nobody uses it in the real world, so nobody cares whether it's "W" or "N" or
> whatever else.
>
> I don't like this option very much, but I can live with it if this is what
> Korean users want.
>
>
> [1] http://www.unicode.org/reports/tr11/#Definitions
> [2] http://www.unicode.org/reports/tr29/
>
> -----Original Message-----
> From: Asmus Freytag [mailto:asmusf@ix.netcom.com]
> Sent: Wednesday, March 09, 2011 3:54 PM
> To: Koji Ishii
> Cc: public-html-ig-ko@w3.org; public-i18n-cjk@w3.org
> Subject: Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian
> Width
>
> On 3/9/2011 3:09 PM, Koji Ishii wrote:
> > Hello,
> >
> > Would you mind helping me resolve a question in the CSS3 Writing Modes
> > spec?
> >
> > I'm trying to figure out which characters are displayed upright and which
> > are rotated sideways in vertical text flow. I understand vertical text flow
> > isn't very important for Hangul, but I hope you understand I want to write
> > the correct spec in case you need it.
> >
> > The current idea is written in the spec[1], in the paragraphs after
> > Figure 10. The basic idea is to use a combination of font information,
> > Unicode Script Property[2], and Unicode East Asian Width[3].
> >
> > EAW (Unicode East Asian Width) defines character orientation like this in
> > its Recommendation section[4]:
> > * Wide characters ... are not rotated (and therefore rendered upright)
> >   when appearing in vertical text runs.
> > * Narrow characters ... are rotated sideways, when appearing in vertical
> >   text.
> >
> > If I look into the data file[5], most Hangul characters are W(ide), so
> > they are rendered upright in vertical text flow according to the Unicode
> > definitions. I suppose this is what you expect.
> >
> > However, many of HANGUL JONGSEONG are marked as N and therefore they must
> > be rotated sideways in vertical text flow if we follow this rule.
> >
> > 115F;W # HANGUL CHOSEONG FILLER
> > 1160;N # HANGUL JUNGSEONG FILLER
> > 1161;N # HANGUL JUNGSEONG A
> > 1162;N # HANGUL JUNGSEONG AE
> > 1163;N # HANGUL JUNGSEONG YA
> > ...
> >
> > I'm guessing this is NOT what you expect. Can anyone in this ML help me
> > to resolve this situation? Possible answers I'm guessing are:
> >
>
>
> The characters in question are conjoining Jamos. They are supposed to form
> into syllables, which themselves are rendered upright in vertical writing.
>
> The question is whether anyone ever renders these things as themselves,
> that is when not combined into syllables and whether in that case they are
> upright when (if ever) they are vertical.
>
> Whatever the outcome, option 2 seems least desirable, because of the way
> EAW is defined.
>
>
> > 1. Unicode EAW is correct; these code points should be rotated sideways
> >    in vertical text flow.
> > 2. Unicode EAW is incorrect; these code points should be "W", not "N".
> > 3. There are reasons to make these code points "N", so EAW is correct,
> >    but "Narrow are rotated sideways" is incorrect.
> >
> > Which one is it, or is it something else? I asked Soonbo Han from LG at
> > CSSWG; he thinks the answer is not 1, but he wasn't sure whether it's 2, 3,
> > or something else.
> >
> > Your support is greatly appreciated.
> >
> >
> > Regards,
> > Koji
> >
> > [1] http://dev.w3.org/csswg/css3-writing-modes/#text-orientation
> > [2] http://unicode.org/reports/tr24/
> > [3] http://unicode.org/reports/tr11/
> > [4] http://unicode.org/reports/tr11/#Recommendations
> > [5] http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt
> >
> >
>
>

Received on Thursday, 10 March 2011 06:56:59 UTC