
Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width

From: Richard Ishida <ishida@w3.org>
Date: Mon, 14 Mar 2011 18:00:03 +0000
Message-ID: <4D7E57A3.5020904@w3.org>
To: Koji Ishii <kojiishi@gluesoft.co.jp>
CC: "HTML Korean Interest Group (public-html-ig-ko@w3.org)" <public-html-ig-ko@w3.org>, "CJK discussion (public-i18n-cjk@w3.org)" <public-i18n-cjk@w3.org>
Coming at this from the use case requirements rather than trying to work 
it out from the implementation details:

Take the word 한글 (hangul).  I can obviously write this as:

한  D55C  [Hangul Syllables]
글  AE00  [Hangul Syllables]

and I'd expect this to show two syllabic glyphs vertically arranged.

However, I could equally well, in memory, have the following:

ᄒ  1112  HANGUL CHOSEONG HIEUH
ᅡ  1161  HANGUL JUNGSEONG A
ᆫ  11AB  HANGUL JONGSEONG NIEUN
ᄀ  1100  HANGUL CHOSEONG KIYEOK
ᅳ  1173  HANGUL JUNGSEONG EU
ᆯ  11AF  HANGUL JONGSEONG RIEUL

This should, when displayed, look exactly the same. My assumption is 
that any non-separated sequence of characters from the Unicode Hangul 
Jamo block that constitutes a syllable, or part of a syllable, would 
combine and therefore be displayed without rotation.
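
A minimal sketch of the equivalence between the two stored forms, assuming Python's standard unicodedata module. It only shows that the two sequences are canonically equivalent under Unicode normalization; what a font or layout engine actually draws is a separate matter that this does not test.

```python
import unicodedata

# Precomposed form: two code points from the Hangul Syllables block.
precomposed = "\uD55C\uAE00"  # 한글

# Conjoining form: six code points from the Hangul Jamo block.
# U+1173 (EU) is the vowel that canonical decomposition gives for 글.
jamo = "\u1112\u1161\u11AB\u1100\u1173\u11AF"

# Canonical decomposition (NFD) turns the syllables into jamo ...
decomposed = unicodedata.normalize("NFD", precomposed)
# ... and canonical composition (NFC) turns the jamo back into syllables.
composed = unicodedata.normalize("NFC", jamo)

print([f"{ord(c):04X}" for c in decomposed])
print(decomposed == jamo)        # True
print(composed == precomposed)   # True
```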

Since the font should combine these characters into two two-dimensional 
syllabic arrangements, I don't know whether it's necessary to specify 
placement in terms of grapheme clusters, or to just assume that the font 
will take care of this anyway.

If I wanted to list the jamo involved in such a word, for, say, an 
educational or linguistic text, I'd actually have to do something like this:

ᄒ  1112  HANGUL CHOSEONG HIEUH
​  200B  ZERO WIDTH SPACE
ᅡ  1161  HANGUL JUNGSEONG A
​  200B  ZERO WIDTH SPACE
ᆫ  11AB  HANGUL JONGSEONG NIEUN
​  200B  ZERO WIDTH SPACE
ᄀ  1100  HANGUL CHOSEONG KIYEOK
​  200B  ZERO WIDTH SPACE
ᅳ  1173  HANGUL JUNGSEONG EU
​  200B  ZERO WIDTH SPACE
ᆯ  11AF  HANGUL JONGSEONG RIEUL

to stop them combining visually.
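
A rough sketch of the separated sequence, again assuming Python's standard unicodedata module. Normalization is used here only as a stand-in for "combining": it shows that U+200B keeps the jamo from composing into syllables at the character level. Whether a given font or rendering engine also keeps them visually apart is not something this code can confirm.

```python
import unicodedata

ZWSP = "\u200B"  # ZERO WIDTH SPACE

# The six conjoining jamo of 한글 (U+1173 EU is the vowel of 글).
jamo = "\u1112\u1161\u11AB\u1100\u1173\u11AF"

# Put a ZERO WIDTH SPACE between every pair of jamo.
separated = ZWSP.join(jamo)

# With the separators in place, NFC leaves the string unchanged.
nfc_separated = unicodedata.normalize("NFC", separated)
# Without them, NFC composes the jamo into two syllables.
nfc_joined = unicodedata.normalize("NFC", jamo)

print(len(separated))              # 11 code points: 6 jamo + 5 separators
print(nfc_separated == separated)  # True
print(len(nfc_joined))             # 2
```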

The important question, to my mind, is whether the characters are 
rotated when they occur either individually or separated as above.  My 
guess is no. (There is also the question of whether you would ever find 
this in vertical text, but I think we must assume that someone might 
want to do so at some time.)

I assume that we should care less about the hangul compatibility 
characters, since they shouldn't be used anyway. But since, if they are 
used, they do not lead to this combining behaviour, it makes sense that 
they are non-rotated.
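
A small sketch of how the compatibility characters differ, assuming Python's standard unicodedata module and the Unicode data bundled with the interpreter (property values can vary between Unicode versions). The sample string is an illustration chosen here: three letters from the Hangul Compatibility Jamo block that spell the same sounds as 한.

```python
import unicodedata

# ㅎ ㅏ ㄴ from the Hangul Compatibility Jamo block (U+3131..U+318E).
compat = "\u314E\u314F\u3134"

# Canonical composition does not merge compatibility jamo into a syllable.
nfc_compat = unicodedata.normalize("NFC", compat)

# East Asian Width property for each compatibility letter.
widths = [unicodedata.east_asian_width(c) for c in compat]

print(nfc_compat == compat)  # True: three separate letters remain
print(widths)
```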


RI



On 09/03/2011 23:09, Koji Ishii wrote:
> Hello,
>
> Would you mind helping me resolve a question in the CSS3 Writing Modes spec?
>
> I'm trying to figure out which characters are displayed upright and which are rotated sideways in vertical text flow. I understand vertical text flow isn't very important for Hangul, but I hope you understand I want to write the correct spec in case you need it.
>
> Current idea is written in the spec[1], paragraphs after Figure 10. The basic idea is to use a combination of font information, Unicode Script Property[2], and Unicode East Asian Width[3].
>
> EAW (Unicode East Asian Width) defines character orientation like this in its Recommendation section[4]:
> * Wide characters ... are not rotated (and therefore rendered upright) when appearing in vertical text runs.
> * Narrow characters ... are rotated sideways, when appearing in vertical text.
>
> If I look into the data file[5], most Hangul characters are W(ide), so they are rendered upright in vertical text flow according to the Unicode definitions. I suppose this is what you expect.
>
> However, many of HANGUL JONGSEONG are marked as N and therefore they must be rotated sideways in vertical text flow if we follow this rule.
>
> 115F;W # HANGUL CHOSEONG FILLER
> 1160;N # HANGUL JUNGSEONG FILLER
> 1161;N # HANGUL JUNGSEONG A
> 1162;N # HANGUL JUNGSEONG AE
> 1163;N # HANGUL JUNGSEONG YA
> ...
>
> I'm guessing this is NOT what you expect. Can anyone in this ML help me to resolve this situation? Possible answers I'm guessing are:
>
> 1. Unicode EAW is correct; these code points should be rotated sideways in vertical text flow.
> 2. Unicode EAW is incorrect; these code points should be "W", not "N".
> 3. There are reasons to make these code points as "N", so EAW is correct, but "Narrow are rotated sideways" is incorrect.
>
> Which one is it, or is it something else? I asked Soonbo Han from LG about this at CSSWG; he thinks the answer is not 1, but he wasn't sure whether it's 2, 3, or something else.
>
> Your support is greatly appreciated.
>
>
> Regards,
> Koji
>
> [1] http://dev.w3.org/csswg/css3-writing-modes/#text-orientation
> [2] http://unicode.org/reports/tr24/
> [3] http://unicode.org/reports/tr11/
> [4] http://unicode.org/reports/tr11/#Recommendations
> [5] http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt
>
>

-- 
Richard Ishida
Internationalization Activity Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/
Received on Monday, 14 March 2011 18:01:34 GMT
