Re: HANGUL JONGSEONG, vertical text flow, and Unicode East Asian Width

From: Asmus Freytag <asmusf@ix.netcom.com>
Date: Mon, 14 Mar 2011 11:30:55 -0700
Message-ID: <4D7E5EDF.10600@ix.netcom.com>
To: Richard Ishida <ishida@w3.org>
CC: Koji Ishii <kojiishi@gluesoft.co.jp>, public-html-ig-ko@w3.org, public-i18n-cjk@w3.org

Richard,

A very sensible way to slice this.

On the Unicode side, we can look separately at whether it makes sense to 
address the kinds of inconsistencies that Koji and Jungshik have 
identified (I haven't had time to study these in depth).

Unicode's EAW was designed to deal with *legacy* character sets, which 
usually don't contain conjoining Jamos. And the layout prescriptions for 
it were primarily intended to deal with things like wide ASCII, not to 
serve as a comprehensive description of all Asian character layout. 
Because of this, it's not clear whether, at the end of the day, even a 
cleaned-up EAW property would align fully with your needs.

Anyway, for now it would make sense to enumerate the layout behavior of 
these characters in whatever fashion works for this purpose. You can then 
check later, after EAW has been revised, whether that list needs to be 
maintained or whether it has become redundant and can be dropped.
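
Such an enumeration could be as small as an override table that is consulted before the EAW-based rule. The following Python is only a sketch under assumptions: the table name, the single range in it, and the choice to treat every non-wide class as sideways are illustrative and not taken from TR11 or the CSS draft.

```python
import unicodedata

# Hypothetical override table: code point ranges whose vertical orientation
# is listed explicitly instead of being derived from East Asian Width.
UPRIGHT_OVERRIDES = [
    (0x1100, 0x11FF),  # Hangul Jamo block (conjoining jamo)
]

def orientation(ch: str) -> str:
    """Return 'upright' or 'sideways' for a single character."""
    cp = ord(ch)
    for first, last in UPRIGHT_OVERRIDES:
        if first <= cp <= last:
            return "upright"
    # Fallback (simplified): wide and fullwidth upright, everything else rotated.
    eaw = unicodedata.east_asian_width(ch)
    return "upright" if eaw in ("W", "F") else "sideways"

print(orientation("\u1161"))  # HANGUL JUNGSEONG A, covered by the override
print(orientation("\ud55c"))  # precomposed syllable, wide per EAW
print(orientation("A"))       # ASCII letter, narrow per EAW
```

If a revised EAW later gives the same answers on its own, the table can be emptied without touching the fallback.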

A./

On 3/14/2011 11:00 AM, Richard Ishida wrote:
> Coming at this from the use case requirements rather than trying to 
> work it out from the implementation details:
>
> Take the word 한글 (hangul).  I can obviously write this
>
> 한  D55C  [Hangul Syllables]
> 글  AE00  [Hangul Syllables]
>
> and I'd expect this to show two syllabic glyphs vertically arranged.
>
> However, I could equally well, in memory, have the following:
>
> ᄒ  1112  HANGUL CHOSEONG HIEUH
> ᅡ  1161  HANGUL JUNGSEONG A
> ᆫ  11AB  HANGUL JONGSEONG NIEUN
> ᄀ  1100  HANGUL CHOSEONG KIYEOK
> ᅳ  1173  HANGUL JUNGSEONG EU
> ᆯ  11AF  HANGUL JONGSEONG RIEUL
>
> Which should, when displayed, look exactly the same. My assumption is 
> that any non-separated sequence of characters constituting a syllable 
> or part of a syllable from the Unicode hangul jamo block would combine 
> and therefore be displayed without rotation.
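
The equivalence between the precomposed and the conjoining-jamo spellings can be checked with Unicode normalization. This is a small Python sketch using the standard `unicodedata` module; it says nothing about how a particular font renders the sequence, only that the two spellings are canonically equivalent.

```python
import unicodedata

precomposed = "\ud55c\uae00"  # the word as two Hangul Syllables code points

# NFD splits each syllable into conjoining jamo; NFC puts them back together.
jamo = unicodedata.normalize("NFD", precomposed)
recomposed = unicodedata.normalize("NFC", jamo)

# List the jamo that make up the word.
for ch in jamo:
    print(f"{ord(ch):04X}  {unicodedata.name(ch)}")

print(len(precomposed), len(jamo), recomposed == precomposed)  # 2 6 True
```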
>
> Since the font should combine these characters into two 
> two-dimensional syllabic arrangements, I don't know whether it's 
> necessary to specify placement in terms of grapheme clusters, or to 
> just assume that the font will take care of this anyway.
>
> If I wanted to list the jamo involved in such a word, for say an 
> educational or linguistic text, I'd actually have to do something like 
> this:
>
> ᄒ  1112  HANGUL CHOSEONG HIEUH
> ​  200B  ZERO WIDTH SPACE
> ᅡ  1161  HANGUL JUNGSEONG A
> ​  200B  ZERO WIDTH SPACE
> ᆫ  11AB  HANGUL JONGSEONG NIEUN
> ​  200B  ZERO WIDTH SPACE
> ᄀ  1100  HANGUL CHOSEONG KIYEOK
> ​  200B  ZERO WIDTH SPACE
> ᅳ  1173  HANGUL JUNGSEONG EU
> ​  200B  ZERO WIDTH SPACE
> ᆯ  11AF  HANGUL JONGSEONG RIEUL
>
> to stop them combining visually.
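
As a rough proxy for the visual effect, normalization shows the same thing: jamo separated by U+200B no longer compose into a syllable. This Python sketch uses only the standard library. Whether a given font or shaping engine behaves the same way is an assumption it does not test.

```python
import unicodedata

ZWSP = "\u200b"
jamo = "\u1112\u1161\u11ab"      # HIEUH, A, NIEUN
separated = ZWSP.join(jamo)      # a ZERO WIDTH SPACE between each jamo

# Adjacent jamo compose into one syllable; separated jamo are left alone.
print(len(unicodedata.normalize("NFC", jamo)))       # 1
print(len(unicodedata.normalize("NFC", separated)))  # 5
```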
>
> The important question, to my mind, is whether the characters are 
> rotated when they occur either individually or separated as above.  My 
> guess is no. (There is also the question of whether you would ever find 
> this in vertical text, but I think we must assume that someone might 
> want to do so at some time.)
>
> I assume that we should care less about the hangul compatibility 
> characters since they shouldn't be used anyway. But since they do not 
> lead to this combining behaviour when they are used, it makes sense 
> that they are non-rotated.
>
>
> RI
>
>
>
> On 09/03/2011 23:09, Koji Ishii wrote:
>> Hello,
>>
>> Would you mind helping me resolve a question in the CSS3 Writing 
>> Modes spec?
>>
>> I'm trying to figure out which characters are displayed upright and 
>> which are rotated sideways in vertical text flow. I understand 
>> vertical text flow isn't very important for Hangul, but I hope you 
>> understand I want to write the correct spec in case you need it.
>>
>> The current idea is described in the spec[1], in the paragraphs after 
>> Figure 10. The basic idea is to use a combination of font information, 
>> the Unicode Script Property[2], and Unicode East Asian Width[3].
>>
>> EAW (Unicode East Asian Width) defines character orientation like 
>> this in its Recommendation section[4]:
>> * Wide characters ... are not rotated (and therefore rendered 
>> upright) when appearing in vertical text runs.
>> * Narrow characters ... are rotated sideways, when appearing in 
>> vertical text.
>>
>> If I look into the data file[5], most Hangul characters are W(ide), 
>> so they are rendered upright in vertical text flow according to the 
>> Unicode definitions. I suppose this is what you expect.
>>
>> However, many of the HANGUL JONGSEONG characters are marked as N, and 
>> therefore they must be rotated sideways in vertical text flow if we 
>> follow this rule.
>>
>> 115F;W # HANGUL CHOSEONG FILLER
>> 1160;N # HANGUL JUNGSEONG FILLER
>> 1161;N # HANGUL JUNGSEONG A
>> 1162;N # HANGUL JUNGSEONG AE
>> 1163;N # HANGUL JUNGSEONG YA
>> ...
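
Entries in this `code;property # comment` shape are easy to read mechanically. The Python sketch below parses such lines and applies the TR11 wide/narrow rule quoted above; the function name is hypothetical, the `..` range form is an assumption about the file format, and treating every non-wide class as sideways is a simplification.

```python
def parse_eaw_line(line: str):
    """Parse 'XXXX;P # comment' or 'XXXX..YYYY;P # comment'; None if no data."""
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    code_points, prop = (field.strip() for field in data.split(";"))
    first, _, last = code_points.partition("..")
    return int(first, 16), int(last or first, 16), prop

SAMPLE = """\
115F;W # HANGUL CHOSEONG FILLER
1160;N # HANGUL JUNGSEONG FILLER
1161;N # HANGUL JUNGSEONG A
"""

for line in SAMPLE.splitlines():
    first, last, prop = parse_eaw_line(line)
    # Simplified TR11 rule: wide/fullwidth upright, everything else rotated.
    rule = "upright" if prop in ("W", "F") else "sideways"
    print(f"U+{first:04X}  {prop}  {rule}")
```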
>>
>> I'm guessing this is NOT what you expect. Can anyone in this ML help 
>> me to resolve this situation? Possible answers I'm guessing are:
>>
>> 1. Unicode EAW is correct; these code points should be rotated 
>> sideways in vertical text flow.
>> 2. Unicode EAW is incorrect; these code points should be "W", not "N".
>> 3. There are reasons to mark these code points as "N", so EAW is 
>> correct, but "Narrow are rotated sideways" is incorrect.
>>
>> Which one is it, or is it something else? I asked Soonbo Han from LG 
>> about this at CSSWG; he thinks the answer is not 1, but he wasn't sure 
>> whether it's 2, 3, or something else.
>>
>> Your support is greatly appreciated.
>>
>>
>> Regards,
>> Koji
>>
>> [1] http://dev.w3.org/csswg/css3-writing-modes/#text-orientation
>> [2] http://unicode.org/reports/tr24/
>> [3] http://unicode.org/reports/tr11/
>> [4] http://unicode.org/reports/tr11/#Recommendations
>> [5] http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt
>>
>>
>
Received on Monday, 14 March 2011 18:32:48 GMT
