Re: [CSS3 Text] Tibetan justification

This mail is forwarded on behalf of Chris Fynn.

Felix

> fantasai wrote:
> 
>> C J Fynn wrote:
>>>
>>> Hi
>>>
>>> The working draft of the CSS3 Text Module
>>> <http://fantasai.inkedblade.net/style/specs/css3-text/scratchpad> says:
>>>
>>> "tibetan
>>>     Justification primarily stretches spaces after shad if the line contains any and/or pads the end of the line with tsek marks if the line already ends in one."
>>>
>>> 1. "spaces after shad" needs to include spaces following the letters KA U+0F40 and GA U+0F42 (with or without combining vowels) since the shad is not written after these two characters (due to the long descenders on the right side of their glyphs).
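
A minimal sketch of the rule in point 1, assuming a line is handled as a plain string of code points. The name `stretchable_spaces` and the choice of vowel signs (U+0F71 to U+0F7D plus U+0F80 and U+0F81) are assumptions made for illustration, not taken from the draft; stacks with subjoined letters are not handled.

```python
SHAD = "\u0F0D"              # TIBETAN MARK SHAD
KA, GA = "\u0F40", "\u0F42"  # letters after which the shad is not written

# Assumption: these code points stand in for "combining vowels".
VOWEL_SIGNS = {chr(c) for c in range(0x0F71, 0x0F7E)} | {"\u0F80", "\u0F81"}


def stretchable_spaces(line):
    """Return indices of spaces that follow a shad, or KA/GA plus optional vowel signs."""
    result = []
    for i, ch in enumerate(line):
        if ch != " ":
            continue
        j = i - 1
        if j >= 0 and line[j] == SHAD:
            result.append(i)
            continue
        # Skip back over any vowel signs attached to the preceding letter.
        while j >= 0 and line[j] in VOWEL_SIGNS:
            j -= 1
        if j >= 0 and line[j] in (KA, GA):
            result.append(i)
    return result
```

A space after any other letter is not reported, so it would not be treated as a primary stretch point under this rule.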
>>
>> Thanks for pointing that out, I'd forgotten to include that exception.
>>
>>> 2. Traditionally manuscript and xylograph printed Tibetan texts were "justified" by padding lines with multiple tsek (U+0F0B) marks. This was necessary as calculating the amount of extra space needed for padding lines was impractical when writing text by hand or carving woodblocks.
>>>
>>> Today this practice is insisted on by one or two pedantic westerners who have seen it in old texts and think therefore it should be maintained.
>>
>> Ok. I've seen this in a good handful of newly-printed books as well as a
>> Tibetan newspaper in the National Library of China, so whether or not it's
>> the fault of a couple pedantic westerners, it is still in use. However,
>> you are not the only one who sent in a comment suggesting that the value
>> be dropped. After talking with Paul Nelson, we've decided to publish the
>> next official draft of CSS3 Text with the value defined, but note that it
>> will most likely be dropped in the next revision. If there are no objections
>> to that, we'll remove it.
> 
> Yes, padding with tseks is sometimes still used because it is currently often a real hassle to achieve justification any other way with lines of Tibetan text containing no spaces. Up to now most word processors allow you to adjust character (glyph) spacing, but the user often has to do this line by line, with different values for each line, to achieve justification in Tibetan text. It is much easier to move the cursor down the right-hand side of the page and add tseks where needed to pad the line, so that's what many people do.
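
A rough sketch of the tsek padding described above, following the draft's wording that a line is padded only if it already ends in a tsek. The function name `pad_with_tseks`, the `measure` callback, and the `tsek_width` parameter are hypothetical; a real implementation would take widths from the font.

```python
TSEK = "\u0F0B"  # TIBETAN MARK INTERSYLLABIC TSHEG


def pad_with_tseks(line, line_width, measure, tsek_width):
    """Append as many tseks as fit within line_width to a line that ends in a tsek."""
    if not line.endswith(TSEK):
        return line  # per the draft text, only lines ending in a tsek are padded
    remaining = line_width - measure(line)
    count = int(remaining // tsek_width) if remaining > 0 else 0
    return line + TSEK * count
```

Any leftover width smaller than one tsek is simply left unfilled here.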
> 
> BTW many of the modern Tibetan books published in China were until recently set using a very elaborate typewriter to produce camera-ready copy for the body text. There are also a couple of computer-based publishing systems still in use by publishers there which in some ways are not very sophisticated, since they were primarily designed for Chinese, which usually has mono-width glyphs. Things are of course slowly improving.
> 
> 
>>> However in my experience native Tibetan and Bhutanese users invariably prefer normally justified text when setting Tibetan on computers. Since space characters are infrequent in Tibetan (and sometimes do not occur even in a long line of text) this is best achieved by both stretching spaces and by slightly increasing the width of the glyph for tsek characters (which follow every syllable).
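
A minimal sketch of this kind of justification, assuming the extra width needed to fill the line (`slack`) is already known. The function name `distribute_slack` and the 4:1 weighting between spaces and tseks are assumptions chosen only for illustration; the final character of the line is skipped so that no extra width is added at the line end.

```python
TSEK = "\u0F0B"  # TIBETAN MARK INTERSYLLABIC TSHEG


def distribute_slack(line, slack, space_weight=4.0, tsek_weight=1.0):
    """Map character index -> extra advance, shared among spaces and tseks by weight."""
    weights = {}
    # The last character is excluded: widening it would not help fill the line.
    for i, ch in enumerate(line[:-1]):
        if ch == " ":
            weights[i] = space_weight
        elif ch == TSEK:
            weights[i] = tsek_weight
    total = sum(weights.values())
    if total == 0:
        return {}  # no justification opportunities on this line
    return {i: slack * w / total for i, w in weights.items()}
```

With these weights a line with no spaces still justifies, because the tseks alone absorb the slack.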
>>
>> Yes, this is the justification I saw in the rest of the Tibetan books I found.
>> There was a slight bit of extra space after every tsek mark in a justified
>> line. However, as I noted in the word-spacing section
>>   http://fantasai.inkedblade.net/style/specs/css3-text/scratchpad#word-spacing
>> I'm not sure if that extra space should ideally be after the tsek mark or
>> distributed on both sides of it. If you've got some advice on that, too, I'd appreciate it. 
> 
> Probably both sides - at least with normal Tibetan (dbu can) script. One caveat is that with various cursive forms of Tibetan script a superscribed vowel over a base stack may connect with a following tsek (when writing, the two are drawn with a single stroke) - in these cases distributing the space on both sides of the tsek might cause the vowel mark to shift slightly to the right. Mind you, the increment is likely to be so small it would be unnoticeable.
> 
>> (The 'inter-word' keyword, as currently defined, would invoke this behavior.)
> 
> Should be OK.
> 
>>> [It should be noted that these tsek characters (U+0F0B) also provide the
>>> primary line break opportunity in Tibetan and Dzongkha text.]
>>
>> [Noted, although CSS3 Text doesn't cover line breaking rules; UAX14 does.]
> 
> OK. The finer points of Tibetan line breaking rules are fairly complex. However the main things are: tsek characters (U+0F0B) provide the primary break opportunity; lines of Tibetan text normally don't break at spaces; and there is a break opportunity after U+0F0D or U+0F0E when either of these characters follows a space and precedes a base consonant.
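
A simplified sketch of just the three main rules listed above, and not of UAX14 itself. The function names and the approximation of "base consonant" as the letter range U+0F40 to U+0F6C are assumptions for illustration; the finer points of Tibetan line breaking are ignored.

```python
TSEK = "\u0F0B"
SHADS = ("\u0F0D", "\u0F0E")  # shad and nyis shad


def is_base_consonant(ch):
    # Assumption: approximate base consonants by the Tibetan letter range.
    return "\u0F40" <= ch <= "\u0F6C"


def break_opportunities(text):
    """Return indices i where a line may break between text[i-1] and text[i]."""
    breaks = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if prev == TSEK:
            breaks.append(i)  # primary opportunity: after a tsek
        elif (prev in SHADS and i >= 2 and text[i - 2] == " "
              and is_base_consonant(cur)):
            breaks.append(i)  # after a shad that follows a space and precedes a consonant
        # No branch for spaces: lines normally do not break there.
    return breaks
```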
> 
> If that's what UAX14 suggests for a basic implementation of Tibetan line breaking, it should be sufficient for everyday word processing needs & web pages. Only dedicated page layout software might need to handle the more esoteric rules.
> 
>> Thank you for your comments.
> 
> Thank _you_ for working on handling Tibetan script in CSS3.
> 
> - Chris
> 
>> ~fantasai

Received on Friday, 12 January 2007 07:23:52 UTC