Re: [csswg-drafts] [css-text] shaping breaks and typographic characters (#699)

@r12a:
Historically, underlining was technically unwise for the Indic scripts of Indospheric SE Asia - straight lines risk splitting the palm leaf the text is written on.  Writing on paper (or cardboard) makes a difference.  I have one example of a Tai Tham heading underlined in the Western fashion, but it doesn't really suit the character style.  I've also got examples of mixed Thai/Tai Tham headings being underlined, which works when the Tai Tham is a single letter.  The author seems to have abandoned underlining the Tai Tham part when he got to the letters which are stacks starting with HIGH HA.

I'd be hesitant to say that Tai Tham joins words within a stack - the joins are either like English contractions as in "So've I" or combinations of alliterating words which arguably form a single lexeme.

There's the interesting case of Sanskrit, and to a lesser extent Pali, where a word can begin inside an indecomposable Unicode *character*, never mind a stack.  However, note the similar behaviour of quadrates in Egyptian hieroglyphs, which tend not to respect word boundaries.  When a word is to be emphasised by a cartouche, the quadrate structure suddenly respects the word boundaries so that the quadrate and cartouche boundaries do not conflict.

From a historical point of view, the Devanagari half-forms are weird.  Historically, the 'invisible' virama belongs rather with the following character, as with the Tibetan subscript consonants.  This is manifest in the scripts for which virama+consonant has an alternative form, generally with a distinct usage pattern, which is encoded as an indivisible subscript consonant.  When C1 half-forms are not used, I do not believe the formal grapheme cluster boundary corresponds to anything in the Unicode-unaware user's mind.
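
To make the encoding point concrete, here is a minimal Python sketch using only the standard `unicodedata` module.  The conjunct KA + VIRAMA + SSA and the helper name `describe` are my own choices for illustration.  It only shows how the characters are encoded: in Devanagari the virama is a separate combining mark stored after C1, whereas Tibetan encodes the subscript consonant as a single character.  It says nothing about how any particular segmenter places cluster boundaries.

```python
import unicodedata

# Illustrative Devanagari conjunct (an assumption for this sketch):
# C1 = KA, then VIRAMA, then C2 = SSA.
conjunct = "\u0915\u094D\u0937"

# A Tibetan subjoined (subscript) consonant, encoded as one character.
tibetan_subjoined_ka = "\u0F90"


def describe(text):
    """Return (code point, Unicode name, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


for row in describe(conjunct):
    print(row)

for row in describe(tibetan_subjoined_ka):
    print(row)
```

In the encoded order the virama sits between the two consonants and, being a combining mark, is attached to C1 by the character data, even though the argument above is that it historically belongs with the consonant that follows.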

-- 
GitHub Notification of comment by Richard57
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/699#issuecomment-449586251 using your GitHub account

Received on Saturday, 22 December 2018 17:33:24 UTC