Re: [jlreq-d] Proposal: making ruby breakable (#19)

I have some concerns about this, mostly having to do with the complexities of layout and how to adjust the spacing when a given ruby run is broken across lines.

1) Are we to assume the author is providing valid break points for both the ruby and the base text? A grouped ruby run, where the entire base text run is annotated by an entire ruby run (and not word by word, where there is an established word break for both), is not in and of itself easily breakable. Yes, Japanese text can generally be broken anywhere, but this is not true if the intent of the ruby annotation is to give the correct reading for the word on a character-by-character basis. Witness the convention of only providing ruby for the first instance of an unusual Kanji: this implies the need to associate a given ruby sub-string with its corresponding base text sub-string. For such cases Jukugo ruby is the appropriate usage, providing for sub-string-sensitive line breaks. Extending this understanding to Group ruby, one naturally assumes the same rules apply, and the sub-string becomes the entire base run and the entire ruby run (unbreakable, since the ruby is intended to annotate the entire base string and not a sub-run). Put another way, any sub-run should stand alone as a valid reading for that sub-run even outside the context of the group ruby word.
If, on the other hand, this proposal is comfortable throwing that away and assuming the reader will be fine with an arbitrary break in the ruby and an arbitrary but legal break in the base text, and will be able to continue the ruby reading and parse it as a single word broken across lines, then we get to my second concern...

Spacing rules for long ruby dictate that the base text should be tracked out so the long ruby doesn't overhang adjacent words or punctuation (again because the correspondence of the ruby to the base text is assumed to be very important). Changing the break point of a group ruby could mean that the unbroken width ≠ the broken width of the sub-runs, depending on where the break occurs in the ruby and in the base text. The engine must therefore compute both unbroken and broken widths for all possible break points, which could be onerous. But even if it were cheap, the third concern is:

What are the rules for spacing and overhang for broken Group ruby? Jukugo ruby has distinct and known rules, but broken Group ruby is necessarily different: the sub-run is not a legal word and is not legally annotated (meaning the annotation is not correct outside the context of the group). So should it be spaced and allowed to overhang as if it were (just like Jukugo), or are there different rules that aid the reader in parsing such broken words?
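
To make the second concern concrete, here is a minimal sketch of the width comparison. It assumes a deliberately simplified model: no overhang is allowed, so each run (or broken fragment) occupies the wider of its base text and its ruby text, with the base tracked out when the ruby is longer. The names `run_width` and `broken_widths` and the sample advance widths are hypothetical and do not come from any layout engine or specification.

```python
def run_width(base_widths, ruby_widths):
    """Width of one ruby run under the simplified no-overhang model:
    the run is as wide as the wider of its base text and its ruby text."""
    return max(sum(base_widths), sum(ruby_widths))


def broken_widths(base_widths, ruby_widths):
    """Enumerate every pair of interior break points (one in the base,
    one in the ruby) and return the width of each resulting fragment.

    Each result is (base_break, ruby_break, width_before, width_after).
    """
    results = []
    for i in range(1, len(base_widths)):       # break after base char i
        for j in range(1, len(ruby_widths)):   # break after ruby char j
            before = run_width(base_widths[:i], ruby_widths[:j])
            after = run_width(base_widths[i:], ruby_widths[j:])
            results.append((i, j, before, after))
    return results


# Hypothetical example: two base characters annotated by five ruby characters.
base = [10, 10]
ruby = [5, 5, 5, 5, 5]

unbroken = run_width(base, ruby)
for i, j, before, after in broken_widths(base, ruby):
    print(f"base break {i}, ruby break {j}: "
          f"{before} + {after} = {before + after} (unbroken: {unbroken})")
```

Even in this toy model the fragment widths depend on where the ruby break falls relative to the base break, and their sum does not always equal the unbroken width. The number of candidates also grows with the product of the base and ruby lengths, before any real spacing or overhang rules are applied.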

-- 
GitHub Notification of comment by macnmm
Please view or discuss this issue at https://github.com/w3c/jlreq-d/issues/19#issuecomment-1222761395 using your GitHub account



Received on Monday, 22 August 2022 18:31:34 UTC