Re: [jlreq-d] Proposal: making ruby breakable (#19)

Thank you Nat for sharing your concerns.

>Are we to assume the author is providing valid break points for both ruby and base text?

No, providing a subdivision is optional, because providing subdivisions is labour-intensive right now. I am hoping that in the future we will have text engines that are aware of the content of the text and provide sophisticated breaking points automatically.

>…if the intent of ruby annotation is to give the correct reading for the word on a character-by-character basis. Witness the convention of only providing ruby for the first instance of an unusual Kanji—this implies the need to associate a given ruby sub-string with its corresponding base text sub-string. For such cases Juku-go ruby is the appropriate usage, providing for sub-string-sensitive line breaks.

Today, showing the reading of a word on a character-by-character basis is not possible unless you make each character a mono-ruby. Jukugo ruby is a superior method, but most implementations do not support it. I am hoping that the proposal, with its expanded use of subdivisions, makes it more appealing to the developer community.

>Extending this understanding out to include Group ruby, one naturally assumes the same rules apply and the sub-string becomes the entire base run and entire ruby run (unbreakable, since the ruby is intended to annotate the entire base string and not a sub-run). Put another way, any sub-run should stand alone as a valid reading for that sub-run even outside the context of the group ruby word.

Right, the subdivision is to be used as you mentioned. Whether a specific reading can be applied outside the context depends on the case: for example, there are cases where a Kanji is read in a certain way only within a particular jukugo. 'Rendaku' is another such case.

>If, on the other hand, this proposal is comfortable throwing that away and assuming the reader will be fine with reading an arbitrary break in the ruby, an arbitrary but legal break in the base text, and be able to continue the ruby reading and parse it as a single word broken across lines,

Right, it is a compromise, as I mentioned in the proposal: a compromise between a large justification amount, with wide spaces between characters, and a strange division of rubies when a long ruby hangs at the end of a line.

At the same time, the proposal provides a way to restrict breaks to very long rubies only, or to prohibit breaks entirely. By default, a ruby is unbreakable when its length is two fullwidth characters or less.
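
As a rough illustration of that default, here is a minimal Python sketch. The names `is_breakable`, `DEFAULT_THRESHOLD_EM` and `NEVER_BREAK` are hypothetical and do not come from the proposal, and it assumes the length is measured in fullwidth character widths (em); only the two-character default is taken from the text above.

```python
import math

# From the reply: a ruby of two fullwidth characters or less is unbreakable by default.
DEFAULT_THRESHOLD_EM = 2.0

# Hypothetical way to prohibit breaks entirely: no finite length exceeds it.
NEVER_BREAK = math.inf


def is_breakable(length_em: float, threshold_em: float = DEFAULT_THRESHOLD_EM) -> bool:
    """Return True if a ruby of this length may be broken across lines.

    A ruby is breakable only when it is strictly longer than the threshold.
    Raising the threshold restricts breaks to very long rubies.
    """
    return length_em > threshold_em
```

With this shape, `is_breakable(2.0)` is false, `is_breakable(2.5)` is true, and passing `NEVER_BREAK` as the threshold turns breaking off for any length.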

>the engine must therefore compute both unbroken and broken widths for all possible break points, which could be onerous.

Not sure. If you know the breaking opportunities for the base text (which you already know) and for the ruby text (new), you can figure out the optimal breaking point relatively easily.
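
To make this concrete, here is a minimal Python sketch of one way such a search could look. It is not the algorithm from the proposal. It assumes that breaks are allowed only at subdivision boundaries, that each base character is 1 em wide and each ruby character 0.5 em wide, and that "optimal" means keeping as many subdivisions as possible on the current line. The names `Subdivision`, `run_width` and `choose_break` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Subdivision:
    base: str  # base text of this subdivision
    ruby: str  # ruby text annotating that base text


def run_width(parts, ruby_scale=0.5):
    """Width in em of subdivisions laid out together as one ruby.

    Assumption: the run is as wide as the wider of its base text
    and its (scaled-down) ruby text.
    """
    base_width = sum(len(p.base) for p in parts)
    ruby_width = sum(len(p.ruby) for p in parts) * ruby_scale
    return max(base_width, ruby_width)


def choose_break(parts, available_em, ruby_scale=0.5):
    """Return how many leading subdivisions stay on the current line.

    0 means the whole ruby moves to the next line;
    len(parts) means the ruby fits and no break is needed.
    """
    best = 0
    for k in range(1, len(parts) + 1):
        # Only one width per candidate boundary is computed.
        if run_width(parts[:k], ruby_scale) <= available_em:
            best = k
    return best


# Example: 東京 annotated as 東/とう + 京/きょう.
tokyo = [Subdivision("東", "とう"), Subdivision("京", "きょう")]
print(choose_break(tokyo, available_em=2.0))
```

Under these assumptions the work grows with the number of subdivision boundaries, not with every character position in the base and ruby text.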

>What are the rules for spacing and overhang for broken Group ruby? Jukugo ruby has distinct and known rules, but broken Group ruby is necessarily different—the sub-run is not a legal word, not legally annotated (meaning it is not a correct annotation outside the context of the group), so should it be spaced and overhang, or not, as if it were (just like Jukugo), or are there different rules that aid the reader in parsing such broken words?

Unless a break happens, all subdivisions are put together and they behave like one monolithic ruby. I will clarify this point in the proposal.

When a ruby (e.g. a group ruby) is broken across two lines, I assume each part will behave like an independent ruby, one at the end of a line and the other at the beginning of the next line. This might be an area where we can come up with a better layout recommendation to improve the readability of broken rubies. Bin-sensei mentioned that there are books with such examples. We could extract something from these books and/or discuss with people who are working with such a layout.

-- 
GitHub Notification of comment by kidayasuo
Please view or discuss this issue at https://github.com/w3c/jlreq-d/issues/19#issuecomment-1223469105 using your GitHub account



Received on Tuesday, 23 August 2022 02:59:42 UTC