Re: [jlreq] [META] Reorganize character classes and its adoption of Unicode based definition (#240)

Shimono-san, thank you very much for bringing this up and making a summary.

Expanding JLReq to Unicode, or more generally making JLReq interoperable with Unicode, is I think the biggest challenge in bringing JLReq to the next level (or the next major version). It is about making JLReq future-compatible.

It is a rather complex task, as Shimono-san outlined. A JLReq character class is a combination of a static property of the character and the context in which the character is used. We need to separate the context from the static property. This is a major architectural change that requires many rewrites.

Then we would redefine the JLReq character classes using Unicode character properties. There might be cases where the current Unicode properties are not sufficient to differentiate the necessary behaviours.
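
To make the separation concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a proposal for the actual definitions: the class names, the `Context` fields and the mapping table are all hypothetical, and only the Unicode General_Category (via the standard `unicodedata` module) is consulted. It also shows one case where that single property is too coarse: the ideographic comma, the ideographic full stop and the katakana middle dot are all `Po`, although JLReq places them in separate character classes.

```python
import unicodedata
from dataclasses import dataclass

# Hypothetical mapping from Unicode General_Category to a context-free class.
# The class names are illustrative only; they are not JLReq definitions.
STATIC_CLASS_BY_GC = {
    "Ps": "opening-bracket",
    "Pe": "closing-bracket",
    "Po": "other-punctuation",
}


def static_class(ch: str) -> str:
    """Classify a character from its static Unicode property alone."""
    return STATIC_CLASS_BY_GC.get(unicodedata.category(ch), "unclassified")


@dataclass(frozen=True)
class Context:
    """Hypothetical context record, kept separate from the static class."""
    at_line_start: bool
    at_line_end: bool


def context_of(line: str, index: int) -> Context:
    """Derive the context from the position in the line, not from the character."""
    return Context(at_line_start=(index == 0),
                   at_line_end=(index == len(line) - 1))


def classify(line: str, index: int) -> tuple[str, Context]:
    """Return the static class and the context as two independent values."""
    return static_class(line[index]), context_of(line, index)


if __name__ == "__main__":
    line = "「あ、い。」"
    for i, ch in enumerate(line):
        print(ch, unicodedata.category(ch), *classify(line, i))
```

A layout rule would then be written against the pair (static class, context) rather than against a single combined class. The three `Po` characters above would need some finer property before rules could tell them apart.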

In the process we might find cases where JLReq can be simplified (especially because the next major version will be devoted to digital text). I also believe there will be cases where we need clearer ideas on how each character, especially symbols, is to be used. That will lead to some guideline-like descriptions in the document (we need to be careful here, because we are not in a position to define the orthography of the language).


-- 
GitHub Notification of comment by kidayasuo
Please view or discuss this issue at https://github.com/w3c/jlreq/issues/240#issuecomment-705388277 using your GitHub account

Received on Thursday, 8 October 2020 07:32:21 UTC