Re: [i18n-activity] Use of CGJ

I submitted long individual feedback. Here's the part related to this issue:

----

# 4. Consequences of the Algorithm: Semantics

When UAOA is applied to text during rendering, some distinct strings collapse into a single sequence. That is, there are many pairs of strings X and Y where toNFC(X) ≠ toNFC(Y), but UAOA(toNFC(X)) = UAOA(toNFC(Y)).
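To make the collapse concrete, here is a minimal sketch. It does not implement UAOA: `toy_reorder` and `PRIORITY_MARKS` are hypothetical stand-ins that just move one chosen mark to the front of each run of combining marks, which is enough to show two strings that NFC keeps distinct ending up identical after a rendering-time reordering.

```python
import unicodedata

# Hypothetical stand-in for the marks a reordering step would move forward.
# This is NOT the real UAOA mark list, only an illustration.
PRIORITY_MARKS = {"\u0654"}  # ARABIC HAMZA ABOVE (ccc=230)


def toy_reorder(text: str) -> str:
    """Move PRIORITY_MARKS to the front of each run of combining marks."""
    out, run = [], []
    for ch in text:
        if unicodedata.combining(ch) != 0:
            run.append(ch)  # still inside a run of combining marks
        else:
            # A starter (ccc=0) ends the run; sorted() is stable, so all
            # other marks keep their relative order.
            out.extend(sorted(run, key=lambda c: c not in PRIORITY_MARKS))
            run = []
            out.append(ch)
    out.extend(sorted(run, key=lambda c: c not in PRIORITY_MARKS))
    return "".join(out)


def nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)


# BEH followed by two ccc=230 marks, in both orders. NFC never reorders
# marks that share a combining class, so the two strings stay distinct.
x = "\u0628\u0610\u0654"
y = "\u0628\u0654\u0610"

print(nfc(x) != nfc(y))                            # True
print(toy_reorder(nfc(x)) == toy_reorder(nfc(y)))  # True
```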

This changes the semantics of existing text encoded in Unicode, since that text will render differently afterwards. The document is not clear about this semantic change and only claims to be “correcting” all the problems.

The proposal suggests using CGJ to preserve the old semantics when needed. The document needs to be clearer about how to do this. In fact, there should be a clear algorithm for converting a string X so that its semantics are preserved under the new (rendering) interpretation. For a couple of decades users have been storing text under the current semantics of the encoding, which has been the only way to do so recommended by Unicode.
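For reference, this is how CGJ already behaves under plain NFC, sketched with Python's standard `unicodedata` module. CGJ (U+034F) has combining class 0, so canonical reordering cannot move marks across it. Whether the same trick cleanly preserves the old rendering under UAOA is the proposal's claim, not something this snippet demonstrates.

```python
import unicodedata

CGJ = "\u034F"  # COMBINING GRAPHEME JOINER, ccc=0

# BEH + SHADDA (ccc=33) + FATHA (ccc=30): NFC sorts the marks by
# combining class, so FATHA ends up before SHADDA.
plain = "\u0628\u0651\u064E"
print(unicodedata.normalize("NFC", plain) == "\u0628\u064E\u0651")  # True

# With CGJ between the marks, each mark sits in its own run, so NFC
# leaves the stored order untouched.
guarded = "\u0628\u0651" + CGJ + "\u064E"
print(unicodedata.normalize("NFC", guarded) == guarded)  # True
```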


-- 
GitHub Notification of comment by behnam
Please view or discuss this issue at https://github.com/w3c/i18n-activity/issues/498#issuecomment-336578221 using your GitHub account

Received on Friday, 13 October 2017 22:02:47 UTC