[charmod-norm] Allow identity matching without decoding to scalar values (#225)

aphillips has just created a new issue for https://github.com/w3c/charmod-norm:

== Allow identity matching without decoding to scalar values ==
RDF is struggling to find a definition for "Unicode string" that fits its needs. Addison has made some suggestions, but notes that we don't provide a string definition. In addition, Charmod-norm doesn't provide the specific "code unit vs. code point" performance "out" (permission to compare code units instead of code points) being discussed here.

*    [RDF issue](https://github.com/w3c/rdf-concepts/issues/51)
*    [RDF pull request](https://github.com/w3c/rdf-concepts/issues/59)
*    https://lists.w3.org/Archives/Public/public-i18n-core/2023JulSep/0105.html
*    [definition discussion](https://github.com/w3c/rdf-concepts/issues/51#issuecomment-1699884210)

Particularly germane to this might be the suggestion:

> Provide normative text to allow for the efficient comparison of strings, along the lines of:
>>    A string is identical to another string if it consists of the same sequence of code points. An implementation MAY determine string equality by comparing the code units of two strings using the same Unicode character encoding form (UTF-8 or UTF-16) without decoding the string into a scalar value sequence.
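
As a rough illustration of what the suggested text would permit, here is a minimal Python sketch. The function names `identical_by_code_points` and `identical_by_code_units` are hypothetical and do not come from Charmod-norm or RDF. The sketch assumes both strings are well-formed and are stored in the same encoding form (and, for UTF-16, the same byte order). Under those assumptions, comparing the encoded buffers should give the same answer as comparing code point sequences.

```python
def identical_by_code_points(a: str, b: str) -> bool:
    """Reference definition: the same sequence of code points."""
    return [ord(ch) for ch in a] == [ord(ch) for ch in b]


def identical_by_code_units(a: bytes, b: bytes) -> bool:
    """Shortcut: compare the encoded buffers directly, with no decoding step.

    Assumption: both buffers are well-formed and use the same encoding
    form (and, for UTF-16, the same byte order).
    """
    return a == b


# U+00E9 (precomposed) vs. "e" + U+0301 (combining acute): these look alike
# but are different code point sequences, so neither check treats them as
# identical. Normalization is a separate question from identity matching.
samples = ["caf\u00e9", "cafe\u0301", "\U0001F600", "abc"]

for form in ("utf-8", "utf-16-le"):
    for s in samples:
        for t in samples:
            by_units = identical_by_code_units(s.encode(form), t.encode(form))
            assert by_units == identical_by_code_points(s, t)
```

The `encode` calls are only there to produce test buffers. In an implementation that already holds its strings as UTF-8 or UTF-16, the comparison would run on the stored code units as they are.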

This is a topic for the 2023-08-31 call.

Please view or discuss this issue at https://github.com/w3c/charmod-norm/issues/225 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Wednesday, 30 August 2023 21:55:15 UTC