[string-search] IVS in string searching (#21)

xfq has just created a new issue for https://github.com/w3c/string-search:

== IVS in string searching ==
In an IVS (Ideographic Variation Sequence), the variation selector (VS) and the preceding code point are displayed and used together, so they should be treated as a single "character". In string searching operations, the first code point (the base character) should be the basis for matching. For example, when two Han characters appear together with a VS between them, such as "龍VS天" (U+9F8D U+E0100 U+5929), searching for "龍天" should match that content. If necessary, the browser could also offer a preference to enable "precise" string searching.
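
To make the proposed behavior concrete, here is a minimal sketch in Python. It is not how any browser implements find-in-page. The function names and the `precise` flag are hypothetical, and the sketch assumes that only the selectors U+E0100 to U+E01EF (the range used by IVSes) are ignored.

```python
def is_ivs_selector(ch: str) -> bool:
    """True for VS17..VS256 (U+E0100..U+E01EF), the selectors used in IVSes."""
    return 0xE0100 <= ord(ch) <= 0xE01EF


def find_ignoring_ivs(text: str, query: str, precise: bool = False) -> int:
    """Return the code point index of the first match in `text`, or -1.

    With precise=False (the behavior proposed in this issue), variation
    selectors are ignored on both sides, so matching is done on the base
    characters only. With precise=True, selectors must match exactly.
    """
    if precise:
        return text.find(query)

    # Keep each non-selector character together with its original index,
    # so a match in the stripped text can be mapped back to `text`.
    kept = [(i, ch) for i, ch in enumerate(text) if not is_ivs_selector(ch)]
    stripped_text = "".join(ch for _, ch in kept)
    stripped_query = "".join(ch for ch in query if not is_ivs_selector(ch))

    if not stripped_query:
        return 0  # an empty query matches at the start, like str.find

    pos = stripped_text.find(stripped_query)
    return kept[pos][0] if pos >= 0 else -1


# The example from this issue: "龍" + VS17 + "天"
sample = "\u9F8D\U000E0100\u5929"
print(find_ignoring_ivs(sample, "\u9F8D\u5929"))                # 0
print(find_ignoring_ivs(sample, "\u9F8D\u5929", precise=True))  # -1
```

Matching on a stripped copy while keeping an index map means the original text, including its selectors, is left untouched and can still be highlighted at the right position.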

Some IVS examples:

* https://xfq.github.io/testing/ivs/ivs.xml (UTF-8)
* https://xfq.github.io/testing/ivs/ivs4gb.xml (GB 18030)

Currently, Blink and WebKit support this, but Gecko does not.

Maybe we can add it to https://w3c.github.io/string-search/#orthoVariation ?

Please view or discuss this issue at https://github.com/w3c/string-search/issues/21 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Sunday, 29 October 2023 07:47:59 UTC