[string-search] Include Unihan variants in fuzzy search (#31)

nisbet-hubbard has just created a new issue for https://github.com/w3c/string-search:

== Include Unihan variants in fuzzy search ==
A semantically based fuzzy search mode for ideographs, as described in §2.1.6 (#24), should naturally also cover similar variants that exist as separate code points in Unicode itself because of the now-deprecated source separation rule. These variants are discussed in Unicode Standard Annex #38: https://www.unicode.org/reports/tr38/#N10211.

This can be implemented with the help of Unihan_Variants.txt, which is distributed in Unihan.zip: https://www.unicode.org/Public/17.0.0/ucd/Unihan.zip

The relevant Unihan variant fields are:

- kSemanticVariant
- kSimplifiedVariant
- kTraditionalVariant
- kZVariant
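
As a rough illustration of how these fields could drive a fuzzy match, here is a minimal Python sketch. It assumes the tab-separated `U+XXXX<TAB>field<TAB>values` layout of Unihan_Variants.txt, where values may carry a `<source` suffix. The helper names (`parse_variants`, `build_fold_table`, `fold`, `fuzzy_contains`) are hypothetical, and the `SAMPLE` lines are only illustrative of the format; real data should be read from the file itself. Treating variant links as one symmetric, transitive equivalence class is a simplifying assumption of this sketch, not something the Unihan data guarantees.

```python
import re

# The four fields named in this issue.
FIELDS = {"kSemanticVariant", "kSimplifiedVariant", "kTraditionalVariant", "kZVariant"}

# Illustrative lines in the Unihan_Variants.txt layout (assumption: verify against the real file).
SAMPLE = (
    "# comment lines start with '#'\n"
    "U+842C\tkSimplifiedVariant\tU+4E07\n"
    "U+4E07\tkTraditionalVariant\tU+4E07 U+842C\n"
    "U+8AAA\tkZVariant\tU+8AAC\n"
    "U+8AAA\tkSimplifiedVariant\tU+8BF4\n"
)

def parse_variants(text):
    """Return (source_char, variant_char) pairs for the selected fields."""
    pairs = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) != 3 or parts[1] not in FIELDS:
            continue
        src = chr(int(parts[0][2:], 16))
        # Values look like "U+4E07" or "U+4E18<kMatthews"; keep only the code points.
        for m in re.finditer(r"U\+([0-9A-F]{4,6})", parts[2]):
            pairs.append((src, chr(int(m.group(1), 16))))
    return pairs

def build_fold_table(pairs):
    """Union-find over variant pairs; each class folds to its lowest code point."""
    parent = {}

    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            low, high = sorted((ra, rb))
            parent[high] = low  # keep the lower code point as the class root
    return {c: find(c) for c in list(parent)}

def fold(s, table):
    """Replace every character with the representative of its variant class."""
    return "".join(table.get(ch, ch) for ch in s)

def fuzzy_contains(haystack, needle, table):
    """Substring test after folding both strings."""
    return fold(needle, table) in fold(haystack, table)

TABLE = build_fold_table(parse_variants(SAMPLE))
```

With this table, `fuzzy_contains("一万年", "萬", TABLE)` matches because both U+4E07 and U+842C fold to the same representative. A real implementation would likely want to weight or filter the fields separately rather than merge them into one class.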

Please view or discuss this issue at https://github.com/w3c/string-search/issues/31 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Sunday, 14 September 2025 04:05:14 UTC