Re: [charmod-norm] Normalisation for Case-Insensitive Comparison

I found it on the last page here:

http://www.unicode.org/versions/Unicode11.0.0/ch03.pdf

D145: A string X is a canonical caseless match for a string Y if and only if:
    NFD(toCasefold(NFD(X))) = NFD(toCasefold(NFD(Y)))
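For illustration, here is a minimal Python sketch of the D145 comparison. It assumes that Python's str.casefold() is an acceptable stand-in for toCasefold and that the Unicode data bundled with the interpreter is recent enough. The function names canonical_caseless_key and canonical_caseless_match are hypothetical and do not come from the standard.

```python
import unicodedata


def canonical_caseless_key(s: str) -> str:
    """Return NFD(toCasefold(NFD(s))), using str.casefold() as toCasefold."""
    decomposed = unicodedata.normalize("NFD", s)
    return unicodedata.normalize("NFD", decomposed.casefold())


def canonical_caseless_match(x: str, y: str) -> bool:
    """True if x and y are a canonical caseless match in the sense of D145."""
    return canonical_caseless_key(x) == canonical_caseless_key(y)
```

With this sketch, a precomposed "Å" (U+00C5) should match "a" followed by U+030A COMBINING RING ABOVE, because both reduce to the same decomposed, case-folded sequence.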

The invocations of canonical decomposition (NFD normalization) before case folding in
D145 are to catch very infrequent edge cases. Normalization is not required before case
folding, except for the character U+0345 COMBINING GREEK YPOGEGRAMMENI and any
characters that have it as part of their canonical decomposition, such as U+1FC3
GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI. In practice, optimized versions of canonical
caseless matching can catch these special cases, thereby avoiding an extra normalization
step for each comparison.
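One way the optimization described above might look in Python is sketched below. This is only an assumption about how an implementation could catch the special cases, not something taken from the standard. It builds the set of affected characters once from the interpreter's own Unicode data, then runs the first NFD pass only when a string contains one of them. The names NEEDS_PRE_NFD and fast_caseless_key are hypothetical, and str.casefold() is again assumed to stand in for toCasefold.

```python
import unicodedata

YPOGEGRAMMENI = "\u0345"  # U+0345 COMBINING GREEK YPOGEGRAMMENI


def _build_special_set() -> frozenset:
    """Collect U+0345 and every character whose canonical decomposition contains it."""
    special = {YPOGEGRAMMENI}
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogate code points
        ch = chr(cp)
        decomp = unicodedata.decomposition(ch)
        # An empty string means no decomposition; a leading "<" marks a
        # compatibility mapping, which NFD does not apply.
        if decomp and not decomp.startswith("<"):
            if YPOGEGRAMMENI in unicodedata.normalize("NFD", ch):
                special.add(ch)
    return frozenset(special)


# Built once at import time; the scan covers the whole code space.
NEEDS_PRE_NFD = _build_special_set()


def fast_caseless_key(s: str) -> str:
    """Like NFD(toCasefold(NFD(s))), skipping the inner NFD when it is not needed."""
    if not NEEDS_PRE_NFD.isdisjoint(s):
        s = unicodedata.normalize("NFD", s)
    return unicodedata.normalize("NFD", s.casefold())
```

The design choice here is to pay for a one-time scan in exchange for a cheap per-string membership check. A production implementation would more likely hard-code the small set of affected characters.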

-- 
GitHub Notification of comment by EricSharkey
Please view or discuss this issue at https://github.com/w3c/charmod-norm/issues/172#issuecomment-396238597 using your GitHub account

Received on Monday, 11 June 2018 13:13:25 UTC