NFKC and Sem-Web applications.

Hi.
Just wondering whether a case could be made for use of NFKC rather than NFC
(à la XML) for semantic applications.

NFKC could perhaps lead to computer-identifiable matches more accurately
reflecting cases where a human would see a match, and hence to a better
match between strings and concepts, and thus perhaps to better inferences
from data.
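
To make that concrete, here is a rough sketch using Python's standard
unicodedata module (my choice for illustration only; the helper names
same_under_nfc and same_under_nfkc are made up). It compares two strings a
human would probably read as the same word, one of them using the "fi"
ligature U+FB01:

```python
import unicodedata

def same_under_nfc(a: str, b: str) -> bool:
    # Hypothetical helper: compare after canonical normalisation only.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def same_under_nfkc(a: str, b: str) -> bool:
    # Hypothetical helper: compare after compatibility normalisation.
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)

ligature = "\ufb01le"   # "file" spelled with the single ligature character
plain = "file"          # "file" spelled with four ASCII letters

# NFC leaves the ligature alone, so the strings stay distinct;
# NFKC folds the ligature to "f" + "i", so they compare equal.
nfc_match = same_under_nfc(ligature, plain)
nfkc_match = same_under_nfkc(ligature, plain)

# Canonically equivalent spellings already match under both forms.
composed = "caf\u00e9"      # precomposed e-acute
decomposed = "cafe\u0301"   # "e" followed by combining acute accent
canonical_match = same_under_nfc(composed, decomposed)
```

Here nfc_match is False while nfkc_match and canonical_match are True,
which is the sort of extra match I have in mind.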

Perhaps NFKC would have a place in late normalisation of data that was
early-normalised to NFC. At first glance, re-normalising text known to be in
NFC to NFKC seems a relatively light operation, since a lot of the cases
needed in a full normaliser could be optimised away as unnecessary.
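
A minimal sketch of what such a late step might look like, again in Python
and again only as an illustration. The function name late_nfkc is made up,
and this does not implement any NFC-specific optimisation of its own; it
just leans on unicodedata.is_normalized (Python 3.8 or later) to skip text
that is already in NFKC, and falls back to the full normaliser otherwise:

```python
import unicodedata

def late_nfkc(text: str) -> str:
    # Hypothetical late-normalisation step for text assumed to be in NFC.
    # If the text is already in NFKC, return it untouched.
    if unicodedata.is_normalized("NFKC", text):
        return text
    # Otherwise let the library's full normaliser do the work.
    return unicodedata.normalize("NFKC", text)

# Early normalisation to NFC, as would happen when the data is produced.
raw = "cafe\u0301 \ufb01le"
early = unicodedata.normalize("NFC", raw)

# Late normalisation to NFKC, as would happen just before matching.
late = late_nfkc(early)

# Going via NFC gives the same result as normalising the raw text directly.
direct = unicodedata.normalize("NFKC", raw)
same_result = (late == direct)
```

With this input, late is "caf\u00e9 file" and same_result is True. Whether
a purpose-built NFC-to-NFKC pass would be noticeably cheaper than this is
exactly the part I have not checked.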

Jon Hanna

PGP http://www.spin.ie/jon.asc
PGP Fingerprint 707E 5E39 3BF5 533A D1DD  2083 8169 BFD7 F532 BD18
"...it has been truly said that hackers have even more words for equipment
failures than Yiddish has for obnoxious people." - jargon.txt

Received on Friday, 8 November 2002 08:07:08 UTC