
Re: UTF-8 NFC vs NFD compression, French sample (was: Updated Binary Optimized Header Encoding Draft)

From: Frédéric Kayser <f.kayser@free.fr>
Date: Wed, 21 Nov 2012 15:55:08 +0100 (CET)
To: HTTP Working Group <ietf-http-wg@w3.org>
Cc: Nico Williams <nico@cryptonector.com>, Martin J. Dürst <duerst@it.aoyama.ac.jp>
Message-ID: <1296436146.47352333.1353509708430.JavaMail.root@zimbra71-e12.priv.proxad.net>
Indeed, I have noticed some odd behaviour: for instance, searching for words containing diacritics fails when the text is in NFD, because the search string is usually in NFC when typed (tested in many text editors and web browsers).
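
To illustrate the search mismatch, here is a minimal Python sketch using the standard `unicodedata` module. The helper names `naive_contains` and `normalized_contains` and the sample strings are only assumptions for the example; they are not taken from any of the editors or browsers tested.

```python
import unicodedata

def naive_contains(text: str, query: str) -> bool:
    # Plain code-point comparison, with no normalization on either side.
    return query in text

def normalized_contains(text: str, query: str, form: str = "NFC") -> bool:
    # Bring both strings to the same normalization form before searching.
    return unicodedata.normalize(form, query) in unicodedata.normalize(form, text)

# Stored text in NFD (decomposed), as a file or page might contain it.
text_nfd = unicodedata.normalize("NFD", "Fr\u00e9d\u00e9ric \u00e9crit en fran\u00e7ais")
# Typed search string in NFC (pre-composed), as most input methods produce it.
query_nfc = unicodedata.normalize("NFC", "fran\u00e7ais")

print(naive_contains(text_nfd, query_nfc))       # False
print(normalized_contains(text_nfd, query_nfc))  # True
```

Normalizing both sides to the same form (either NFC or NFD) is what makes the match succeed; the choice of NFC as the default here is arbitrary.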

Regards
-- 
Frédéric Kayser

----- Original message -----
From: "Nico Williams" <nico@cryptonector.com>
To: "Frédéric Kayser" <f.kayser@free.fr>
Cc: "HTTP Working Group" <ietf-http-wg@w3.org>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Sent: Tuesday, 20 November 2012 22:04:56
Subject: Re: UTF-8 NFC vs NFD compression, French sample (was: Updated Binary Optimized Header Encoding Draft)



I don't recommend using NFD. My reason for this is simple: the majority of input modes produce NFC, so too many libraries were coded to deal primarily or only with pre-composed sequences -- throwing NFD at them can cause trouble, so I'd rather avoid it.

(My experience here comes from HFS+'s conversion of file names to NFD on create.) 
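
As a rough illustration of why code written with pre-composed sequences in mind can misbehave on NFD input, the following Python sketch compares the two forms of one short name. The sample name and variable names are assumptions for the example, and the code does not model HFS+ itself (whose decomposition rules differ in some details from standard NFD).

```python
import unicodedata

# "été" in NFC: each accented letter is a single pre-composed code point.
name_nfc = unicodedata.normalize("NFC", "\u00e9t\u00e9")
# The same name in NFD: each "é" becomes "e" followed by U+0301 (combining acute).
name_nfd = unicodedata.normalize("NFD", name_nfc)

# Code-point counts differ, so length checks written for NFC give other results.
print(len(name_nfc), len(name_nfd))                                  # 3 5

# UTF-8 byte counts differ as well.
print(len(name_nfc.encode("utf-8")), len(name_nfd.encode("utf-8")))  # 5 7

# Taking "the first character" by code point silently drops the accent in NFD.
print(name_nfc[:1] == "\u00e9")  # True
print(name_nfd[:1] == "e")       # True: the combining accent was cut off
```

Code that truncates, indexes, or compares by code point therefore needs to normalize its input first, or work on grapheme clusters rather than individual code points.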

Nico 
-- 
Received on Wednesday, 21 November 2012 14:55:44 GMT
