- From: Phillips, Addison <addison@lab126.com>
- Date: Wed, 21 Nov 2012 15:40:02 -0800
- To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, John Cowan <cowan@mercury.ccil.org>
- CC: Anne van Kesteren <annevk@annevk.nl>, "www-international@w3.org" <www-international@w3.org>
> If the above is an accurate reflection of what Unicode says, then it doesn’t
> sound as if it is considered as very safe to let a leading FF FE/FE FF for anything
> but the BOM - not even when using UTF-16LE/UTF-16BE.

The use of U+FEFF as anything other than a Unicode signature is already deprecated. In fact, Unicode created the Word Joiner character (U+2060) to replace the BOM's other "identity" of "zero width no-break space". To wit, Section 16.2 of the Standard says:

--
Zero Width No-Break Space. In addition to its primary meaning of byte order mark (see “Byte Order Mark” in Section 16.8, Specials), the code point U+FEFF possesses the semantics of zero width no-break space, which matches that of word joiner. Until Unicode 3.2, U+FEFF was the only code point with word joining semantics, but because it is more commonly used as byte order mark, the use of U+2060 word joiner to indicate word joining is strongly preferred for any new text. Implementations should continue to support the word joining semantics of U+FEFF for backward compatibility.
--

Addison

Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N WG)

Internationalization is not a feature.
It is an architecture.
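For anyone who wants to see the distinction in code, here is a minimal Python sketch. It is only an illustration under stated assumptions, not anything the Standard prescribes: the function names `sniff_utf16_bom` and `normalize_feff` are made up for this example, and the policy of dropping a leading U+FEFF and rewriting later ones as U+2060 is an application choice that an implementation needing backward compatibility may not want.

```python
BOM = "\ufeff"          # U+FEFF: byte order mark / zero width no-break space
WORD_JOINER = "\u2060"  # U+2060: preferred for word joining in new text


def sniff_utf16_bom(data: bytes):
    """Guess UTF-16 byte order from a leading FF FE or FE FF, else None.

    Hypothetical helper: it deliberately ignores UTF-32, whose
    little-endian signature (FF FE 00 00) also begins with FF FE.
    """
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    return None


def normalize_feff(text: str) -> str:
    """Treat a leading U+FEFF as a signature and later ones as word joiners.

    Assumed policy, for illustration only: the leading U+FEFF is
    dropped, and every remaining U+FEFF is replaced with U+2060.
    """
    if text.startswith(BOM):
        text = text[len(BOM):]
    return text.replace(BOM, WORD_JOINER)
```

One way to use it: call `sniff_utf16_bom(raw)` on the undecoded bytes, decode with the codec it returns, then pass the result through `normalize_feff`. Python's `utf-16-le` and `utf-16-be` codecs leave a leading U+FEFF in the decoded string, which is why the second step is needed in this sketch.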
Received on Wednesday, 21 November 2012 23:40:51 UTC