RE: byte order mark article

From: Phillips, Addison <addison@lab126.com>
Date: Wed, 21 Nov 2012 15:46:19 -0800
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, John Cowan <cowan@mercury.ccil.org>
CC: Anne van Kesteren <annevk@annevk.nl>, "www-international@w3.org" <www-international@w3.org>
Message-ID: <131F80DEA635F044946897AFDA9AC34773A8999818@EX-SEA31-D.ant.amazon.com>
... except that I meant to say "Word Joiner" and not ZWJ. Doh.

> -----Original Message-----
> From: Phillips, Addison
> Sent: Wednesday, November 21, 2012 3:40 PM
> To: Leif Halvard Silli; John Cowan
> Cc: Anne van Kesteren; www-international@w3.org
> Subject: RE: byte order mark article
> 
> > If the above is an accurate reflection of what Unicode says, then it
> > doesn’t sound as if it is considered very safe to let a leading FF FE/FE FF
> > stand for anything but the BOM - not even when using UTF-16LE/UTF-16BE.
> 
> The use of U+FEFF as anything other than a Unicode signature is already
> deprecated. In fact, Unicode created the Zero Width Joiner character to
> replace BOM's other "identity" of "zero width non-breaking space". To wit,
> section 16.2 of the Standard says:
> 
> --
> Zero Width No-Break Space. In addition to its primary meaning of byte order
> mark (see “Byte Order Mark” in Section 16.8, Specials), the code point U+FEFF
> possesses the semantics of zero width no-break space, which matches that of
> word joiner. Until Unicode 3.2, U+FEFF was the only code point with word
> joining semantics, but because it is more commonly used as byte order mark,
> the use of U+2060 word joiner to indicate word
> joining is strongly preferred for any new text. Implementations should continue
> to support the word joining semantics of U+FEFF for backward compatibility.
> --
> 
> Addison
> 
> Addison Phillips
> Globalization Architect (Lab126)
> Chair (W3C I18N WG)
> 
> Internationalization is not a feature.
> It is an architecture.
> 
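
For anyone who wants to see the distinction in practice, here is a rough Python sketch of the guidance quoted above. It treats a leading U+FEFF as a signature and rewrites any later U+FEFF as U+2060 WORD JOINER. The function name normalize_feff is my own invention for illustration, and whether to rewrite existing text at all is an assumption on my part; the Standard only says that U+2060 is preferred for new text and that the old semantics should still be supported.

```python
BOM = "\ufeff"          # U+FEFF: byte order mark / zero width no-break space
WORD_JOINER = "\u2060"  # U+2060: word joiner


def normalize_feff(text: str) -> str:
    """Hypothetical helper: drop one leading U+FEFF (read as a signature)
    and map every remaining U+FEFF to U+2060 WORD JOINER."""
    if text.startswith(BOM):
        text = text[len(BOM):]
    return text.replace(BOM, WORD_JOINER)


# The same bytes decode differently depending on the encoding label:
data = b"\xff\xfea\x00"

# "utf-16" reads the leading FF FE as a byte order mark and consumes it.
with_signature = data.decode("utf-16")

# "utf-16-le" has a fixed byte order, so FF FE is kept as the character U+FEFF.
without_signature = data.decode("utf-16-le")

print(repr(with_signature))
print(repr(without_signature))
print(repr(normalize_feff("\ufeffab\ufeffcd")))
```

The two decode calls show why a leading FF FE is risky as content: with the plain "utf-16" label it disappears as a signature, while with "utf-16-le" it survives as a character.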

Received on Wednesday, 21 November 2012 23:46:52 GMT