- From: Michael A. Puls II <shadow2531@gmail.com>
- Date: Sat, 23 Jun 2007 20:41:05 -0400
> On Sat, 11 Mar 2006, Henri Sivonen wrote:
> > The encoding labels with LE or BE in them mean BOMless variants where
> > the encoding label on the transfer protocol level gives the endianness.
> > See http://www.ietf.org/rfc/rfc2781.txt When the spec refers to UTF-16
> > with BOM in a particular endianness, I think the spec should use
> > "big-endian UTF-16" and "little-endian UTF-16".
> >
> > Since declaring endianness on the transfer protocol level has no benefit
> > over using the BOM when the label is right and there's a chance to get
> > the label wrong, the encoding labels with explicit endianness are
> > harmful for interchange. In my opinion, the spec should avoid giving
> > authors any bad ideas by reinforcing these labels by repetition.

FWIW, after reading the labeling part of the RFC again and adding your
suggestion, I came up with this:

big-endian UTF-16    = The big-endian encoding of UTF-16 with the BOM FEFF
little-endian UTF-16 = The little-endian encoding of UTF-16 with the BOM FFFE
UTF-16BE             = The big-endian encoding of UTF-16 without the BOM
UTF-16LE             = The little-endian encoding of UTF-16 without the BOM
UTF-16               = big-endian UTF-16 or little-endian UTF-16 or
                       fallback to UTF-16BE

-- 
Michael
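
For illustration only, the label table above could be sketched as a small decoder. This is a rough sketch, not anything from the spec or the RFC: the function name decode_utf16 and the constants are hypothetical, "BOM FEFF" and "BOM FFFE" are read here as the byte sequences FE FF and FF FE, and leaving a leading U+FEFF in place for the UTF-16BE and UTF-16LE labels is an assumption based on "without the BOM".

```python
# Byte sequences a BOM (U+FEFF) produces in each byte order.
BOM_BE = b"\xfe\xff"  # assumed meaning of "BOM FEFF" in the table
BOM_LE = b"\xff\xfe"  # assumed meaning of "BOM FFFE" in the table


def decode_utf16(data: bytes, label: str) -> str:
    """Decode `data` according to a transfer-protocol encoding label.

    Hypothetical helper that follows the table in the message:
    - UTF-16BE / UTF-16LE: endianness comes from the label, no BOM is expected,
      so a leading U+FEFF is kept as an ordinary character.
    - UTF-16: a BOM selects the byte order and is dropped; with no BOM,
      fall back to big-endian.
    """
    normalized = label.strip().upper()

    if normalized == "UTF-16BE":
        return data.decode("utf-16-be")
    if normalized == "UTF-16LE":
        return data.decode("utf-16-le")
    if normalized == "UTF-16":
        if data.startswith(BOM_BE):
            return data[len(BOM_BE):].decode("utf-16-be")
        if data.startswith(BOM_LE):
            return data[len(BOM_LE):].decode("utf-16-le")
        # No BOM present: fall back to UTF-16BE, as in the last table row.
        return data.decode("utf-16-be")

    raise ValueError(f"not a UTF-16 label: {label!r}")
```

The byte order is handled by hand rather than with Python's generic "utf-16" codec, because that codec falls back to the platform's native byte order when no BOM is present, which is not the fallback described in the table.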
Received on Saturday, 23 June 2007 17:41:05 UTC