RE: UTF-16, UTF-16BE and UTF-16LE in HTML5

> From: Henri Sivonen [mailto:hsivonen@iki.fi]
> Sent: 27 July 2010 11:33
...
> > [2] i18n folks have long advised that you should always include a
> > visible indication of the encoding in a document, HTML or XML, even
> > if you don't strictly need to, because it can be very useful for
> > developers, testers, or translation production managers who want to
> > visually check the encoding of a document.
> 
> That's a bad rationale. It's a *very* bad idea to check the encoding by
> reading a string that doesn't participate in encoding detection at all, since the
> string may be wrong.

Well, any encoding declaration may be wrong: participating in encoding detection doesn't guarantee that the document's actual encoding matches what the declaration says. So I don't think it makes much difference. On the other hand, since actually getting your document into a UTF-16 encoding is a little more complicated than using other encodings, the declaration may more often be right, in which case it is extremely useful for people who visually inspect the document, given that they can't see the BOM and may otherwise assume that the encoding is not UTF-16.
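
To make the point concrete, here is a rough Python sketch of the two things being compared: the invisible BOM and the visible declaration. The helper names (sniff_bom, declared_charset) are my own and purely illustrative, the regex is a crude stand-in for a real prescan, and this is not the HTML5 sniffing algorithm. Only the BOM byte values come from Python's standard codecs module.

```python
import codecs
import re

# BOM byte sequences from the standard library, paired with a label.
# (UTF-32 is ignored here; this is a simplification.)
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def declared_charset(data: bytes):
    """Return the charset named in a visible declaration, or None.

    NUL bytes are stripped first so that an ASCII-only declaration stored
    as UTF-16 can still be found by a byte-level search. This is a crude
    assumption for illustration only.
    """
    head = data[:1024].replace(b"\x00", b"")
    match = re.search(rb'charset\s*=\s*["\']?([A-Za-z0-9_-]+)', head)
    return match.group(1).decode("ascii").lower() if match else None


# A real UTF-16LE document: the BOM and the visible label agree.
utf16_doc = codecs.BOM_UTF16_LE + '<meta charset="utf-16">'.encode("utf-16-le")

# An ASCII document whose visible label claims UTF-16: the label is wrong.
mislabelled_doc = b'<meta charset="utf-16">'

for doc in (utf16_doc, mislabelled_doc):
    print(sniff_bom(doc), declared_charset(doc))
```

In the first case someone reading the source sees "utf-16" and is right, even though the BOM itself is invisible to them. In the second the declaration is simply wrong, which is the risk with any declaration.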

RI

Received on Tuesday, 27 July 2010 12:14:07 UTC