[Bug 10890] i18n comment : Allow utf-16 meta encoding declarations

From: <bugzilla@jessica.w3.org>
Date: Thu, 07 Oct 2010 18:22:07 +0000
To: public-i18n-core@w3.org
Message-Id: <E1P3v6V-0003Qi-8A@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10890

--- Comment #4 from I18n Core WG <public-i18n-core@w3.org> 2010-10-07 18:22:06 UTC ---
(In reply to comment #3)
> My point is that you actually can't inspect this visually. If you open a file
> in a text editor and you see <meta charset="utf-16"> how do you know whether:
>  1) The file had a UTF-16 BOM and was encoded in UTF-16 and the meta has no
> effect.
> OR
>  2) The file didn't have an UTF-16 BOM and was encoded in an ASCII-superset
> encoding and the meta would make a UA treat the file as being UTF-8-encoded.

You can't be sure, no. But then you can't be sure that any encoding
declaration is correct. That's not a reason to disallow it, since doing so
makes life more difficult for people who do the right thing.
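
To make the two cases from comment #3 concrete, here is a rough Python
sketch. It is only an illustration, not the HTML5 encoding-sniffing algorithm:
the function name effective_encoding, the 1024-byte scan window and the
regular expression are all assumptions made for this example. It checks for a
byte order mark first and only then looks at the meta declaration, treating a
declared UTF-16 label without a BOM as UTF-8, as described in the comment
quoted above.

```python
import re
from typing import Optional

# Byte order marks, checked before any <meta> declaration is considered.
BOMS = (
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16le"),
    (b"\xfe\xff", "utf-16be"),
)

# Simplified pattern (an assumption for this sketch): finds a charset label
# in either <meta charset="..."> or the http-equiv form.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_-]+)', re.IGNORECASE
)


def effective_encoding(data: bytes) -> Optional[str]:
    """Return the encoding this sketch would pick for the given bytes."""
    # Case 1: a BOM is present, so the meta declaration has no effect.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name

    # Case 2: no BOM, so fall back to the declaration near the start.
    match = META_CHARSET.search(data[:1024])
    if match is None:
        return None

    label = match.group(1).decode("ascii").lower()
    # The meta element could only be read because the bytes are in an
    # ASCII-superset encoding, so a UTF-16 label cannot be right here.
    if label.startswith("utf-16"):
        return "utf-8"
    return label
```

Feeding it the same markup with and without a BOM gives different answers,
even though both files show the same meta element in a text editor. That is
the ambiguity under discussion.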

> I think we should try to change things so that people will use UTF-8 and not
> continue to use UTF-16.

I agree, but we can't preclude it because it's not forbidden by the spec.

> This will not be a problem if authors always use UTF-8 and, as result, don't
> use UTF-16.

I agree, but some people will still use UTF-16. Having said that, we're talking
about less than 0.01% of web pages here according to a recent Google survey of
6.5 billion pages (against over 50% using UTF-8, and almost 70% using either
UTF-8 or ASCII). My guess is that those few people using UTF-16 are technically
aware enough to pay attention to things like this. I don't buy that incorrect
labelling is such a serious problem.  

On the other hand, if we don't allow charset=utf-16, then every tutorial, every
primer, every book, every checker, etc., has to make a detour to explain how
UTF-16 differs from every other encoding when telling people how to use
encoding-related markup. That is annoying for both the writer and the reader,
given that we don't want people to use it anyway.

Received on Thursday, 7 October 2010 18:22:09 GMT