- From: Glenn Adams <glenn@skynav.com>
- Date: Mon, 5 Dec 2011 11:00:24 -0700
- To: Glenn Maynard <glenn@zewt.org>
- Cc: Bjoern Hoehrmann <derhoermi@gmx.net>, WebApps WG <public-webapps@w3.org>
- Message-ID: <CACQ=j+fthcE-8cyA56V--A5Gd_kSzJAJ0u+QXvupyHXkdGaqRw@mail.gmail.com>
On Mon, Dec 5, 2011 at 9:32 AM, Glenn Maynard <glenn@zewt.org> wrote:

> On Mon, Dec 5, 2011 at 11:12 AM, Glenn Adams <glenn@skynav.com> wrote:
>
>> In the example you give, there is consistency between the content
>> metadata (charset param) and the content itself (as determined by
>> sniffing). So why would both the metadata and content be ignored?
>
> Because in the real world, UTF-32 isn't a transfer encoding. Browsers
> shouldn't have to waste time supporting it, and if someone accidentally
> creates content in that encoding somehow, it should be immediately clear
> that something is wrong.
>
> It would take a major disconnect from reality to insist that browsers
> support UTF-32.
>
>> In any case, what is suggested below would be a direct violation of [2]
>> as well.
>>
>> [2] http://www.w3.org/TR/charmod/#C030
>
> No, it wouldn't. That doesn't say that UTF-32 must be recognized.

You misread me. I am not saying or supporting that UTF-32 must be
recognized. I am saying that MIS-recognizing UTF-32 as UTF-16 violates
[2]. If a browser doesn't support UTF-32 as an incoming interchange
format, then it should treat it as it would any other character encoding
it does not recognize. It must not pretend it is another encoding.

Note that it would be acceptable to transcode an incoming UTF-32
serialized resource into any other form that is convenient for the
browser implementation. For example, it could transcode it into UTF-16,
UTF-8, EBCDIC, whatever. That is an implementation detail unrelated to
interpreting the incoming content.

G.
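[Editorial note: the mis-recognition Adams describes is easy to reproduce. The sketch below (not part of the original message, and not any browser's actual algorithm; function names and detection order are illustrative assumptions) shows why signature sniffing that tests UTF-16 before UTF-32 gets this wrong: the UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).]

```python
# Illustrative sketch: the UTF-32 LE BOM (FF FE 00 00) begins with the
# UTF-16 LE BOM (FF FE), so a sniffer that tests UTF-16 signatures
# first can never recognize a UTF-32 LE document.

def sniff_naive(data: bytes) -> str:
    """Tests UTF-16 signatures first -- wrong order."""
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"   # also matches every UTF-32 LE document
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    return "unknown"

def sniff_careful(data: bytes) -> str:
    """Tests the longer UTF-32 signatures before the UTF-16 ones."""
    if data.startswith(b"\xff\xfe\x00\x00"):
        return "utf-32-le"   # recognized; may still be refused as unsupported
    if data.startswith(b"\x00\x00\xfe\xff"):
        return "utf-32-be"
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    return "unknown"

# UTF-32 LE BOM followed by "hi" in UTF-32 LE code units.
doc = b"\xff\xfe\x00\x00" + "hi".encode("utf-32-le")

print(sniff_naive(doc))    # utf-16-le  -- the mis-recognition at issue
print(sniff_careful(doc))  # utf-32-le  -- recognized, even if unsupported

# Decoding the mislabeled bytes as UTF-16 LE silently yields
# NUL-interleaved garbage instead of an error:
print(repr(doc.decode("utf-16-le")))  # '\ufeff\x00h\x00i\x00'
```

Testing the longer signatures first is the standard fix; whether the browser then supports UTF-32 or rejects it as unrecognized is a separate policy question, which is exactly the distinction Adams draws above.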
Received on Monday, 5 December 2011 18:01:15 UTC