- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Thu, 30 Sep 2010 14:06:30 +0200
- To: Noah Mendelsohn <noah@arcanedomain.com>
- CC: "www-tag@w3.org" <www-tag@w3.org>
On 30.09.2010 09:56, Julian Reschke wrote:
> On 29.09.2010 23:29, Noah Mendelsohn wrote:
>> I notice that there is active discussion in two HTML5-related Bugzilla
>> entries [1,2] of details related to charset detection. I'm not up on the
>> details, but at least the title of [2] suggests that charset sniffing is
>> involved (to my untrained eye, most of the debate seems to be about
>> parsing of charset parameters). Anyway, given the TAG's ongoing interest
>> in adherence to HTTP specifications in general, and sniffing in
>> particular, I thought I'd point these out.
>>
>> Noah
>>
>> [1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=9628
>> [2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=10804
>
> ...and
>
> http://www.w3.org/Bugs/Public/show_bug.cgi?id=10805
>
> The background is that HTML5 specifies an algorithm for extracting the
> charset from content type information, which (1) requires accepting
> invalid forms (single quotes), and (2) requires not to properly handle
> escapes in quoted strings.
>
> The spec claims it's needed for legacy content, but for both cases there
> are examples of UAs that do not implement this today; so that claim is
> really really weak.

Bugs 10804 and 10805 have been rejected, so I have raised issues
http://www.w3.org/html/wg/tracker/issues/125 and
http://www.w3.org/html/wg/tracker/issues/126.

Bug 9628 (which asks for clarification what the incompatibility with RFC
2616 is) *has* been fixed, which is a good thing. I wish the spec did the
same for all other "willful violations".

Best regards, Julian
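
To make the two points about charset extraction concrete, here is a rough Python sketch contrasting a parser that follows the RFC 2616 token / quoted-string grammar with a lenient one that treats single quotes as delimiters and ignores backslash escapes. Both function names (charset_strict, charset_lenient) are hypothetical, and the lenient function is only a simplified approximation of the behaviour described in this message, not the HTML5 algorithm verbatim.

```python
import re

# RFC 2616 token characters; note that "'" is an ordinary token character,
# not a quoting delimiter.
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# RFC 2616 quoted-string: double quotes, with backslash escapes (quoted-pair).
_QUOTED = r'"((?:[^"\\]|\\.)*)"'
_PARAM = re.compile(r';\s*charset=(?:' + _QUOTED + r'|(' + _TOKEN + r'))',
                    re.IGNORECASE)


def charset_strict(header):
    """Hypothetical helper: read charset using RFC 2616 grammar."""
    m = _PARAM.search(header)
    if m is None:
        return None
    quoted, token = m.group(1), m.group(2)
    if quoted is not None:
        # Undo quoted-pair escapes: backslash + char becomes char.
        return re.sub(r'\\(.)', r'\1', quoted)
    return token


def charset_lenient(header):
    """Hypothetical helper: simplified lenient extraction (assumption)."""
    m = re.search(r'charset\s*=\s*', header, re.IGNORECASE)
    if m is None:
        return None
    rest = header[m.end():]
    if rest[:1] in ('"', "'"):
        # Either quote character delimits the value; escapes are not processed.
        end = rest.find(rest[0], 1)
        return rest[1:end] if end != -1 else None
    m2 = re.match(r'[^\s;]+', rest)
    return m2.group(0) if m2 else None


if __name__ == "__main__":
    for header in ("text/html; charset='utf-8'",
                   'text/html; charset="utf\\-8"'):
        print(header, charset_strict(header), charset_lenient(header))
```

With the single-quoted header, the strict reader keeps the quotes as part of the token while the lenient one strips them; with the escaped header, the strict reader removes the backslash while the lenient one leaves it in place.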
Received on Thursday, 30 September 2010 12:13:49 UTC