- From: Robert Burns <rob@robburns.com>
- Date: Tue, 10 Jul 2007 04:51:32 -0500
- To: Andrew Sidwell <takkaria@gmail.com>
- Cc: HTML Working Group <public-html@w3.org>
On Jul 9, 2007, at 6:40 AM, Andrew Sidwell wrote:

> Robert Burns wrote:
>> Charset attribute
>>
>> Suggest adding charset attribute to root element rather than adding a
>> charset name to the <meta> element. This will be easier for authors to
>> use. It will also be easier for UAs to pre-parse. (Rob Burns)
>
> The rationale for the <meta charset=""> attribute combination is that
> UAs already implement it, because people tend to leave things unquoted,
> like:
>
> <META HTTP-EQUIV=Content-Type CONTENT=text/html; charset=ISO-8859-1>
>
> Thus, adding an element to the root element would add yet another place
> UAs have to check for charset data.

Thanks for that information. I had suspected that might be part of the motivation. My suggestion is not meant to overturn that practice. Certainly UAs should continue to use BOMs and:

<meta http-equiv="content-type" content="text/html; charset=utf-8" >

or even:

<meta; charset="utf-8" >

if that's what they already do.

My suggestion arose from the concern that the meta element with the charset attribute should be the first element in the head. I'm curious: is that how many of the current UAs work? In other words, do current UAs stop at the first meta when searching for encoding hints? If that's the case, it's not something I've heard before.

In any event, my suggestion arose for several reasons. First, text encoding is still a nagging problem after all of these years. Second, until it is handled exclusively through BOMs (or some other special character, if ever), it's going to require extra attention in educating authors about something particularly esoteric that many do not understand. It's perhaps one of the few places in the document where you can change semantics to an incompatible state: i.e., by setting the charset to something that doesn't reflect the encoding of the document. This is very different from the places where one might incorrectly set the hinting for an hreflang or the like.
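
To make the mismatch concrete, here is a minimal sketch in Python of the kind of lookup discussed above: check for a BOM first, then take the first charset declaration found in a meta element, quoted or not. It is not how any particular UA works. The names sniff_encoding and decode_page, the 1024-byte scan limit, and the UTF-8 fallback are all assumptions made for illustration.

```python
import codecs
import re
from typing import Optional

# BOM signatures checked before anything else (UTF-32 is ignored in this sketch).
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Finds charset=... anywhere inside a <meta ...> tag, quoted or unquoted.
META_CHARSET = re.compile(
    rb"<meta[^>]*?charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)",
    re.IGNORECASE,
)


def sniff_encoding(raw: bytes, scan_limit: int = 1024) -> Optional[str]:
    """Return the encoding hinted at by a BOM or the first meta declaration.

    scan_limit is an arbitrary assumption about how far a pre-parser looks.
    """
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name
    match = META_CHARSET.search(raw[:scan_limit])
    if match:
        return match.group(1).decode("ascii").lower()
    return None


def decode_page(raw: bytes, fallback: str = "utf-8") -> str:
    """Decode using whatever the document declares, right or wrong."""
    return raw.decode(sniff_encoding(raw) or fallback)


# A page saved as UTF-8 but declaring ISO-8859-1: the declaration wins,
# and the accented character comes out as two wrong characters.
page = '<meta charset="iso-8859-1"><p>caf\u00e9</p>'.encode("utf-8")
garbled = decode_page(page)
```

Because the sketch trusts the declaration, a wrong value silently changes the text rather than producing an error, which is the incompatible state described above.
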
It's also different from the content type, in that the MIME type is not as integral to the actual bits of the document as the charset is.

Third, for a value that needs to appear early in the document and that can be expressed as an attribute value, it makes a lot of sense to put it in an attribute on the root element. Pre-parsers will be able to find the value more easily, and documents will not face the risk of the meta element ending up further down in the head. There will also be less author error from placing the meta element in the wrong position.

This is therefore a suggestion for a long-term authoring conformance criterion. Obviously it only applies to the text/html serialization. If that serialization is not expected to last in the long term, then I think it's probably not worth promoting a solution like this.

Take care,
Rob
Received on Tuesday, 10 July 2007 09:51:43 UTC