Re: review of "The root element" subsection from Robert Burns on 2007-07-10 (public-html@w3.org from July 2007)

From: Robert Burns <rob@robburns.com>
Date: Tue, 10 Jul 2007 05:43:44 -0500
To: Simon Pieters <simonp@opera.com>
Cc: "Andrew Sidwell" <takkaria@gmail.com>, "HTML Working Group" <public-html@w3.org>
Message-Id: <2B346758-7128-4486-A3E7-F85C2160333E@robburns.com>

On Jul 10, 2007, at 5:13 AM, Simon Pieters wrote:

>
> On Tue, 10 Jul 2007 11:51:32 +0200, Robert Burns <rob@robburns.com>  
> wrote:
>
>> My suggestion arose from the concern that the meta element with  
>> the charset attribute should be the first element in the head. I'm  
>> curious: is that how many of the current UAs work? In other words,  
>> do current UAs stop at the first meta when searching for encoding  
>> hints?
>
> No. It is not a requirement for UAs. The requirements for UAs are:
>
>    http://www.whatwg.org/specs/web-apps/current-work/#determining0

I wasn't asking about the UA requirements; I was asking whether there  
is any research on the current behavior (the behavior we're trying to  
be backwards compatible with).

>> [...]
>>
>> Second, since the encoding is a value that needs to appear early  
>> in the document and one that can be expressed as an attribute  
>> value, it makes a lot of sense to include it as an attribute on  
>> the root element.
>
> Perhaps, but it isn't compatible with existing UAs.

Do we already have some tests on this?

>> Pre-parsers will be able to find the value more easily
>
> Not really. They still have to look for encoding information in  
> meta elements too. Adding more places they have to look doesn't  
> make it simpler.

Well, adding:

A sequence of bytes starting with: 0x3C, 0x68 or 0x48, 0x54 or 0x74,  
0x4D or 0x6D, 0x4C or 0x6C, and finally one of 0x09, 0x0A, 0x0B,  
0x0C, 0x0D, 0x20 (case-insensitive ASCII '<html' followed by a  
whitespace character)

doesn't seem to be that much of a hardship, either in the pseudo-code  
part of the spec or in the methods already in a UA pre-parser.
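
As a rough illustration of how small that extra check is, here is a minimal Python sketch of the byte test described above. The function name find_html_start_tag and the constant WHITESPACE are made up for this example; it is not the spec's algorithm or any UA's actual pre-parser, and it deliberately ignores comments and everything else a real prescan has to skip.

```python
# Hypothetical sketch: locate '<html' (ASCII case-insensitive) followed by
# one of the whitespace bytes 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20.
WHITESPACE = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20}


def find_html_start_tag(data: bytes) -> int:
    """Return the index just past '<html' plus one whitespace byte, or -1."""
    lowered = data.lower()  # bytes.lower() only changes ASCII letters
    pos = lowered.find(b"<html")
    while pos != -1:
        end = pos + len(b"<html")
        if end < len(data) and data[end] in WHITESPACE:
            return end + 1  # attribute scanning would start here
        pos = lowered.find(b"<html", pos + 1)
    return -1
```

A pre-parser that already looks for '<meta' could start its existing attribute scan at the returned index; whether that is acceptable to existing UAs is exactly the open question in this thread.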

>> and documents will not face the risk of the meta element being  
>> further down in the head.
>
> How does requiring an attribute on the root instead of on a meta  
> element that is first child of head reduce the risk of the encoding  
> information being in the wrong place?

It's just simpler to deal with attributes. When a separate element  
doesn't add anything to the expressiveness of the language, it simply  
adds complexity and room for authoring error.

>> Also there will be less author error in placing the meta element  
>> in the incorrect order.
>
> How can you tell?

If you can tell me how you can tell that there won't be less error,  
then I'd have an easier time responding to the question. Is there  
some reason for the straight nay-saying?

>> This is therefore a suggestion for a long-term authoring  
>> conformance criterion. Obviously it only applies to the text/html  
>> serialization. If that's not expected to last in the long term,  
>> then I think it's probably not worth promoting a solution like  
>> this.
>
> The benefits seem weak to me compared to the drawbacks (not  
> compatible with existing UAs, complicates implementation).

It hardly complicates the implementations. I would agree that the XML  
serialization has already solved this in a much more elegant manner.  
And since this is a long-term solution, the text/html serialization  
might disappear before anyone had the chance to take advantage of  
this new feature. Its only strength would come from text/html  
sticking around for some length of time.

Take care,
Rob

Received on Tuesday, 10 July 2007 10:44:08 UTC