Re: xml:lang in the light of error frequencies

On Tue, 12 Aug 2008 12:12:39 +0200, Ian Hickson <ian@hixie.ch> wrote:

> On Sun, 16 Mar 2008, Henri Sivonen wrote:
>>
>> This is the email pointing out xml:lang.
>>
>> From the start of February to the middle of March, 15% of unique URLs
>> checked as (X)HTML5 by Validator.nu had an erroneous xml:lang attribute
>> in text/html. (Note that since the divisor contains the XHTML5 pages as
>> well, the percentage for HTML5 must be even higher.)
>
> As I see it we have four options:
>
>  * Ignore xml:lang in text/html, making it non-conforming to warn that it
>    is being ignored.
>
>  * Ignore xml:lang in text/html, but if it is present and has the same
>    value as lang="", allow it to be present.

I wonder which is more harmful: authors wasting time removing xml:lang
from templates in order to validate, or authors wasting time adding
xml:lang because it's allowed. Personally, I think we shouldn't
inconvenience authors who validate over harmless stuff, and hence I prefer
the second option above.
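
To make the second option concrete, here is a rough sketch of what a
conformance checker might do for one element in text/html. It is only an
illustration, not how Validator.nu works: the function name, the message
texts, and the case-insensitive comparison of the two values are all my
assumptions.

```python
def check_xml_lang(attrs):
    """Return a list of messages for one text/html element.

    attrs maps attribute names, as written in the source, to their values.
    (Hypothetical helper, for illustration only.)
    """
    if "xml:lang" not in attrs:
        return []
    if "lang" not in attrs:
        # xml:lang on its own has no effect in text/html.
        return ["xml:lang is ignored in text/html; use lang instead"]
    # Assumption: the two values are compared case-insensitively.
    if attrs["xml:lang"].lower() != attrs["lang"].lower():
        return ["xml:lang and lang must have the same value"]
    # Present and matching lang: allowed under the second option.
    return []
```

With this, a template that carries both lang="en" and xml:lang="en"
validates as-is, while xml:lang alone or a mismatched pair is still
flagged.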


>  * Have the parser perform namespace magic on it. This would be the first
>    time a non-foreign-content attribute had namespace magic performed, and
>    it would mean that getAttribute('xml:lang') and setAttribute('xml:lang')
>    would not work as most authors would expect.

I wouldn't be surprised if this would break pages.


>  * Have the language processing add a third way to specify the language,
>    the second way specific to HTML elements -- {}lang on HTML elements,
>    {xml}lang on all elements, and now introducing {}xml:lang on HTML
>    elements.
>
> The list above is ordered from my least disliked option first to my most
> disliked option last.
>
> Thus I propose not changing this behaviour, despite the frequency of the
> error.
>
> We could, if people really want to continue the ridiculous practice of
> writing polyglot documents, allow lang="" in HTML documents, thus

(You mean XHTML documents?)

> providing a conforming way to set the language that is allowed in both
> forms. But I'm not a big fan of that either, since we'd also have to add a
> requirement that it match xml:lang="" if both were present.

Doesn't seem like a huge burden for you and Henri. :-) I think this would
be a good idea too, FWIW.

-- 
Simon Pieters
Opera Software

Received on Tuesday, 12 August 2008 11:20:52 UTC