Re: xml:lang in the light of error frequencies from Ian Hickson on 2008-08-12 (public-html@w3.org from August 2008)

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 12 Aug 2008 10:12:39 +0000 (UTC)
To: Henri Sivonen <hsivonen@iki.fi>
Cc: HTML WG <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0808121005020.5140@hixie.dreamhostps.com>

On Sun, 16 Mar 2008, Henri Sivonen wrote:
>
> This is the email pointing out xml:lang.
> 
> From the start of February to the middle of March, 15% of unique URLs 
> checked as (X)HTML5 by Validator.nu had an erroneous xml:lang attribute 
> in text/html. (Note that since the divisor contains the XHTML5 pages as 
> well, the percentage for HTML5 must be even higher.)

As I see it we have four options:

 * Ignore xml:lang in text/html, making it non-conforming to warn that it 
   is being ignored.

 * Ignore xml:lang in text/html, but if it is present and has the same 
   value as lang="", allow it to be present.

 * Have the parser perform namespace magic on it. This would be the first 
   time a non-foreign-content attribute had namespace magic performed, and 
   it would mean that getAttribute('xml:lang') and setAttribute('xml:lang')
   would not work as most authors would expect.

 * Have the language processing add a fifth way to set the language, the 
   third way specific to elements -- {}lang on HTML elements, {xml}lang on 
   all elements, and now introducing {}xml:lang on HTML elements.

The list above is ordered from my least disliked option first to my most 
disliked option last.

Thus I propose not changing this behaviour, despite the frequency of the 
error.

We could, if people really want to continue the ridiculous practice of 
writing polyglot documents, allow lang="" in XML documents, thus 
providing a conforming way to set the language that is allowed in both 
forms. But I'm not a big fan of that either, since we'd also have to add a 
requirement that it match xml:lang="" if both were present.
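
To make the difference between the first two options concrete, here is a 
rough sketch of the check a conformance checker might run on the 
attributes of one element in text/html. It is not taken from Validator.nu 
or from the spec; the function name, the allow_matching flag, and the 
error strings are all made up for illustration, and comparing the two 
values case-insensitively is an assumption on my part.

```python
def xml_lang_errors(attrs, allow_matching=False):
    """Return error messages for one text/html element's attributes.

    attrs is a plain dict of attribute name -> value (hypothetical input
    shape). allow_matching=False models the first option (xml:lang is
    always non-conforming); allow_matching=True models the second option
    (xml:lang is tolerated only when it has the same value as lang).
    """
    if "xml:lang" not in attrs:
        return []
    if not allow_matching:
        return ["xml:lang is ignored in text/html"]
    lang = attrs.get("lang")
    if lang is None:
        return ["xml:lang is present without a lang attribute"]
    # Assumption: values are compared case-insensitively.
    if lang.lower() != attrs["xml:lang"].lower():
        return ["xml:lang does not match lang"]
    return []
```

With allow_matching=True, {"lang": "en", "xml:lang": "en"} produces no 
errors while {"lang": "en", "xml:lang": "fr"} produces one; with the 
default, any xml:lang at all is reported.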

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 12 August 2008 10:13:15 UTC