Re: precedence of xml:lang and lang?

On Fri, 2010-02-26 at 08:43 +0100, Ivan Herman wrote:
> I tried to look at the (X)HTML5 document, I did find a reference to
> xml:lang in 7.03[3], but I did not find any reference to the question
> of relative precedence. I must admit I am not very familiar with the
> HTML5 document structure, so I may have missed it. 

The relevant section of the latest HTML5 working draft (25/08/09) is
3.2.3.3.

In DOM terms, three attributes are relevant in HTML5. (I'm excluding
the Content-Language HTTP header and its <meta http-equiv> equivalent,
which, as I understand it, are still being debated.) Written in Clark
notation, they are:

1. {http://www.w3.org/XML/1998/namespace}lang
2. {}lang
3. {}xml:lang

Note that #1 and #3 are each the result of parsing an attribute called
'xml:lang'. Parsing under XML rules yields #1, and under HTML rules
yields #3.
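
As a rough illustration of the difference, the sketch below feeds the
same markup to Python's standard XML and HTML parsers. Neither is an
HTML5-conformant parser, so treat this purely as a demonstration of
namespace-aware versus namespace-unaware attribute naming; MARKUP and
AttrCollector are names made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

MARKUP = '<p xml:lang="en" lang="fr">text</p>'

# XML rules: the xml: prefix is bound to the XML namespace, so the
# attribute name comes back in Clark notation (#1).
xml_attrs = dict(ET.fromstring(MARKUP).attrib)


class AttrCollector(HTMLParser):
    """Records the attributes of the first start tag seen."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if self.attrs is None:
            self.attrs = dict(attrs)


# HTML-style parsing: no namespace processing, so the name is kept as
# the literal string 'xml:lang' in no namespace (#3).
collector = AttrCollector()
collector.feed(MARKUP)
collector.close()
html_attrs = collector.attrs

print(xml_attrs)
print(html_attrs)
```

ElementTree writes no-namespace names without the leading '{}', so #2
shows up there as plain 'lang'.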

In terms of declaring the language of an element, #1 takes precedence
over #2 (just as it does in XHTML 1). #3 is ignored.

However, for HTML documents (i.e. those sent as text/html), no
attribute will ever be parsed as #1. (I believe #1 attributes can still
be created via client-side scripts.) The precedence rules are the same
in HTML and XHTML, but because HTML parsing never generates #1
attributes and generates #3 instead, 'xml:lang' is in effect always
ignored in HTML.
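
The precedence rule can be sketched as a small lookup. This is a
minimal sketch, not the spec's algorithm: it only looks at the element
itself (no inheritance from ancestors, no Content-Language fallback),
and declared_lang and the dictionaries are hypothetical names. Keys use
the Clark notation from the list above, with '{}' for no namespace.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"


def declared_lang(attrs):
    """Return the language declared on the element itself, or None.

    attrs maps Clark-notation attribute names to values.
    """
    xml_lang = "{%s}lang" % XML_NS  # #1
    plain_lang = "{}lang"           # #2
    if xml_lang in attrs:
        return attrs[xml_lang]
    if plain_lang in attrs:
        return attrs[plain_lang]
    # #3 ('{}xml:lang') is never consulted.
    return None


# The same source markup, <p xml:lang="en" lang="fr">, as each parser
# would represent it.
parsed_as_xml = {"{%s}lang" % XML_NS: "en", "{}lang": "fr"}
parsed_as_html = {"{}xml:lang": "en", "{}lang": "fr"}

print(declared_lang(parsed_as_xml))
print(declared_lang(parsed_as_html))
```

With these inputs the XML representation resolves to the 'xml:lang'
value and the HTML representation to the 'lang' value.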

This is somewhat annoying, since the same markup can end up with a
different language in HTML and XML processing modes.

That said, for HTML documents, it is a conformance error to set an
'xml:lang' attribute without also providing a 'lang' attribute whose
value is an ASCII case-insensitive match. So at least this problem
should be picked up by validators.
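
A validator-style check for this could look roughly like the following.
It is a sketch under assumptions: the function name and messages are
invented, attribute names are taken as an HTML parser would report them
(no namespaces), and the two values are compared ASCII
case-insensitively, which is the comparison assumed here from the
draft's wording; swap in a plain == if you want a stricter check.

```python
import string

# Fold only A-Z, leaving non-ASCII characters untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def xml_lang_error(attrs):
    """Return an error message for one element's attributes, or None.

    attrs maps attribute names, as parsed under HTML rules, to values.
    """
    if "xml:lang" not in attrs:
        return None
    if "lang" not in attrs:
        return "'xml:lang' specified without 'lang'"
    a = attrs["xml:lang"].translate(_ASCII_LOWER)
    b = attrs["lang"].translate(_ASCII_LOWER)
    if a != b:
        return "'xml:lang' and 'lang' values differ"
    return None


print(xml_lang_error({"xml:lang": "en"}))
print(xml_lang_error({"xml:lang": "en-GB", "lang": "en-gb"}))
print(xml_lang_error({"xml:lang": "en", "lang": "fr"}))
```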

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>

Received on Friday, 26 February 2010 09:47:05 UTC