Re: precedence of xml:lang and lang?

(Adding ISSUE-10 here so that the tracker would pick the mail up...)

Thanks Toby,

as you say, this is indeed a bit annoying. The problem is that RDFa
processors operate not only on the Web as part of a browser (i.e.,
where the media type is clear) but also on local files, where all this
is a bit less clear.

That being said, it is clearly an authoring error if someone puts
@lang and @xml:lang on the same element with different values
(modulo case). I.e., RDFa should not go out of its way to handle all
corner cases...

So, I propose that

- we stipulate that, formally, if both @xml:lang and @lang appear on
the same element, @xml:lang has precedence
- we add a warning somewhere in the document that, if both values are
set and they differ, the result can vary depending on the media type
used for that particular document on the Web, and that this should be
avoided (friendly RDFa processors can even add a warning to the
result:-)
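
To make the proposal concrete, here is a rough sketch of what a
processor could do. It is only an illustration, not proposed spec text:
the function name resolve_language and the attribute dictionary keyed
in Clark notation are my assumptions, and "modulo case" is approximated
with a plain lower-case comparison.

```python
from typing import Mapping, Optional, Tuple

# Attribute names in Clark notation (namespace-qualified vs. no namespace).
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
LANG = "lang"


def resolve_language(attrs: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (language, warning) for one element's attributes.

    Hypothetical helper: @xml:lang takes precedence over @lang, and a
    warning is produced when both are present with values that differ
    modulo case.
    """
    xml_lang = attrs.get(XML_LANG)
    lang = attrs.get(LANG)

    warning = None
    if xml_lang is not None and lang is not None:
        # "Modulo case": language tags are compared case-insensitively.
        if xml_lang.lower() != lang.lower():
            warning = (
                "@xml:lang (%r) and @lang (%r) differ; using @xml:lang"
                % (xml_lang, lang)
            )

    # @xml:lang wins when present; otherwise fall back to @lang (may be None).
    chosen = xml_lang if xml_lang is not None else lang
    return chosen, warning


if __name__ == "__main__":
    print(resolve_language({XML_LANG: "en", LANG: "fr"}))
    print(resolve_language({XML_LANG: "en-GB", LANG: "en-gb"}))
    print(resolve_language({LANG: "de"}))
```

A processor that only ever sees HTML-parsed input would never find the
namespaced attribute, so in that mode the sketch simply falls back to
@lang.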

I do not think RDFa should say that an @xml:lang without @lang is not
conformant; that is something that the underlying language specification
should handle and not RDFa.

Ivan

On 2010-2-26 10:46 , Toby Inkster wrote:
> On Fri, 2010-02-26 at 08:43 +0100, Ivan Herman wrote:
>> I tried to look at the (X)HTML5 document, I did find a reference to
>> xml:lang in 7.03[3], but I did not find any reference to the question
>> of relative precedence. I must admit I am not very familiar with the
>> HTML5 document structure, so I may have missed it. 
> 
> The relevant section of the latest HTML5 working draft (25/08/09) is
> 3.2.3.3.
> 
> In DOM terms, there are three attributes of relevance in HTML5 (and here
> I'm excluding the Content-Language HTTP header and <meta http-equiv>
> equivalent of it, which as I understand it, are still being debated).
> Written in Clark notation, they're:
> 
> 1. {http://www.w3.org/XML/1998/namespace}lang
> 2. {}lang
> 3. {}xml:lang
> 
> Note that #1 and #3 are each the result of parsing an attribute called
> 'xml:lang'. Parsing under XML rules yields #1, and under HTML rules
> yields #3.
> 
> In terms of declaring the language of an element, #1 has precedence
> (just like it does in XHTML 1) over #2. #3 is ignored.
> 
> However, for HTML documents (i.e. those sent as text/html), no
> attributes will ever be parsed as #1. (I believe #1 attributes can still
> be created via client-side scripts.) While the precedence rules are the
> same in HTML and XHTML, because HTML parsing has the effect of never
> generating #1 attributes and generating #3 instead, effectively
> 'xml:lang' is always ignored.
> 
> This is somewhat annoying, given that it can result in different
> behaviour in HTML and XML processing modes.
> 
> That said, for HTML documents, it is a conformance error to set an
> 'xml:lang' attribute without also providing a 'lang' attribute which is
> a case-insensitive match. So at least this problem should be picked up by
> validators.
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF   : http://www.ivan-herman.net/foaf.rdf
vCard  : http://www.ivan-herman.net/HermanIvan.vcf

Received on Friday, 26 February 2010 10:04:31 UTC