[whatwg] [WA1] lang and xml:lang

On Sun, 17 Apr 2005, Lachlan Hunt wrote:
> > > 
> > > # If both the xml:lang attribute and the lang attribute are set, user
> > > # agents must use the xml:lang attribute, and the lang attribute must be
> > > # ignored for the purposes of determining the element's language.
> > > 
> > > Is that the case for both HTML and XHTML documents?
> > 
> > Yes.
> 
> So, if I have this HTML document
> 
>   <!DOCTYPE ...>
>   <html lang="en" xml:lang="fr">
>   <title>HTML document</title>
>   <p>This is an HTML, not an XML, document.
> 
> Considering that legacy HTML UAs won't know about the xml:lang 
> attribute, and will only use lang, are you saying that a conforming Web 
> Apps UA should treat the document as French?

No. The "xml:lang" attribute in that document is not the xml:lang 
attribute. It's the {null, "xml:lang"} attribute -- the attribute in the 
null namespace with the local name "xml:lang" -- whereas the xml:lang 
attribute, the one defined by XML, is the {xml, "lang"} attribute: the 
attribute in the XML namespace with the local name "lang".

See Namespaces in XML for more information.
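
To make the distinction concrete, here is a rough sketch in Python. It is an
illustration only, not something from the spec: the helper names
(html_root_attrs, xml_root_attrs, language) are made up for this example,
Python's html.parser is only a tokenizer standing in for a real text/html
parser, and the sketch looks at the root element alone, ignoring inheritance.
It records each attribute as a {namespace, local name} pair and then applies
the rule quoted above.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Namespace URI that the reserved "xml" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"


class _RootAttrs(HTMLParser):
    """Collect the first start tag's attributes as {(namespace, local name): value}."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if self.attrs is None:
            # text/html has no namespaces, so every attribute is in the
            # null namespace and the colon is just part of the local name.
            self.attrs = {(None, name): value for name, value in attrs}


def html_root_attrs(markup):
    """Attributes of the root element when the markup is read as text/html."""
    parser = _RootAttrs()
    parser.feed(markup)
    parser.close()
    return parser.attrs


def xml_root_attrs(markup):
    """Attributes of the root element when the markup is read as XML."""
    out = {}
    for key, value in ET.fromstring(markup).attrib.items():
        if key.startswith("{"):
            # ElementTree spells namespaced names as "{uri}local".
            ns, local = key[1:].split("}", 1)
            out[(ns, local)] = value
        else:
            out[(None, key)] = value
    return out


def language(attrs):
    """The real xml:lang ({xml, "lang"}) wins; otherwise fall back to {null, "lang"}."""
    if (XML_NS, "lang") in attrs:
        return attrs[(XML_NS, "lang")]
    return attrs.get((None, "lang"))


HTML_DOC = ('<html lang="en" xml:lang="fr"><title>HTML document</title>'
            '<p>This is an HTML, not an XML, document.')
XML_DOC = '<html lang="en" xml:lang="fr"><title>XML document</title></html>'

print(language(html_root_attrs(HTML_DOC)))  # en
print(language(xml_root_attrs(XML_DOC)))    # fr
```

Read as text/html, the root element carries {null, "lang"} and
{null, "xml:lang"}, so the rule never sees an xml:lang attribute and the
language is "en". Read as XML, the same characters produce {xml, "lang"}, and
the language is "fr".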



> > > It would make more sense if, in the case of both being set, lang was 
> > > used for text/html documents and xml:lang for XML documents.
> > 
> > The only way you can set xml:lang in an HTML document is via the DOM
> 
> Now I'm confused.  If that's the case, then wouldn't the above example 
> be treated as English [...]

Yes.


> > (in HTML, there are no namespaces).
> 
> Which is why xml:lang should be completely ignored, as an unknown 
> attribute, in HTML.

If there is a literal "xml:lang" attribute in an HTML document, it is 
ignored and has no effect on this conformance requirement. That, however, 
is not an xml:lang attribute.

Since this is clearly a source of confusion, I've added a paragraph to the 
Terminology section about this.


> I've seen people use lots of XML syntax in HTML documents, including 
> xmlns and xml:lang attributes even in one that had an explicit HTML4 
> DOCTYPE (though I can't remember where) and not just in MS Word 
> generated rubbish.  The point is authors do a lot of silly things, and I 
> thought UA behaviour needed to be well defined for as many use cases as 
> possible.

Absolutely. However, none of the cases you mentioned result in the 
existence of a "lang" attribute in the XML namespace. They result in 
unknown attributes in the null namespace, which is very different.


> > > However, in the case of only one being set but for the wrong MIME 
> > > type (eg. xml:lang set for text/html document or lang for XML 
> > > document), for error recovery, should UAs be allowed to fallback on 
> > > it?
> > 
> > If xml:lang="" is set in a text/html document, it'll be {html, 
> > 'xml:lang'}, not {xml, 'lang'} which is what xml:lang really is.

(Er, I should have said {null, 'xml:lang'}, not {html, 'xml:lang'}.)

> I don't understand how that answers the question.

I hope this e-mail clarifies it for you.

Cheers,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Sunday, 17 April 2005 06:16:43 UTC