Language declarations in XHTML 1.1

The second edition of XHTML 1.1 says:

"XHTML 1.1 documents SHOULD be labeled with the Internet Media Type text/html as defined in [RFC2854] or application/xhtml+xml as defined in [RFC3236]." [1]

Unlike XHTML 1.0, however, XHTML 1.1 does not define a lang attribute, only an xml:lang attribute.

xml:lang is not recognized for language declaration by major user agents that process text/html, and the HTML5 spec currently recognizes only the lang attribute for language declaration, although it allows xml:lang.

The upshot of this is that there is no way of effectively declaring language in XHTML 1.1 documents served as text/html.
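
As a rough illustration of why the attribute goes unrecognized, the sketch below feeds the same markup to an HTML-style parser and to an XML parser, using only the Python standard library. The sample markup and the helper names (html_view, xml_view, AttrCollector) are made up for this example, and Python's parsers only stand in for real user agents, so treat it as an analogy rather than a description of any browser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample element, as it might appear in an XHTML 1.1 document.
MARKUP = '<p xml:lang="fr">Bonjour</p>'


class AttrCollector(HTMLParser):
    """Records the attributes of the last start tag seen."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        self.attrs = dict(attrs)


def html_view(markup):
    """Attributes as an HTML-style (non-namespace-aware) parser reports them."""
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.attrs


def xml_view(markup):
    """Attributes as a namespace-aware XML parser reports them."""
    return dict(ET.fromstring(markup).attrib)


print(html_view(MARKUP))
# {'xml:lang': 'fr'}
print(xml_view(MARKUP))
# {'{http://www.w3.org/XML/1998/namespace}lang': 'fr'}
```

On the HTML side the attribute is just an ordinary attribute whose name happens to contain a colon, and there is no lang attribute at all, which is the gap described above.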


One approach to this issue would be to add a lang attribute to XHTML 1.1, but you would have to get authors to continue to use both lang and xml:lang for such documents so that language is recognized in both HTML and XML contexts. This is already a nuisance for authors who use XHTML 1.0. I do not expect that XML applications would begin to recognize the lang attribute, so it would be there purely for compatibility with HTML.

The other approach would be for user agents to recognize that an xml:lang attribute is saying the same thing as a lang attribute, and to specify that equivalence in HTML5. This would also make life easier for authors using any flavor of XHTML, since they would only need to specify language in a single attribute (xml:lang) and it would work in both HTML and XML contexts.

This is my proposed solution. I know that this then pulls in questions about whether or not the xml namespace is used for xml:lang, and what to do with a lang and an xml:lang attribute on the same element with different values, but those are second-order questions in my mind.
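
To make the proposed equivalence concrete, here is a minimal sketch of how a text/html user agent might pick the declared language for one element. The function name effective_lang and the dict-of-attributes input are hypothetical, and the rule that lang wins when the two values differ is an arbitrary placeholder for the open question above, not a recommendation.

```python
def effective_lang(attrs):
    """Return the language declared on one element, or None if undeclared.

    `attrs` maps attribute names, as written in text/html markup, to values.
    """
    lang = attrs.get("lang")
    xml_lang = attrs.get("xml:lang")
    if lang is not None:
        # Assumption: lang takes precedence when both are present.
        return lang
    # Proposed behaviour: fall back to xml:lang as an equivalent declaration.
    return xml_lang


print(effective_lang({"xml:lang": "fr"}))                # fr
print(effective_lang({"lang": "en", "xml:lang": "fr"}))  # en
print(effective_lang({"class": "note"}))                 # None
```

Under this sketch an XHTML 1.1 author would only need xml:lang, and existing XHTML 1.0 pages that carry both attributes would behave as they do today.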

Do we recommend this to the HTML WG?

RI



[1] http://www.w3.org/TR/xhtml11/conformance.html

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/

Received on Friday, 26 September 2008 17:29:36 UTC