Re: Language declarations in XHTML 1.1

Richard Ishida 2008-09-26 19.25:

> xml:lang is not recognized [...] by major user agents that 
> process text/html, and the HTML5 spec currently recognizes only
>  the lang attribute [...] although it allows xml:lang.

When arguing for changes to HTML 5 in order to benefit XHTML 
authors, keep in mind that many in the HTML 5 WG consider XHTML 
served as text/html to be bad, and that xml:lang was allowed as 
a "harmless artefact" to make it simpler to switch to HTML 5. (For 
others in the group, "being able to serve both as text/html and as 
application/xhtml+xml" might have been a motivation as well.)

> The upshot of this is that there is no way of effectively
> declaring language in XHTML 1.1 documents served as text/html.


> One approach to this issue would be to add a lang attribute to
> XHTML 1.1, but you would have to get authors to continue to use
> both lang and xml:lang for such documents so that language is
> recognized in both HTML and XML contexts.  This is already a
> nuisance for authors who use XHTML 1.0.  I do not expect that
> XML applications would begin to recognize the lang attribute,
> so it would be there purely for compatibility with HTML.

To sum up this approach: Change XHTML 1.1 so one can hand the 
problem over to the authors - the same way as XHTML 1.0 did.

> The other approach would be for user agents to recognize that
> an xml:lang attribute is saying the same thing as a lang
> attribute, and to specify that equivalence in HTML5.  This
> would also make life easier for authors using any flavor of
> XHTML, since they would only need to specify language in a
> single attribute (xml:lang) and it would work in both HTML and
> XML contexts.

To sum up this approach: Change the text/html User Agents by 
requiring in HTML 5 that they treat xml:lang the same way as lang.

> This is my proposed solution.  I know that that then pulls
> in questions about the use of the xml:lang namespace or not,
> and what to do with a lang and xml:lang attribute on the same
> element with different values, but those are second-order
> questions in my mind.

To sum up: When served as text/html, XHTML authors must accept 
that xml:lang="" is given a text/html lang="" interpretation.

> Do we recommend this to the HTML WG?

I tried to think about what needs to change in order to achieve 
the bonus you are after: that XHTML authors (and those who want to 
write HTML in a directly XHTML re-usable way) should not need to 
duplicate xml:lang with lang:

  * HTML 5 must be changed:
    - authors may select freely between xml:lang or lang,
    - however, they must make a choice!
      They cannot use both xml:lang and lang in the same document.
  * XHTML 1.0 must be made XHTML 1.1 compatible:
    - a new revision of XHTML 1.0 must make lang="" illegal;
  * text/html User Agents must be updated as soon as possible:
    - they must use xml:lang as fallback for lang.
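
As a rough illustration of the first and last points (my own sketch, 
not taken from any spec or browser), here is how a text/html consumer 
could apply "xml:lang as fallback for lang" and flag documents that 
mix the two attributes. The names LangAudit and declared_language are 
hypothetical, and letting lang win when both are present is an 
assumption. The simple tag stack also assumes well-nested markup.

```python
from html.parser import HTMLParser

# Elements without an end tag in text/html; they never open a scope.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}


def declared_language(attrs):
    """Language declared on one element, or None if nothing is declared.

    Assumed rule: lang wins, xml:lang is only used as a fallback.
    """
    if attrs.get("lang") is not None:
        return attrs["lang"]
    return attrs.get("xml:lang")


class LangAudit(HTMLParser):
    """Record the effective language of each element in a text/html document."""

    def __init__(self):
        super().__init__()
        self.stack = [None]   # inherited language; None means unknown
        self.seen = set()     # which of "lang" / "xml:lang" occur at all
        self.effective = []   # (tag, effective language) in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.seen.update(a for a in ("lang", "xml:lang") if a in attrs)
        declared = declared_language(attrs)
        # Inherit from the parent when the element declares nothing itself.
        lang = declared if declared is not None else self.stack[-1]
        self.effective.append((tag, lang))
        if tag not in VOID:
            self.stack.append(lang)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    def mixes_attributes(self):
        """True if the document uses both lang and xml:lang somewhere."""
        return self.seen == {"lang", "xml:lang"}
```

Feed a document with audit = LangAudit(); audit.feed(source), then 
read audit.effective and audit.mixes_attributes(). A document that 
declares xml:lang only on the root element reports that language for 
every descendant, which is the behaviour XHTML 1.1 authors would need 
from text/html User Agents.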

Ideally, I think that xml:lang should be forbidden in HTML 5 - 
though User Agents should still be required to support xml:lang. 
Then XHTML 1.1 would be covered. However, such an approach seems 
unrealistic. (To ask that authors replace xml:lang with lang and 
vice versa is a small burden compared with the burden of using two 
identical attributes.)

Thus the alternative is to make xml:lang and lang a free choice. 
For HTML 5 authors, this would only be a simplification whenever 
they need to serve the same source both as application/xhtml+xml 
and as text/html.

The most important problem is of course the User Agents: How 
far are we from a situation where all UAs support xml:lang? Is 
Internet Explorer the only black sheep?

leif halvard silli

Received on Saturday, 27 September 2008 01:22:40 UTC