Re: Language declarations in XHTML 1.1

I have some charts in http://www.macchiato.com/ ("Unicode at Google" from
the Unicode conference).
Although it is growing, the number of pages tagged with lang is currently too
small to be useful. And of the pages that are tagged, a very high percentage
are tagged incorrectly (according to our analyses), and sometimes
inconsistently. (If we normalize the tags using the CLDR LikelySubtags data,
then both numbers get better.)
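
To illustrate the kind of normalization meant here, a rough sketch follows. It is not the code used in the analyses. The table is a tiny hand-copied excerpt in the style of the CLDR likely-subtags data, and the names `LIKELY`, `parse`, and `add_likely_subtags` are made up for this example. The lookup order is a simplified version of the "add likely subtags" procedure and ignores variants and extensions.

```python
# Tiny excerpt in the style of CLDR likely-subtags data (illustrative only;
# a real analysis would load the full table shipped with CLDR).
LIKELY = {
    "en": ("en", "Latn", "US"),
    "ja": ("ja", "Jpan", "JP"),
    "zh": ("zh", "Hans", "CN"),
    "zh-TW": ("zh", "Hant", "TW"),
    "zh-Hant": ("zh", "Hant", "TW"),
}


def parse(tag):
    """Split a tag into (language, script, region), fixing case and separators."""
    parts = tag.strip().replace("_", "-").split("-")
    lang = parts[0].lower()
    script = region = None
    for p in parts[1:]:
        if len(p) == 4 and p.isalpha() and script is None and region is None:
            script = p.title()          # e.g. "hant" -> "Hant"
        elif region is None and (
            (len(p) == 2 and p.isalpha()) or (len(p) == 3 and p.isdigit())
        ):
            region = p.upper()          # e.g. "tw" -> "TW"
        else:
            break                       # variants/extensions ignored in this sketch
    return lang, script, region


def add_likely_subtags(tag):
    """Fill in missing script/region from the table; None if nothing matches."""
    lang, script, region = parse(tag)
    # Most specific key first, then progressively less specific ones.
    candidates = [
        "-".join(filter(None, (lang, script, region))),
        "-".join(filter(None, (lang, region))),
        "-".join(filter(None, (lang, script))),
        lang,
    ]
    for key in candidates:
        if key in LIKELY:
            _, likely_script, likely_region = LIKELY[key]
            # Keep what the author supplied; only fill in the gaps.
            return "-".join((lang, script or likely_script, region or likely_region))
    return None


# Differently written tags for the same content now compare equal.
print(add_likely_subtags("zh_tw"))    # zh-Hant-TW
print(add_likely_subtags("zh-Hant"))  # zh-Hant-TW
```

With this kind of expansion, pages tagged "zh_tw", "zh-TW", or "zh-Hant" fall into one bucket instead of three, which is why both the coverage and the consistency figures improve.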

Mark


On Tue, Sep 30, 2008 at 12:51 PM, Texin, Tex <Tex.Texin@netapp.com> wrote:

> Richard identifies:
> a) XHTML 1.1 supports only an xml:lang attribute.
>
> b) xml:lang is not recognized by major user agents that process text/html
>
> c) HTML5 recognizes only the lang attribute for language declaration. It
> allows xml:lang.
>
> Therefore:
> there is no way of effectively declaring language in XHTML 1.1 documents
> served as text/html.
>
> Tex:
> Forcing UAs to support xml:lang, and also driving HTML5 to support
> xml:lang, seems like a bigger effort than adding lang to XHTML 1.1.
> XHTML 1.1 has fewer users, and the UAs that support it probably already
> support lang.
>
> Add lang to XHTML 1.1 and, yes, make it equivalent to xml:lang.
>
> For HTML5, there should be a rule that if you have both lang and xml:lang,
> lang takes precedence.
> This is equivalent to recognizing lang and allowing xml:lang.
> The only case that it doesn't cover is if you have xml:lang without lang.
> Is this worth worrying about in HTML?
>
> Do we have any statistics on meaningful usage of lang?
> (By meaningful, I am referring to usage in multilingual documents, or
> labeling of a document where it is helpful to applications that wouldn't
> otherwise be able to detect the language.)
>
>
>
> -----Original Message-----
> From: Richard Ishida [mailto:ishida@w3.org]
> Sent: Friday, September 26, 2008 10:25 AM
> To: www-international@w3.org
> Subject: Language declarations in XHTML 1.1
>
>
> The 2nd edition version of XHTML 1.1 says:
>
> "XHTML 1.1 documents SHOULD be labeled with the Internet Media Type
> text/html as defined in [RFC2854] or application/xhtml+xml as defined in
> [RFC3236]." [1]
>
> Unlike XHTML 1.0, however, XHTML 1.1 does not define a lang attribute, only
> an xml:lang attribute.
>
> xml:lang is not recognized for language declaration by major user agents
> that process text/html, and the HTML5 spec currently recognizes only the
> lang attribute for language declaration, although it allows xml:lang.
>
> The upshot of this is that there is no way of effectively declaring
> language in XHTML 1.1 documents served as text/html.
>
>
> One approach to this issue would be to add a lang attribute to XHTML 1.1,
> but you would have to get authors to continue to use both lang and xml:lang
> for such documents so that language is recognized in both HTML and XML
> contexts.  This is already a nuisance for authors who use XHTML 1.0.  I do
> not expect that XML applications would begin to recognize the lang
> attribute, so it would be there purely for compatibility with HTML.
>
> The other approach would be for user agents to recognize that an xml:lang
> attribute is saying the same thing as a lang attribute, and to specify that
> equivalence in HTML5.  This would also make life easier for authors using
> any flavor of XHTML, since they would only need to specify language in a
> single attribute (xml:lang) and it would work in both HTML and XML contexts.
>
> This is my proposed solution.  I know that this then pulls in
> questions about the use of the xml:lang namespace or not, and what to do
> with a lang and xml:lang attribute on the same element with different
> values, but those are second-order questions in my mind.
>
> Do we recommend this to the HTML WG?
>
> RI
>
>
>
> [1] http://www.w3.org/TR/xhtml11/conformance.html
>
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
>
> http://www.w3.org/International/
> http://rishida.net/
>
>
>
>
>
>

Received on Tuesday, 30 September 2008 16:31:37 UTC