Re: How do I say 'this is not in any language' in XHTML/HTML

From: Jon Hanna <jon@hackcraft.net>
Date: Thu, 22 Mar 2007 14:38:34 +0000
Message-ID: <460294EA.4060407@hackcraft.net>
To: Richard Ishida <ishida@w3.org>
CC: www-international@w3.org

Richard Ishida wrote:
> I'm still not clear about the distinction between xml:lang="" and xml:lang="und".  Any suggestions?

If xml:lang is spec'd in a particular schema to allow an empty string 
then xml:lang="und" is a bug and xml:lang="" is not.

If it is not spec'd to allow an empty string then xml:lang="und" is not 
a bug and xml:lang="" is!

RFC 4646, like RFC 3066 before it, explicitly states that und SHOULD NOT 
be used unless a protocol forces one to state a language tag. xml:lang 
does not force any such use: it is specified as allowing the empty 
string unless another specification (e.g. XHTML1.0) says otherwise, so 
where the empty string is allowed, xml:lang="" is the one to use.

RFC 4646, again like RFC 3066 before it, states that the lack of a 
language code means Undetermined (just as und does in a protocol that 
doesn't allow an empty language code).

I agree with those who consider XHTML1.0's refusal to allow an empty 
xml:lang attribute value to be obsolete (or an error? Did the first 
edition of the XML1.0 spec prohibit empty xml:lang?).

Both of these cover cases where the language is not known. If it is 
*known* that content does not contain any linguistic data then 
xml:lang="zxx" should be used.
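
To put the rules above in one place, here is a rough sketch in Python. 
The function name and its parameters are made up for illustration and 
come from no specification; it assumes the caller already knows whether 
the schema in use allows an empty xml:lang.

```python
from typing import Optional


def choose_xml_lang(language: Optional[str],
                    has_linguistic_content: bool = True,
                    empty_allowed: bool = True) -> str:
    """Pick an xml:lang value following the rules described above.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if not has_linguistic_content:
        # Known to contain no linguistic data at all.
        return "zxx"
    if language:
        # The language is known: use its tag unchanged.
        return language
    # Language unknown: prefer the empty string where the schema allows
    # it, and fall back to "und" only when some tag is required.
    return "" if empty_allowed else "und"
```

So a schema that permits the empty string gets xml:lang="" for unknown 
content, one that does not (as with XHTML1.0) gets xml:lang="und", and 
content known to be non-linguistic gets xml:lang="zxx" either way.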
Received on Thursday, 22 March 2007 14:40:50 GMT
