Re: what's the language of a document ?

On Sun, 25 Oct 2009, Divya Manian wrote:
> Internationalization best practices [1] states:
> "Where a document contains content aimed at speakers of more than one 
> language, use Content-Language with a comma-separated list of language 
> tags."
> The HTML 5 specs [2] state:
> "...there is a document-wide default language set, then that is the 
> language of the node.
> If there is no document-wide default language, then language information 
> from a higher-level protocol (such as HTTP), if any, must be used as the 
> final fallback language. In the absence of any language information, the 
> default value is unknown (the empty string)."
> What is not clear is what happens if an HTML document has an HTTP
> Content-Language header with a comma-separated list of language tags and no
> other language declarations. I found a thread [3] that states such a document
> will be declared to use "unknown" language in this case. It would be good to
> have this case explicitly stated.

I've updated the spec to say that when the higher-level protocol reports 
multiple languages, they are all ignored in favour of the default value, 
unknown (the empty string).
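
As a rough illustration of the fallback described above, here is a minimal 
sketch in Python. It is not spec text: the function names and the UNKNOWN 
constant are hypothetical, and it models only the two steps quoted in this 
thread (the document-wide default, then the higher-level protocol), not the 
earlier steps of the algorithm.

```python
from typing import Optional

# HTML5 represents the "unknown" language as the empty string.
UNKNOWN = ""


def protocol_fallback(content_language: Optional[str]) -> str:
    """Hypothetical helper: language taken from a Content-Language value.

    Exactly one language tag is used as-is. A missing header, an empty
    header, or a comma-separated list of several tags yields UNKNOWN.
    """
    if content_language is None:
        return UNKNOWN
    tags = [tag.strip() for tag in content_language.split(",") if tag.strip()]
    if len(tags) == 1:
        return tags[0]
    # Zero tags, or multiple tags: all are ignored in favour of the default.
    return UNKNOWN


def fallback_language(document_default: Optional[str],
                      content_language: Optional[str]) -> str:
    """Hypothetical helper: a document-wide default wins over the protocol."""
    if document_default is not None:
        return document_default
    return protocol_fallback(content_language)
```

Under these assumptions, fallback_language(None, "en, fr") gives the empty 
string, while fallback_language(None, "fr") gives "fr".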

On Sun, 25 Oct 2009, Martin Kliehm wrote:
> Also in XHTML notation empty strings are disallowed, so the default 
> value for "unknown" would be in that case "und". [4]

On Sun, 25 Oct 2009, John Cowan wrote:
> Why would empty strings be disallowed in xml:lang attributes?  I can 
> find no indication of that in XHTML 1.0.

In HTML5, the "unknown" value is the empty string (for "lang"). The 
xml:lang attribute is defined by the XML spec.

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
                          U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 27 October 2009 01:30:58 UTC