RE: what's the language of a document ?

On Mon, 26 Oct 2009, Tex Texin wrote:
> 
> So if someone attempts to be specific and declares content-language to 
> be "es-mx,es-ar" for mexico and argentina, or perhaps declares "en, 
> en-us" then that information is thrown away in favor of unknown?

For the purposes of the CSS :lang() selector, conversion to RDF, and UA 
built-in spelling checkers, yes. The information is still conveyed by the 
HTTP headers, though, and can be used for whatever purposes the HTTP 
headers are intended for.
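
As a rough sketch of the behaviour described above (an illustration only, not 
text from any spec; the function name is hypothetical and the empty string is 
assumed here to stand for "unknown"), a header value that lists more than one 
language yields no document default:

```python
def default_language_from_header(content_language):
    """Return a single default language tag, or "" for unknown.

    Assumption for this sketch: any value containing a comma is treated as
    listing several languages and is therefore ignored for this purpose.
    """
    value = (content_language or "").strip()
    if not value or "," in value:
        return ""  # missing, empty, or multiple languages: unknown
    return value
```

With this sketch, both "es-mx,es-ar" and "en, en-us" give the unknown 
language, while a single value such as "es-mx" is used as given.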


> Also, does this change to the document default language impact just html 
> behavior, or embedded scripting languages as well?

I don't understand what it would mean to affect embedded scripting 
languages; can you elaborate?


> If there were code that checks for language and performs different 
> actions based on languages in the document, that is affected as well?

That depends on how it checks for language.


> Why does the default need to be monolingual?

It's not so much that the default is monolingual as that the model used by 
HTML has a single language per Element node. HTML itself supports multiple 
languages, but not in the vague "there are multiple languages present" 
sense, only at the specific per-element level. This is compatible with all 
the systems I'm aware of except HTTP. For example, RDF only supports one 
language per text literal, and spelling checkers generally expect a single 
language per word.
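
The per-element model can be sketched roughly as follows. This is a simplified 
illustration under stated assumptions, not the spec's algorithm: the Element 
class and language_of function are hypothetical, and the empty string is 
assumed to mean "unknown".

```python
class Element:
    """Hypothetical minimal element: a tag name, an optional lang, a parent."""

    def __init__(self, tag, lang=None, parent=None):
        self.tag = tag
        self.lang = lang      # None means no lang attribute on this element
        self.parent = parent


def language_of(element, document_default=""):
    """Return the single language that applies to this element.

    Walk up the ancestors until one declares a language; if none does,
    fall back to the document default ("" is assumed to mean unknown).
    """
    node = element
    while node is not None:
        if node.lang is not None:
            return node.lang
        node = node.parent
    return document_default


# Several languages in one document, but exactly one per element.
html = Element("html", lang="en")
p = Element("p", parent=html)
span = Element("span", lang="es-MX", parent=p)
```

Here language_of(span) is "es-MX" and language_of(p) is "en": the document 
contains two languages, yet each element resolves to one.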

In fact, based on what I've seen of the way the relevant HTTP headers are 
used, I would personally recommend just changing the HTTP spec to only 
allow one language there also, since few people use this to specify 
multiple languages, and I'm not aware of any software that makes use of 
this information.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 27 October 2009 05:14:32 UTC