Re: meta content-language

From: Mark Davis <mark.davis@icu-project.org>
Date: Fri, 22 Aug 2008 07:04:26 -0700
Message-ID: <30b660a20808220704o26aa961q29802746b014f9cf@mail.gmail.com>
To: "Julian Reschke" <julian.reschke@gmx.de>
Cc: "Leif Halvard Silli" <lhs@malform.no>, "Ian Hickson" <ian@hixie.ch>, "HTML WG" <public-html@w3.org>, "www-international@w3.org" <www-international@w3.org>

I'm kinda lost in this thread so far. It seems to me the questions at hand
are:
1. Distinction in Language. Should there be a distinction in interpretation
between the language set via the lang attribute and the one set via the meta
element?
<html lang="foo">
and
<meta http-equiv="Content-Language" content="foo"/>

My take is that any such distinction would be a departure from current
practice, and too fine a distinction for the vast majority of people to be
able to follow.


2. Language Inheritance. If there are conflicting languages, what should
win? (or in other words, what's the inheritance?)

(HTTP) Content-Language: lang1
<meta http-equiv="Content-Language" content="lang2"/>
<html lang="lang4" xml:lang="lang3">
<p lang="lang5">
My take is that HTML5 has it right, that the winner/inheritance should be in
the above order: lang5 wins over lang4 over lang3 over lang2 over lang1.
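
To make that ordering concrete, here is a rough Python sketch of how a consumer might resolve the effective language. It is only an illustration under my own assumptions: the dictionary keys and the function name are made up for this example and do not come from HTML5 or any library, and an explicitly empty value is treated as "und".

```python
from typing import Dict

# Sources listed from lowest to highest precedence, following the order of
# the example above. The key names are hypothetical, chosen for this sketch.
PRECEDENCE = [
    "http_header",    # (HTTP) Content-Language: lang1
    "meta",           # <meta http-equiv="Content-Language" content="lang2"/>
    "xml_lang_root",  # <html xml:lang="lang3">
    "lang_root",      # <html lang="lang4">
    "element_lang",   # <p lang="lang5">
]


def effective_language(declared: Dict[str, str]) -> str:
    """Return the winning language tag, or "und" if nothing is declared."""
    winner = "und"
    for source in PRECEDENCE:
        value = declared.get(source)
        if value is not None:
            # A later (more specific) source overrides an earlier one.
            # An explicitly empty value is treated as "und".
            winner = value.strip() or "und"
    return winner
```

With all five sources declared as in the example, this returns "lang5"; with only the HTTP header and the meta element present, it returns "lang2".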


3. Language Values. Should the value of any of these fields be a single
language tag or also allow a priority list (both as defined by BCP47)?

Note that the value can be empty (""), which is equivalent to "und"
(Undetermined) in BCP 47.

Here I think we'd be somewhat better off if the value could be a priority
list, e.g. "de, fr, en". For example, if the html lang value were "de, fr,
en", that would mean that there wasn't any substantial amount of linguistic
content other than these three, and that the relationship was de >= fr >=
en. Due to the ordering, if you had software that could only handle a single
language, then de would be that value.
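
As a sketch of what handling such a value could look like, assuming a plain comma-separated list and doing no BCP 47 validation (the function names are hypothetical, made up for this illustration):

```python
from typing import List


def parse_priority_list(value: str) -> List[str]:
    """Split a comma-separated priority list into tags, in priority order.

    An empty value yields ["und"]. Tags are not validated against BCP 47.
    """
    tags = [tag.strip() for tag in value.split(",")]
    tags = [tag for tag in tags if tag]
    return tags or ["und"]


def primary_language(value: str) -> str:
    """Return the single tag for software that can handle only one language."""
    return parse_priority_list(value)[0]
```

For "de, fr, en" the parsed list is ["de", "fr", "en"], and software limited to one language would use "de".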

Documents may contain a mixture of languages, and allowing them to be tagged
at a high level with a priority list would allow people to reflect that
reality without having to tag each and every element with the right
language. Software can make use of that information, for example, in ranking
the document with respect to the language of search queries. With a search
query in "fr", a document with html lang of "de, fr" could be treated
differently than if it just had "de".
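
One way a search system might use this, purely as an illustration: the scoring rule below (1 divided by the position of the query language in the list) is an arbitrary assumption of mine, not something any engine is known to do, and it compares whole tags only, without BCP 47 subtag matching.

```python
def language_match_score(doc_lang_value: str, query_lang: str) -> float:
    """Score how well a document's language list matches a query language.

    Returns 1.0 if the query language is first in the list, smaller values
    for later positions, and 0.0 if it does not appear at all.
    The formula is a placeholder chosen for this sketch.
    """
    tags = [t.strip().lower() for t in doc_lang_value.split(",") if t.strip()]
    query = query_lang.strip().lower()
    if query not in tags:
        return 0.0
    return 1.0 / (tags.index(query) + 1)
```

Under this rule, a query in "fr" scores 0.5 against a document tagged "de, fr" and 0.0 against one tagged just "de", so the two documents can be ranked differently.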

However, that may be too big a departure from current practice.

Mark
Received on Friday, 22 August 2008 14:05:02 UTC
