- From: Mark Davis <mark.davis@icu-project.org>
- Date: Fri, 22 Aug 2008 11:30:12 -0700
- To: "Julian Reschke" <julian.reschke@gmx.de>
- Cc: "Leif Halvard Silli" <lhs@malform.no>, "Ian Hickson" <ian@hixie.ch>, "HTML WG" <public-html@w3.org>, "www-international@w3.org" <www-international@w3.org>
- Message-ID: <30b660a20808221130w3edd45fay755713a5cea5e99f@mail.gmail.com>
Mark

On Fri, Aug 22, 2008 at 7:04 AM, Mark Davis <mark.davis@icu-project.org> wrote:

> I'm kinda lost in this thread so far. It seems to me the questions at hand
> are:
>
> 1. Distinction in Language. Should there be a distinction in interpretation
> between the language set via lang attribute and meta content?
>
> <html lang="foo">
> and
> <meta http-equiv="Content-Language" content="foo"/>
>
> My take is that any such distinction would be a departure from current
> practice, and too fine a distinction for the vast majority of people to be
> able to follow.
>
>
> 2. Language Inheritance. If there are conflicting languages, what should
> win? (or in other words, what's the inheritance?)
>
> (HTTP) Content-Language: lang1
> <meta http-equiv="Content-Language" content="lang2"/>
> <html lang="lang4" xml:lang="lang3">
> <p lang="lang5">
>
> My take is that HTML5 has it right, that the winner/inheritance should be
> in the above order: lang5 wins over lang4 over lang3 over lang2 over lang1.
>
>
> 3. Language Values. Should the value of any of these fields be a single
> language tag or also allow a priority list (both as defined by BCP47)?
>
> Note that it can be zero (""), which is equivalent to "und" (Unknown
> language) in BCP 47.
>
> Here I think we'd be somewhat better off if the value could be a priority
> list, e.g. "de, fr, en". For example, if the html lang value were "de, fr,
> en", that would mean that there wasn't any substantial amount of linguistic
> content other than these three, and that the relationship was de >= fr >=
> en. Due to the ordering, if you had software that could only handle a single
> language, then de would be that value.
>
> Documents may contain a mixture of languages, and allowing them to be
> tagged at a high level with a priority list would allow people to reflect
> that reality without having to tag each and every element with the right
> language. Software can make use of that information, for example, in ranking
> the document with respect to the language of search queries. With a search
> query in "fr", a document with html lang of "de, fr" could be treated
> differently than if it just had "de".
>

A clarification: the first two items already take a priority list:

(HTTP) Content-Language: lang1
<meta http-equiv="Content-Language" content="lang2"/>

It is the lang="..." and xml:lang="..." that currently lack the ability
(according to the spec) to have multiple languages.

> However, that may be too big a departure from current practice.
>
> Mark
>
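To make points 2 and 3 concrete, here is a minimal Python sketch of the precedence order and the priority-list fallback described above. It is an illustration, not text from any spec: the function names `parse_priority_list` and `effective_language` are hypothetical, the comma split is a simplification of real BCP 47 handling, and treating an empty value as an explicit "und" that stops the lookup is an assumption.

```python
from typing import List, Optional


def parse_priority_list(value: str) -> List[str]:
    """Split a value such as "de, fr, en" into tags, preserving order."""
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def effective_language(*declarations: Optional[str]) -> str:
    """Pick one language for software that can only handle a single tag.

    Declarations are passed from highest to lowest precedence, e.g.
    <p lang>, <html lang>, <html xml:lang>, <meta>, HTTP header.
    None means "not declared at this level", so the lookup continues.
    An empty string is treated as an explicit "und" (assumption).
    """
    for value in declarations:
        if value is None:
            continue
        tags = parse_priority_list(value)
        # The first tag of a priority list is the primary language.
        return tags[0] if tags else "und"
    return "und"


# All five levels declared: the element-level value wins.
print(effective_language("lang5", "lang4", "lang3", "lang2", "lang1"))

# Only <meta> and HTTP declared; <meta> carries a priority list.
print(effective_language(None, None, None, "de, fr", "en"))
```

A search ranker that understands priority lists would keep the whole list returned by `parse_priority_list` instead of only its first tag.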
Received on Friday, 22 August 2008 18:30:49 UTC