RE: meta content-language

--- On Thu, 28/8/08, Phillips, Addison <addison@amazon.com> wrote:

> From: Phillips, Addison <addison@amazon.com>
> Subject: RE: meta content-language
> To: "mtcarrascob@yahoo.com" <mtcarrascob@yahoo.com>, "www-international@w3.org" <www-international@w3.org>
> Date: Thursday, 28 August, 2008, 11:11 PM


1) REQUIREMENT

"13.5.1 Language(s) labelling 
There are two types of language labelling: 
 
* Language(s) of the file: the metadata indicating the language(s) in a file. It could be several languages. In the Dublin Core is the label language; also called language of the intended audience. 
 
* Processing language: the language at any given point in the file. It can be only one language. It is particularly important in the case of multilingual files. In XML, normally the attribute xml:lang. Also called text-processing language."

From
 Open architecture for multilingual parallel texts
 http://arxiv.org/pdf/0808.3889v1


2) SYNTAX

 2.1) Language(s) of the file
  2.1.1) One language: meta or HTML attribute
  2.1.2) Several languages: only meta

 2.2) Processing language inheritance
   Scheme ("http", but also from "file")
   meta
   lang attribute in the top HTML element
   lang attributes in the other elements going down.
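
A minimal sketch of the inheritance in 2.2, assuming the list above runs from the outermost declaration to the innermost and that the innermost declaration present wins. The function names, the argument layout, and the choice to treat a multi-language value as giving no single processing language are all assumptions made for illustration; they do not come from any specification quoted here.

```python
from typing import List, Optional, Sequence


def split_tags(value: Optional[str]) -> List[str]:
    """Split a comma-separated language list such as "fr, es" into tags."""
    if not value:
        return []
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def single_tag(value: Optional[str]) -> Optional[str]:
    """Return the tag only when exactly one language is declared (assumption)."""
    tags = split_tags(value)
    return tags[0] if len(tags) == 1 else None


def processing_language(
    scheme_value: Optional[str],         # e.g. the HTTP Content-Language header
    meta_value: Optional[str],           # content of the meta declaration
    lang_chain: Sequence[Optional[str]]  # lang attributes from <html> down to the element
) -> Optional[str]:
    """Resolve the processing language at one point in the file."""
    # The innermost element that carries a non-empty lang attribute wins.
    for lang in reversed(lang_chain):
        if lang:
            return lang
    # Otherwise fall back to meta, then to the scheme-level declaration.
    return single_tag(meta_value) or single_tag(scheme_value)
```

For example, `processing_language("en", "fr", ["de", None, "es"])` resolves to the innermost lang attribute, and with no lang attributes at all the meta value is used.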


4) RATIONALE

"The reason for recommending META as opposed to the HTML element with the lang attribute are:

* N-lingual document could be specified. For example, a bilingual French/Spanish document can be specified.

* The language(s) would be transmitted in the Content-Language field of HTTP header."

From
 http://www.w3.org/TR/1998/NOTE-html-lan-19980313

i.e., meta works for all the cases.
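
As an illustration of the bilingual case, a declaration such as <meta http-equiv="Content-Language" content="fr, es"> carries a comma-separated list, which a single lang attribute cannot. The sketch below shows how such a value could be read as language(s) of the file; the helper name and the printed wording are hypothetical.

```python
from typing import List


def languages_of_file(content: str) -> List[str]:
    """Return every language tag in a Content-Language style value."""
    # Tags are comma-separated; surrounding whitespace is not significant.
    return [tag.strip() for tag in content.split(",") if tag.strip()]


def describe(content: str) -> str:
    """Say whether the value labels a monolingual or an N-lingual file."""
    tags = languages_of_file(content)
    if not tags:
        return "no language declared"
    if len(tags) == 1:
        return "monolingual: " + tags[0]
    return str(len(tags)) + "-lingual: " + ", ".join(tags)


if __name__ == "__main__":
    print(describe("fr, es"))
    print(describe("fr"))
```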


5) "INTENDED AUDIENCE" is a confusing term

"This should simply be named 'primary language' or something similar.

'Intended audience' is confusing. For example, if the document is
written in simple French intended for an audience of English speakers
learning French, it must be labelled 'fr' (the language of the
document) and not 'en' (the intended audience).

The meaning is the same as that of the 'Language' element in the Dublin Core:

 'The language of the intellectual content of the resource.'"

From:
 http://lists.w3.org/Archives/Public/www-international/2006JulSep/0024.html


Regards
Tomas


> > 3) The HTML attribute can contain only one language, hence one
> > cannot label multilingual files.
> > 
> The HTML attribute can contain only one language.
> That's because any given sequence of human-readable
> (natural language) text will be in one language, even in a
> multilingual document and <html> is the outermost
> element in an HTML document (thus, the default text
> processing language for that document). Embedded elements,
> including <span>, are used to indicate runs of other
> languages. 
> 
> Hence all this discussion of the difference between the
> metadata about the intended audience of the whole document
> (such as <meta>) and the document processing language
> (which applies to spans or runs of text within that
> document). The language attributes of the html element can
> be used perfectly well with a multilingual file, but they do
> NOT declare what language may occur within that document.
> 
> Addison
> 
> Addison Phillips
> Globalization Architect -- Lab126
> 
> Internationalization is not a feature.
> It is an architecture.

Received on Friday, 29 August 2008 12:12:08 UTC