Re: Natural language marking in HTML

From: M.T. Carrasco Benitez <carrasco@innet.lu>
Date: Sat, 8 Mar 1997 10:20:04 +0100 (MET)
To: Misha Wolf <misha.wolf@reuters.com>
cc: www-international <www-international@w3.org>, Unicode Discussion <unicode@unicode.org>
Message-ID: <Pine.LNX.3.95.970308095113.25161B-100000@localhost>
> The value of the LANG attribute is *not* defined to be an ISO 639 code. 
> RFC 2070 uses the language tag scheme defined by RFC 1766.  ISO 639 is just 
> one element of this scheme.  All the following examples use more than ISO 639 
> and are legal RFC 1766 language tags:
>    zh-cn
>    no-nyn
>    en-cockney
>    x-klingon

You are right.  I will correct it.
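
As an illustration of the tag structure described above, here is a minimal
sketch in Python. It assumes only the RFC 1766 grammar (a primary tag
followed by optional subtags, each 1 to 8 ASCII letters, separated by
hyphens). The function name classify_language_tag and its return strings
are hypothetical, and it does not check that a two-letter primary tag is
actually listed in ISO 639.

```python
import re

# RFC 1766 grammar: Primary-tag *( "-" Subtag ), each part 1 to 8 ASCII letters.
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z]{1,8})*")


def classify_language_tag(tag):
    """Describe the primary tag, or return None if the syntax is invalid.

    Hypothetical helper for illustration; not part of any standard library.
    """
    if not _TAG_RE.fullmatch(tag):
        return None
    primary = tag.split("-")[0].lower()
    if primary == "x":
        return "private use"
    if primary == "i":
        return "IANA-registered"
    if len(primary) == 2:
        # Two-letter primary tags are reserved for ISO 639 codes;
        # membership in ISO 639 itself is not verified here.
        return "ISO 639 code"
    return "other"


for tag in ("zh-cn", "no-nyn", "en-cockney", "x-klingon"):
    print(tag, "->", classify_language_tag(tag))
```

All four quoted examples pass the syntax check; only the primary tag of
the first three comes from ISO 639, and x-klingon is a private-use tag.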

> I do not agree with the proposition that the presence of <HTML LANG=...> 
> should be taken to mean that the document is monolingual.  For an example, 
> see <http://www.reuters.com/unicode/iuc10/x-utf8.html>.  This document is 
> far from monolingual - it contains the same text in twenty nine languages.
> As the document has an English title, a brief English introduction, a few 
> English images and ends with English trademark statements, we have used 
> <HTML LANG=en>, and have then tagged the elements containing the various 
> texts with the languages of those texts.  To us this indicates that the 
> individual texts are embedded within an English page, even though it is 
> not the case that "... the bulk of the document is in one language.".

There is a need to indicate monolingual docs. <HTML LANG=...> looks like
the right place, as its meaning is "if I do not indicate otherwise, the
text in this document is in language xx".  So one should expect the
bulk of the text to be in the language indicated in <HTML LANG=...>.
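
The "if I do not indicate otherwise" reading can be sketched as a lookup
that walks up from an element to its nearest ancestor carrying a LANG
value. This is only a sketch: the Element class and effective_lang
function are hypothetical names for illustration, not any parser's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical, minimal stand-in for an HTML element."""
    name: str
    lang: Optional[str] = None
    parent: Optional["Element"] = None


def effective_lang(element):
    """Return the nearest LANG value on the element or its ancestors."""
    node = element
    while node is not None:
        if node.lang is not None:
            return node.lang
        node = node.parent
    return None  # no language indicated anywhere up the tree


# <HTML LANG=en> with one untagged paragraph and one tagged as Greek.
html = Element("HTML", lang="en")
untagged = Element("P", parent=html)
greek = Element("P", lang="el", parent=html)

print(effective_lang(untagged))
print(effective_lang(greek))
```

Under this reading, leaving LANG off the HTML element means an untagged
paragraph has no indicated language at all, which is the situation the
multilingual case would need to handle by tagging every text explicitly.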

For the document you mentioned, it would probably be better not to
indicate a language in <HTML LANG=...> and to mark the English like
the other languages, as the doc is clearly multilingual.

Received on Saturday, 8 March 1997 04:14:28 UTC
