HTML language marking

From: M.T. Carrasco Benitez <carrasco@innet.lu>
Date: Tue, 4 Mar 1997 19:04:23 +0100 (MET)
To: WInter <www-international@w3.org>
Message-ID: <Pine.LNX.3.95.970304185959.20078A-100000@localhost>
I wrote a preliminary document on "HTML language marking" (previously
"language label"). Comments?

It is also in HTML at




                       PRELIMINARY DOC - NOT A DRAFT

   {NOT}INTERNET-DRAFT                              M.T. Carrasco Benitez
   Category: Informational                                   3 March 1997
   Expires 3 Sep. 1997

                           HTML language marking

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference material
   or to cite them other than as "work in progress".

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited. Please send comments to
   the WInter mailing list at <www-international@w3.org>. Information
   about the WInter mailing list, including subscription details are in

   This memo does not specify an Internet standard of any kind. It is
   intended to be informational.

   This document discusses the marking of natural language in HTML
   documents and its relation with HTTP. It should be read together with
   Internationalization of the Hypertext Markup Language [I-HTML].

   The general look would be:
   <HTML LANG=fr>
   <TITLE>Mon doc français</TITLE>
   <META HTTP-EQUIV="Content-Type" Content="text/html;
   Je suis un Berlinois.
Language(s) of a document

   This is defined in a similar way to traditional publication on paper:
     Monolingual document
          When the bulk of the document is in one language.
     Multilingual document
          When the bulk of the document is not in one language. For
          example, a bilingual French and English document.

   <HTML LANG=xx> indicates the language for the whole document with an
   ISO 639 two-letter code. This is a declaration that the document is

   The language indicated by the <HTML LANG ...> should be included in
   the Content-Language of the HTTP header. If a <META
   HTTP-EQUIV="Content-Language" Content="yy, zz"> is also present in the
   document, the Content-Language should contain the aggregation of the
   languages in <HTML LANG ...> and <META HTTP-EQUIV ...>. For example
   (from the fragments above) the Content-Language should contain xx, yy,

   Multilingual documents should not include the LANG attribute in the
   HTML tag (<HTML LANG ...>). Languages should be marked with the LANG
   attribute using the appropriate tag; for example, <P LANG=fr>. The
   languages indicated by the other tags should not be included in
   Content-Language. If the author of the document considers that the
   amount of text in other languages is significant, the author should
   indicate those languages in the <META HTTP-EQUIV ...>.

   <META HTTP-EQUIV ...> does not indicate the language of the document
   or which portion of the document is in which language; this can be
   indicated only by the LANG attribute. It is just an instruction to
   include the language code(s) in the HTTP header. The document should
   include some portions in the language(s) indicated, but it is not an
   error if no language indicated in <META HTTP-EQUIV ...> is present in
   the document.
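
   The rules above can be sketched in code. The following is only an
   illustration in Python, not part of the proposal: the names
   LangScanner and content_language are hypothetical, and the sketch
   assumes the aggregation is "the <HTML LANG ...> code first, then the
   <META HTTP-EQUIV ...> codes, duplicates dropped". LANG attributes on
   other tags are ignored, as described above.

```python
from html.parser import HTMLParser


class LangScanner(HTMLParser):
    """Collect the language hints that feed Content-Language (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.html_lang = None   # value of <HTML LANG=...>, if any
        self.meta_langs = []    # codes from <META HTTP-EQUIV="Content-Language">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)     # HTMLParser lowercases tag and attribute names
        if tag == "html" and attrs.get("lang"):
            self.html_lang = attrs["lang"].strip()
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "content-language":
            content = attrs.get("content") or ""
            self.meta_langs += [c.strip() for c in content.split(",") if c.strip()]
        # LANG on any other tag (for example <P LANG=fr>) is deliberately ignored.


def content_language(document):
    """Return a Content-Language header value, or None if nothing is declared."""
    scanner = LangScanner()
    scanner.feed(document)
    codes = []
    for code in [scanner.html_lang] + scanner.meta_langs:
        if code and code not in codes:   # keep order, drop exact duplicates
            codes.append(code)
    return ", ".join(codes) if codes else None
```

   Under these assumptions, a document with <HTML LANG=xx> and a <META
   HTTP-EQUIV="Content-Language" Content="yy, zz"> yields "xx, yy, zz",
   while a multilingual document that only marks paragraphs with
   <P LANG=...> yields no Content-Language at all.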

   Servers could use a data structure for optimization purposes so they
   do not have to parse each document every time to determine its language.
   This is considered to be part of the document management system and it
   is not discussed in this document.
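
   Purely as an illustration of the kind of structure meant here, a
   server could keep an in-memory index keyed by file path and
   invalidated by modification time. This is a minimal sketch in Python
   under those assumptions; the names _language_index and
   cached_content_language are hypothetical, and "extract" stands for
   whatever routine derives the header value from the document text.

```python
import os

# Hypothetical in-memory index: path -> (modification time, Content-Language value)
_language_index = {}


def cached_content_language(path, extract):
    """Return the Content-Language for `path`, re-parsing only when the file changes.

    `extract` is any callable mapping document text to a header value (or None).
    """
    mtime = os.stat(path).st_mtime_ns
    entry = _language_index.get(path)
    if entry is not None and entry[0] == mtime:
        return entry[1]                      # cache hit: no parsing needed
    with open(path, encoding="utf-8", errors="replace") as handle:
        value = extract(handle.read())
    _language_index[path] = (mtime, value)   # remember the result for next time
    return value
```

   A real document management system would also need eviction and
   persistence, which this sketch leaves out.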

   All this is hot air if it is not supported. Hence, this section will
   list vendors and document producers that support, or intend to
   support, these recommendations.

Albert Lunde
Bert Bos
Christine Stark
François Yergeau
Gavin Nicol
Larry Masinter
Martin Bryan
Martin Dürst


   [HTTP-1.1] R.T. Fielding, H. Frystyk Nielsen, and T. Berners-Lee,
   "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2068,
   [TCN-HTTP] K. Holtman, A. Mutz, "Transparent content negotiation in
   HTTP", Work in Progress,
   [PEP] R. Khare, "HTTP/1.2 Extension Protocol (PEP)", Work in Progress,
   [HTML 2.0] T. Berners-Lee, D. Connolly, "HTML 2.0", RFC 1866,
   [HTML-3.2] D. Raggett, "HTML 3.2 Reference Specification", W3C
   Recommendation, http://www.w3.org/pub/WWW/TR/REC-html32.html
   [I-HTML] F. Yergeau, G. Nicol, G. Adams, M. Duerst,
   "Internationalization of the Hypertext Markup Language", RFC 2070,
   [STYLE] B. Bos, D. Raggett, H. Lie, "HTML3 and Style Sheets" Work in
   Progress, http://www.w3.org/pub/WWW/TR/WD-style
   [CSS] H. Lie, B. Bos, "Cascading Style Sheets Level 1" W3C
   Recommendation, http://www.w3.org/pub/WWW/TR/WD-css1
Author Address

Received on Tuesday, 4 March 1997 13:00:47 UTC
