- From: M.T. Carrasco Benitez <carrasco@innet.lu>
- Date: Tue, 4 Mar 1997 19:04:23 +0100 (MET)
- To: WInter <www-international@w3.org>
I wrote a preliminary doc on "HTML language marking" (previously
"language label"). Comments?
It is also in HTML at
http://www.crpht.lu/~carrasco/winter/lama.html
Regards
Tomas
----------------------------------------------------------------------------
PRELIMINARY DOC - NOT A DRAFT
{NOT}INTERNET-DRAFT M.T. Carrasco Benitez
Category: Informational 3 March 1997
Expires 3 Sep. 1997
HTML language marking
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress".
To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Distribution of this document is unlimited. Please send comments to
the WInter mailing list at <www-international@w3.org>. Information
about the WInter mailing list, including subscription details, is at
http://www.w3.org/pub/WWW/International/O-misc-mlists.html
This memo does not specify an Internet standard of any kind. It is
intended to be informational.
Abstract
This document discusses the marking of natural language in HTML
documents and its relation with HTTP. It should be read together with
Internationalization of the Hypertext Markup Language [I-HTML].
Table of Contents
{later}
Example
The general look would be:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML LANG=fr>
<HEAD>
<TITLE>Mon doc français</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html;
charset=iso-8859-1">
</HEAD>
<BODY>
Je suis un Berlinois.
</BODY>
</HTML>
Language(s) of a document
This is defined in a similar way to traditional publication on paper:
Monolingual document
When the bulk of the document is in one language.
Multilingual document
When the bulk of the document is not in one language. For
example, a bilingual French and English document.
Behaviour
<HTML LANG=xx> indicates the language of the whole document with a
two-letter ISO 639 code. This is a declaration that the document is
monolingual.
The language indicated by <HTML LANG ...> should be included in the
Content-Language field of the HTTP header. If a <META
HTTP-EQUIV="Content-Language" Content="yy, zz"> is also present in the
document, Content-Language should contain the aggregation of the
languages in <HTML LANG ...> and <META HTTP-EQUIV ...>. For example,
with the fragments above, Content-Language should contain xx, yy, zz.
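The aggregation rule above can be sketched in a few lines. This is only
an illustration, not part of the proposal: the names LanguageMarkScanner
and content_language are hypothetical, and the sketch assumes the server
has the document text at hand and uses Python's standard html.parser
module to read the marks.

```python
from html.parser import HTMLParser


class LanguageMarkScanner(HTMLParser):
    """Collect the <HTML LANG> value and any META Content-Language codes."""

    def __init__(self):
        super().__init__()
        self.html_lang = None   # value of <HTML LANG=...>, if present
        self.meta_langs = []    # codes from <META HTTP-EQUIV="Content-Language">

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.html_lang = attrs["lang"].strip()
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "content-language":
            content = attrs.get("content") or ""
            self.meta_langs += [c.strip() for c in content.split(",") if c.strip()]


def content_language(document):
    """Return the Content-Language value, or None if no language is declared.

    LANG attributes on tags other than HTML are deliberately ignored.
    """
    scanner = LanguageMarkScanner()
    scanner.feed(document)
    scanner.close()
    codes = []
    for code in [scanner.html_lang] + scanner.meta_langs:
        if code and code not in codes:  # keep order, drop exact duplicates
            codes.append(code)
    return ", ".join(codes) or None


doc = ('<HTML LANG=xx><HEAD>'
       '<META HTTP-EQUIV="Content-Language" Content="yy, zz">'
       '</HEAD><BODY>text</BODY></HTML>')
print(content_language(doc))  # xx, yy, zz
```

Putting the <HTML LANG ...> code first is a choice made for this sketch;
the text above only asks that all the codes be present.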
Multilingual documents should not include the LANG attribute in the
HTML tag (<HTML LANG ...>). Instead, each language should be marked
with the LANG attribute on the appropriate tag; for example,
<P LANG=fr>. The languages indicated on these other tags should not be
included in Content-Language. If the author of the document considers
that the amount of text in other languages is significant, the author
should indicate those languages in <META HTTP-EQUIV ...>.
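To make the multilingual case concrete, the sketch below scans a small
bilingual document and reports where the LANG attributes occur. The
sample document, the class LangAttributeScanner and the function
lang_marking are all hypothetical, chosen only for illustration; the
parsing relies on Python's standard html.parser module.

```python
from html.parser import HTMLParser

# Hypothetical bilingual sample: no LANG on <HTML>, languages marked per
# element, and META used to ask for both codes in the HTTP header.
SAMPLE = """<HTML>
<HEAD>
<TITLE>Bilingual notice</TITLE>
<META HTTP-EQUIV="Content-Language" Content="fr, en">
</HEAD>
<BODY>
<P LANG=fr>Bonjour tout le monde.
<P LANG=en>Hello everybody.
</BODY>
</HTML>"""


class LangAttributeScanner(HTMLParser):
    """Record where LANG attributes occur in a document."""

    def __init__(self):
        super().__init__()
        self.html_lang = None       # LANG on the HTML tag: monolingual declaration
        self.element_langs = set()  # LANG on any other tag: local marking only

    def handle_starttag(self, tag, attrs):
        lang = dict(attrs).get("lang")
        if not lang:
            return
        if tag == "html":
            self.html_lang = lang
        else:
            self.element_langs.add(lang)


def lang_marking(document):
    """Return (LANG of the HTML tag or None, sorted LANG codes of other tags)."""
    scanner = LangAttributeScanner()
    scanner.feed(document)
    scanner.close()
    return scanner.html_lang, sorted(scanner.element_langs)


html_lang, element_langs = lang_marking(SAMPLE)
print(html_lang)      # None
print(element_langs)  # ['en', 'fr']
```

Here the element-level codes are only reported; under the rule above they
reach Content-Language solely because the author also listed them in
<META HTTP-EQUIV ...>.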
<META HTTP-EQUIV ...> does not indicate the language of the document
or which portion of the document is in which language; that can be
indicated only by the LANG attribute. It is just an instruction to
include the language code(s) in the HTTP header. The document should
contain some text in the language(s) indicated, but it is not an
error if none of the languages indicated in <META HTTP-EQUIV ...> is
present in the document.
Maximization
Servers could keep the language information in a data structure of
their own, so that they do not have to parse each document for its
language on every request. This is considered part of the document
management system and is not discussed in this document.
Support
All this is hot air if it is not supported. Hence, this section will
list vendors and document producers that support, or intend to
support, these recommendations.
Acknowledgments
Albert Lunde
Bert Bos
Christine Stark
François Yergeau
Gavin Nicol
Larry Masinter
Martin Bryan
Martin Dürst
{incomplete}
Bibliography
[HTTP-1.1] R.T. Fielding, H. Frystyk Nielsen, and T. Berners-Lee,
"Hypertext Transfer Protocol -- HTTP/1.1", RFC 2068,
http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-v11-spec-07.txt
[TCN-HTTP] K. Holtman, A. Mutz, "Transparent content negotiation in
HTTP", Work in Progress,
http://gewis.win.tue.nl/~koen/conneg/draft-holtman-http-negotiation-03.html
[PEP] R. Khare, "HTTP/1.2 Extension Protocol (PEP)", Work in Progress,
http://www.w3.org/pub/WWW/TR/draft-ietf-http-pep-01.html
[HTML 2.0] T. Berners-Lee, D. Connolly, "HTML 2.0", RFC 1866,
http://www.ics.uci.edu/pub/ietf/html/rfc1866.txt
[HTML-3.2] D. Raggett, "HTML 3.2 Reference Specification", W3C
Recommendation, http://www.w3.org/pub/WWW/TR/REC-html32.html
[I-HTML] F. Yergeau, G. Nicol, G. Adams, M. Duerst,
"Internationalization of the Hypertext Markup Language", RFC 2070,
http://www.alis.com:8085/ietf/html/draft-ietf-html-i18n-05.en.html
[STYLE] B. Bos, D. Raggett, H. Lie, "HTML3 and Style Sheets", Work in
Progress, http://www.w3.org/pub/WWW/TR/WD-style
[CSS] H. Lie, B. Bos, "Cascading Style Sheets Level 1", W3C
Recommendation, http://www.w3.org/pub/WWW/TR/WD-css1
Author's Address
Manuel Tomas CARRASCO BENITEZ
carrasco@innet.lu
http://www.crpht.lu/~carrasco/winter
Received on Tuesday, 4 March 1997 13:00:47 UTC