- From: Martin J Duerst <mduerst@ifi.unizh.ch>
- Date: Wed, 24 Jul 1996 11:16:21 +0200 (MET DST)
- To: Albert-Lunde@nwu.edu (Albert Lunde)
- Cc: MOURIK@rullet.LeidenUniv.nl, www-international@w3.org
Albert Lunde wrote:

>At 8:09 PM 7/23/96, Hans van Mourik wrote:
>>Hello to you internationalisationisers,
>>
>>I would like to know how the HTML LANG-attribute should be linked
>>up to a particular character-set. In fact what I'm looking for is an
>>HTML-equivalent for the TEI ``writing system declarations''.
>>Are there any thoughts about such a thing?
>
>It is my impression that the intention of the various HTML and HTTP drafts
>that have addressed this is that "language" and character encoding (a.k.a.
>MIME charset) are, so to speak, "independent variables". In the general
>case, neither determines the other. There are different HTTP headers for
>charset and language.

Exactly. In addition, LANG can be changed inside the document
to make multilingual documents possible. The MIME "charset", by
contrast, is the same for the whole document, which poses no problem.

>It's been a while since I read the TEI documents.
>
>Taking a look at them, it appears that the "writing system declaration"
>specifies:
>(1) the language
>(2) the writing system (script, alphabet, syllabary) used to write the language
>(3) the coded character set, entity names, or transliteration scheme used
>to represent the graphic characters of the writing system.
>
>There is stuff defined in HTML and HTTP specs that addresses (1) and (3)
>independently, but not much is said about (2) or the combination of the
>three together.
>
>Perhaps someone wiser than me about the TEI can say more.

Not that I am wiser about TEI, but I now understand a little
about the "writing system declaration". (2) is not a problem at all:
which script the text is in is evident from the ISO 10646
characters themselves.

As for (3), TEI has a much different range of applications
than HTML. For example, transliterations may be very important
because of the possibility of including texts that have already been
input with some transliteration method.

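To illustrate the remark on (2), that the script can be read off the ISO 10646 characters themselves, here is a minimal sketch in Python. It is only a rough heuristic and not part of any HTML or TEI specification: it assumes that the first word of a character's Unicode name (for example "GREEK" or "CYRILLIC") is a usable stand-in for its script, and the function name guess_scripts is made up for this example.

```python
import unicodedata
from collections import Counter


def guess_scripts(text: str) -> Counter:
    """Count letters per script.

    Assumption: the first word of the Unicode character name
    (LATIN, GREEK, CYRILLIC, CJK, ...) approximates the script.
    This is a heuristic, not the formal Unicode Script property.
    """
    counts = Counter()
    for ch in text:
        # Skip digits, punctuation, spaces and combining marks.
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    return counts


if __name__ == "__main__":
    # Mixed Latin and Greek text, with no LANG information at all.
    print(guess_scripts("Leiden \u039b\u03b5\u03b9\u03b4\u03b5\u03bd"))
```

Note that this recovers only the script, not the language: Latin letters alone do not say whether a text is Dutch or English, which is why LANG remains a separate piece of information.
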
In the context of the WWW, transliteration was also discussed,
but mainly as a way of helping a user who knows the language
but not the script. This is largely a client-side issue.

As for character encoding in general, HTML relies on HTTP and
MIME in this respect, which is very reasonable for an internet
application, whereas TEI had to or wanted to define its own stuff.

Regards,	Martin.
Received on Wednesday, 24 July 1996 05:16:33 UTC