- From: Hans van Mourik <MOURIK@rullet.LeidenUniv.nl>
- Date: Tue, 23 Jul 1996 20:09:13 +0100 (MET)
- To: www-international@w3.org
Hello to you internationalisationisers,

I would like to know how the HTML LANG attribute should be linked to a particular character set. In fact, what I'm looking for is an HTML equivalent of the TEI ``writing system declarations''. Are there any thoughts about such a thing?

I am aware that, as a newcomer to this list, I might be raising a point which has already been discussed. Well -- sorry, you may flame me if you want to.

What we (NHDA) would like to do is to serve documents containing *multiple languages*. We're not so much interested in serving a directory with multiple translations of the same instance. Consider a document containing French, German and Russian. HTML 3.2 offers us the possibility of marking divisions, paragraphs, <span>s and so on with lang="ru" | lang="fr" | lang="de". But then what?

Now suppose the HTTP charset parameter is set to some Russian character encoding (MS code page 1251, KOI8-R or ISO 8859-5 -- take your pick). What happens to entities like é and ö? Browsers like Navigator, Explorer and Mosaic will map them blindly to #233 and #246, and so they'll appear as arbitrary Russian characters. (How about Arena/Amaya? I haven't checked those.)

Do we have to publish in Unicode instead, then -- i.e. let most browsers just break and wait for the *perfect browser* to come along? I don't think so.

I would say the LANG attribute is very appropriate (amongst other things) for indicating a specific character mapping (i.e. "8 bits to Unicode"). I may be wrong, but I haven't seen very much about this attribute lately. I thought it actually appeared in earlier versions of the CSS draft. It doesn't any more.

So, how about some IDREF linking to make things work?

;;; Hans van Mourik
;;; mourik@rullet.leidenuniv.nl
;;; Netherlands Historical Data Archive
;;; PO Box 9515
;;; 2300 RA Leiden, The Netherlands
;;; (+31)-70-5272719
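
The entity problem raised in the message can be reproduced in a few lines. The following is only a minimal sketch in Python, under the assumption that a browser resolves an entity to the code values 233 or 246 and then displays that value as a single byte in the document's 8-bit charset. The names `render_blindly`, `LANG_TO_CHARSET` and `decode_span` are hypothetical and invented for this illustration; the language-to-charset table is likewise an assumption, standing in for the kind of LANG-based mapping the message asks about, not an existing HTML mechanism.

```python
# Codes the message says browsers assign to the two entities (#233, #246).
ENTITY_CODES = {"eacute": 233, "ouml": 246}

# Python codec names for the three Russian encodings named in the message.
RUSSIAN_CHARSETS = ["cp1251", "koi8_r", "iso8859_5"]


def render_blindly(code, charset):
    """Treat the entity's code as one byte in the document charset."""
    return bytes([code]).decode(charset)


def render_as_unicode(code):
    """Treat the entity's code as a Unicode code point instead."""
    return chr(code)


# Hypothetical table: one possible way a LANG value could select an
# "8 bits to Unicode" mapping. The choices here are assumptions.
LANG_TO_CHARSET = {"ru": "koi8_r", "fr": "latin_1", "de": "latin_1"}


def decode_span(raw_bytes, lang):
    """Decode the bytes of one marked-up span according to its LANG value."""
    return raw_bytes.decode(LANG_TO_CHARSET[lang])


if __name__ == "__main__":
    for name, code in ENTITY_CODES.items():
        print(f"&{name}; as Unicode: {render_as_unicode(code)}")
        for charset in RUSSIAN_CHARSETS:
            print(f"  as a byte in {charset}: {render_blindly(code, charset)}")
    # The same byte decodes differently depending on the span's language.
    print(decode_span(b"\xe9", "fr"), decode_span(b"\xe9", "ru"))
```

Run as a script, this prints each entity once as the intended Latin letter and once per Russian charset as a Cyrillic letter, which is the behaviour the message describes. The `decode_span` call shows how a per-span mapping keyed on LANG would let one byte value mean different characters in French and Russian passages of the same document.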
Received on Tuesday, 23 July 1996 16:00:17 UTC