- From: Gavin Nicol <gtn@ebt.com>
- Date: Mon, 5 Dec 1994 21:31:01 -0500
- To: fielding@avron.ICS.UCI.EDU
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com, html-wg@oclc.org
As Roy pointed out, one can already negotiate for different character encodings for HTML with something like the following from a client:

    Accept: text/html; charset=unicode_1_1_utf_7

However, very soon we will be getting SGML-aware browsers (and also browsers for other document formats). We could put a charset= parameter on each of these different MIME types, but I think we need to get a single HTTP field allocated for this.

In addition, the following are probably also needed:

1) Either UTF-7 or UTF-8, or both, strongly recommended by both the HTML and HTTP specs as the way to transmit multilingual documents.

2) A definition of "escape codes" to be used to indicate language and other such parameters as an aid to display. As I have said elsewhere, such tagging would probably happen automatically, and so would not be visible to end users.

I think we should look upon these as "enabling technology". They will not be used immediately (or at least not widely), but eventually, as Unicode systems (browsers in particular) become available, they will be increasingly important.

On top of this foundation, we can then build two libraries of great utility:

1) A library for converting between various character encodings and the tagged UTF.

2) A library for handling font display using Unicode. This is not exceptionally difficult.

With these, multilingual browsers become, while not trivial, at least not much more difficult than Roman-only ones.
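To make the first of the two libraries a little more concrete, here is a minimal sketch of the conversion step in Python, using the codecs that ship with its standard library. The function name `transcode` and the Shift_JIS sample are assumptions chosen for illustration only, and the sketch does not attempt the language tagging ("escape codes") described above, since no format for that has been defined.

```python
def transcode(data: bytes, source_encoding: str, target_encoding: str = "utf-8") -> bytes:
    """Decode bytes from a legacy encoding and re-encode them as a UTF.

    Hypothetical helper: both encoding names must be codecs known to Python.
    """
    text = data.decode(source_encoding)   # legacy bytes -> Unicode text
    return text.encode(target_encoding)   # Unicode text -> UTF bytes


# Sample document body in a legacy encoding (Shift_JIS is an assumed example).
legacy = "日本語".encode("shift_jis")

as_utf8 = transcode(legacy, "shift_jis", "utf-8")
as_utf7 = transcode(legacy, "shift_jis", "utf-7")
```

Going through Unicode text as the pivot means each legacy encoding needs only one decoder and one encoder, rather than a converter for every pair of encodings.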
Received on Monday, 5 December 1994 18:30:06 UTC