- From: Martin J Duerst <mduerst@ifi.unizh.ch>
- Date: Fri, 24 May 1996 14:20:18 +0200 (MET DST)
- To: boo@best.com (Walter Ian Kaye)
- Cc: www-font@w3.org
Walter Ian Kaye wrote:

>At 4:56p -0400 05/23/96, Rob Migliore wrote:
>
>>We provide data in several languages such as english, polish, and russian
>>and install the necessary fonts on our clients' systems (running ns
>>2.02). We would like to encode our documents of another language in a
>>way that changes the font or the character set automatically without
>>having to switch to options, etc. Actually, I believe that we would
>>want to change the character set. Can anyone comment on this?

Definitely the "charset", which is a well-used but very inappropriate
name in MIME to denote encodings of characters. Please don't use fonts
to switch between different encodings. It might work in some cases,
but it will give you big headaches in the future.

>>Some of these languages that we would be supporting are not supported
>>directly by ns2.02, they would have to be user defined. Is it still
>>possible to automate the switching of the fonts/character sets?
>
>>I've seen the META tag around as follows - does it work?
>
>>   <META HTTP-EQUIV="Content-Type" CONTENT="Text/Html; Charset=iso2022-jp">

Yes, this is correct, and it works on Netscape (and a few other not so
well-known browsers). Ideally, this information is sent in the HTTP
header, and not as part of the document, because there are "charset"s
in which it is impossible to recognize that tag inside the document.

>First off, there is a difference between "language" and "character set".
>
>There are charsets *associated* with languages, but that's as far as it
>goes.
>
>For example, Netscape supports "charset=iso-8859-2" and
>"charset=x-mac-ce" for Central European languages. Note
>that this covers more than one language.

It may be a nice help to the user that Netscape supports x-mac-ce, but
please help everywhere to do what fortunately has worked for Western
Europe, namely that only one single encoding (i.e. "charset") is used.
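To illustrate the point above about sending the "charset" in the HTTP
header rather than in a META tag, here is a small Python sketch. The
helper names (content_type_header, meta_visible_as_ascii) and the sample
text are made up for illustration, and utf-16 is only my assumption of
an encoding in which the tag cannot be found by a reader that expects
ASCII-compatible bytes.

```python
# Hypothetical helpers, for illustration only.
HTML = (
    '<META HTTP-EQUIV="Content-Type" '
    'CONTENT="text/html; charset={name}">\n<P>{body}</P>\n'
)


def content_type_header(charset: str) -> str:
    """Build the HTTP header line that carries the "charset" label."""
    return f"Content-Type: text/html; charset={charset}"


def meta_visible_as_ascii(charset: str, body: str) -> bool:
    """Encode the page and check whether the META tag still appears
    as plain ASCII bytes, i.e. whether a browser that has not yet
    learned the encoding could find it by scanning the bytes."""
    raw = HTML.format(name=charset, body=body).encode(charset)
    return b"<META" in raw


if __name__ == "__main__":
    sample = "Zażółć gęślą jaźń"  # Polish sample text
    print(content_type_header("iso-8859-2"))
    print("iso-8859-2:", meta_visible_as_ascii("iso-8859-2", sample))
    print("utf-16:    ", meta_visible_as_ascii("utf-16", sample))
```

With an ASCII-compatible encoding such as iso-8859-2 the tag survives as
readable bytes; with a two-byte encoding it does not, which is why the
label is safer in the header, where it arrives before the document.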
Different platforms currently still use different local encodings, but
for the web, it does not help to just use your local "charset", or the
one you think might be most popular on target machines. The only good
solution is to stick to very few widely usable sets. For Central Europe,
this clearly would be iso-8859-2, as it is iso-8859-1 for Western Europe
(you don't have to indicate this, as it is the default). Ideally, in the
future, there will be even fewer "charset"s, if everybody is moving
towards Unicode.

>You have to determine whether the characters used in Polish are covered
>by one of the supported charsets.

They should be supported by both of them. But please use iso-8859-2
for wide compatibility.

>Unfortunately, Netscape only supports the character sets it has
>defined:
>
>"us-ascii",
>"iso-8859-1", "x-mac-roman", "iso-8859-2", "x-mac-ce",
>   "iso-2022-jp","x-sjis", "x-euc-jp",
>   "euc-kr", "iso-2022-kr",
>   "gb2312", "gb_2312-80"
>   "x-euc-tw", "x-cns11643-1", "x-cns11643-2", "big5"
>
>There's no way to use any others and have Netscape take advantage of
>it. So even if you found the name of a Russian character set (such as
>iso-8859-5, which I found in rfc1345), it wouldn't do you any good
>because Netscape would just ignore it.

The number of "charset"s that Netscape is supporting is increasing with
every version, maybe more than necessary. But definitely, they should
add iso-8859-5, and if you can convince them that you have a wide enough
market, I guess they might even make a special version for you. While it
is very difficult for an outsider to add an encoding, it is rather easy
for them, especially if the underlying OS already supports it, which I
assume is the case for your users.

><<--wants to know the charset names for MS Windows and Unix...

For the web, using "charset"s proprietary to some system is the wrong
thing to do. Please use only widely accepted and standardized
"charset"s, which means mainly the iso-8859-x series and some others.
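As a rough way to check the coverage question above (whether the
characters of a language fit a given "charset"), one can simply try to
encode a sample. This is a sketch, not a complete test: the function
name can_encode and the sample strings are my own, and "mac_latin2" is
the name Python's codec library uses for the Macintosh Central European
encoding, which I am assuming corresponds to Netscape's "x-mac-ce".

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Sample strings chosen for illustration; a short sample cannot prove
# full coverage of a language, only reveal missing characters.
SAMPLES = {
    "polish": "Zażółć gęślą jaźń",
    "russian": "Привет, мир",
}
CHARSETS = ["iso-8859-1", "iso-8859-2", "mac_latin2", "iso-8859-5"]

if __name__ == "__main__":
    for language, text in SAMPLES.items():
        for charset in CHARSETS:
            print(f"{language:8} {charset:11} {can_encode(text, charset)}")

    # Both Central European encodings cover Polish, but the bytes on
    # the wire differ, so the label and the data must agree.
    print("ą".encode("iso-8859-2"), "ą".encode("mac_latin2"))
```

The last line shows why it matters to agree on one encoding per region:
two "charset"s can cover the same letters and still be mutually
unreadable without the right label.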
Regards, Martin.
Received on Friday, 24 May 1996 08:21:51 UTC