- From: Rick Jelliffe <ricko@allette.com.au>
- Date: Tue, 15 Apr 1997 22:37:14 +1000
- To: Gavin Nicol <gtn@eps.inso.com>
- CC: w3c-sgml-wg@w3.org
(If this is off topic, please forgive me.)

Gavin Nicol wrote:

> Some Unix systems define wchar_t to be 32 bits.

BYTE magazine in March said that some Unix systems use 8 bits for wchar_t (surely this is wrong!). Anyway, the fact that other standards-making bodies may have failed to define adequately what they mean by a wide character should surely make us want to be more precise.

As a side issue, but still on the subject of the desirability of Unicode, here is part of a posting today to a mail group on Asian documents:

-------------------------

Christian Wittern writes:

You might be aware of this, but it is completely new to me. I recently had a look at Word 97. I played with late beta versions of the English and Chinese programs on English, Japanese and Chinese Windows 95 as well as German Win NT 4.0.

The internal file format of Word 97 is now Unicode. There seems to be no difference in file formats between Western and East Asian versions. Now, finally, it is no longer necessary to keep copies of the program in two, three or four different languages and operating systems just to accommodate the need to process multiple Asian languages. It is now possible to create a file, say, on Japanese Windows in English Word 97 and open it in Chinese Windows and Chinese Word 97, and the characters will display correctly without the need for any conversion! (Since the standard font names are different, there might be a need to set up the font mapping from the Japanese to the Chinese fonts; this is done once and for all.)

Although the file format is Unicode, the CJK glyphs are still seen through the limitations of the national encodings. This means that Word 97 running on Chinese Windows will only display characters from Big5, and Japanese Word only those from JIS; characters outside these ranges are displayed as a 'missing glyph' question mark, but are not in any way distorted, deleted or mixed up.
This is still true even if the font used contains all the 20000+ CJK glyphs from the Unicode standard, like for example Bitstream's Cyberbit (free download from http://www.bitstream.com).

This situation only changes when Word 97 is running on NT: with the proper font installed, it will happily display all CJK glyphs, thus finally, after *years* of pain, making it possible to mix Asian and European text, with Sanskrit and whatever, in one document. To some extent, this is even possible in English Windows 95, where it is possible to install the fonts from the Internet Explorer Language Pack (www.microsoft.com and all over the planet, search for ie3lpktw.exe, ie3lpkcn.exe and so forth).

Even the English version of Word 97, running in a CJK environment, is now smart enough to register a switch of the keyboard from alphabetical to ideographic input and will adjust the font accordingly.

It seems the day has finally come when we can switch our text processing to Unicode and forget about things like Big5, JIS, KSC and the like, to concentrate on the work we originally planned to do with the help of a computer.

---- someone then quibbled about input methods ... -----

Christian Wittern writes:

Well, maybe I did not express myself very well. What I wanted to say is, due to some "feature" in Far East Windows 95, Asian fonts can only be installed with one Asian language flag: Japanese OR Chinese (Big5) OR ... That is, even a font like Bitstream's Cyberbit, which contains *all* 20000+ CJK ideographs, will have to be installed as Japanese or Taiwanese. This will cause the OS and Word 97 to filter out those characters that do not belong to the specified language. So although the encoding is unified to CJK, you still need extra fonts for the East Asian regions, and some areas, like JIS 212, are still not available.

Of course all this applies to Win 95 in the different Far East versions; Win NT 4.0 will happily display all the Kanji you might ask for.
--------

(B.t.w., by "all the kanji you can ask for", Christian means "all the kanji that are in Unicode". He has been working on a project at a Kyoto university with over 48 000 kanji, so he is well aware of the need for ISO 10646 to be extended!)

Rick Jelliffe
Received on Tuesday, 15 April 1997 08:47:32 UTC