- From: Daniel R. Kegel <dank@alumni.cco.caltech.edu>
- Date: Sun, 06 Feb 1994 21:36:46 -0800
- To: ietf-charsets@INNOSOFT.COM, insoft-l@cis.vutbr.cz, ISO10646@jhuvm.hcf.jhu.edu
I recently asked Mr. Asmus Freytag, a Microsoft employee who has been
active on INSOFT-L, about Microsoft's position on mixed
Chinese/Japanese/Korean text and Unicode in Windows NT.

My concern was that, since 16-bit Unicode doesn't encode language,
Windows-NT can't properly display mixed CJK language text.

Mr. Freytag pointed out that, although Microsoft is devoted to 16 bit
Unicode for Windows NT, and will not switch to a 32 bit encoding,
users can mix fonts in Rich Text Format documents to achieve proper
display.

An NT programmer at Caltech pointed out that fonts in NT can be tagged
with language, so language can (at least potentially) be deduced from
the font being used, and a font can be chosen that is appropriate for
a language. I hope this will be the case in practise.

This means that Windows-NT should be able to interoperate with the
32 bit option of ISO10646, with a little work; for example, a telnet
client or newsreader could be written that always shows mixed C/J/K
Han characters in the appropriate font for the language.

The full text of Mr. Freytag's remarks follows, at his request.

(I am still curious as to whether the 32 bit option of ISO10646 will
start out as Unicode plus two bits to indicate language, e.g.
    plane 00 = Unicode,
    plane 01 = Chinese subset of Unicode Han,
    plane 02 = Korean subset of Unicode Han,
    plane 03 = Japanese subset of Unicode Han.
I have not been able to join the ISO10646 mailing list yet.)

- Dan Kegel (dank@alumni.caltech.edu)

From dank
From: dank (Daniel R. Kegel)
Date: Sun, 30 Jan 1994 21:41:24 -0800
To: asmusf@microsoft.com
Subject: Windows NT and Unicode

Asmus,
in response to the question on INSOFT-L:
>| I have heard that while Unicode contains Kanji, it does so in a way
>| that is not acceptable to the Japanese market, and hence was not
>| approved by them in recent votes.
>| Does this mean a product that
>| supports Unicode alone will not be as acceptable as a product that
>| handles Japanese character sets using other encoding methods?
you wrote:
>[ If it can import and export user's documents in shift-jis,
>  it is just as good as shift-jis, so nobody should care that it's unicode. ]

This is true as far as it goes, but the primary objection to Unicode
seems to be that it doesn't provide for palatable display of mixed
Korean, Chinese and Japanese text in the same document. The Japanese
insist that different fonts be used for the different languages.

Is Windows NT going to be able to handle this sort of mixed language
document? And will it be able to do so with plain Unicode?

I'm afraid that this isn't possible, and that something has to be done
to extend Unicode to represent language. The Japanese hope to do this
by using 32-bit Unicode, but since Windows NT has chosen 16-bit
wchar_t, it won't be able to go this route.

Does this seem like a real problem to you and to Microsoft?
-Dan (dank@alumni.caltech.edu)

From asmusf@microsoft.com
From: Asmus Freytag <asmusf@microsoft.com>
To: dank@alumni.cco.caltech.edu
Date: Mon, 31 Jan 94 11:01:52 PST
Subject: RE: Windows NT and Unicode

No, this is NOT a real problem. We (MS or the vendors in Unicode) do
not think that 'plain text' solutions need that level of typographical
finesse.

If you have application areas where you would like to use the
'correct' font, use formatted text solutions, i.e. 'rich text', where
you carry the font information separately.

Unicode support (even in NT) is set up so that you can easily extend
today's rich text technologies to the use of many large Unicode
encoded fonts, e.g. one each for Korean, Chinese and Japanese. You
would then, just as you would select Times, Helv., etc., select the
Japanese font for the appropriate sections in your document (actually
not THE, but A, Japanese font, because at that level of finesse you
would want to be particular about which font is used).
A.

From dank@alumni.cco.caltech.edu
To: Asmus Freytag <asmusf@microsoft.com>
Date: Mon, 31 Jan 1994 21:38:26 -0800
From: "Daniel R. Kegel" <dank@alumni.cco.caltech.edu>

Mr. Freytag,
thanks for your quick response.
Do you mind if I summarize it to the net?
Thanks,
Dan

From asmusf@microsoft.com
From: Asmus Freytag <asmusf@microsoft.com>
To: dank@alumni.cco.caltech.edu
Date: Tue, 1 Feb 94 09:33:36 PST

Yes. Please send my comments out verbatim,
Thanks,
A.

From dank@alumni.cco.caltech.edu
To: Asmus Freytag <asmusf@microsoft.com>
Subject: Re: Windows NT and Unicode
Date: Tue, 01 Feb 1994 07:16:12 -0800
From: "Daniel R. Kegel" <dank@alumni.cco.caltech.edu>

Mr. Freytag,
One more thing: the Internet community appears to be very interested
in achieving what you call typographical finesse, but what Han users
call basic readability.

The only way this affects Windows-NT is that to convert RTF to the
coming 32 bit version of the ISO version of Unicode (for instance, to
send a document via 'plain text' FTP or Usenet News), interface
software will have to read the RTF, look at the fonts used, and decide
(for Han fonts) what language the font is for. Likewise, the software
will have to look at incoming 32-bit 'unicode' and pick a font
according to language for Han language text.

Does Windows-NT provide language information about its Han fonts,
i.e. can an RTF reader deduce the language of a Han font by asking
the operating system?

From dank
From: dank (Daniel R. Kegel)
Date: Sun, 6 Feb 1994 15:35:56 -0800
To: heathh@cco.caltech.edu
Subject: Windows/NT, Unicode, and the Internet

Hi Heath,
Recently I've been really interested in how foreign text should be
represented on the Internet (because I wanted to do the right thing in
my various whois servers and clients), and joined the appropriate
mailing list.
The answer appears to be (more or less) to use Unicode with a special
encoding that makes the lower 128 chars (standard ASCII) appear just
as they do now, and escapes all other chars in an efficient manner
such that the resulting strings look like normal 8 bit ASCII to dumb
software like filesystems and communications software.

The problem is in Asian languages, where it seems one needs to know
which language is being used in order to select the right font (they
are VERY picky about this over there in Han-land), and Unicode doesn't
allow for this. An 18 bit extended Unicode may be coming soon to
handle this, but Windows-NT uses plain old 16 bit Unicode.

Microsoft plans to stick with 16 bit Unicode; people who want to use
the right font in mixed Chinese/Japanese/Korean documents can bloody
well use RTF and select the right font themselves, is the official
line.

I'm trying to write a summary on the issue for the mailing list.
My question to you, o Windows NT expert, is: can you deduce the
language in use from the font?
Is there any info in NT that associates language(s) with a font?
That way, you could write an Internet news or mail client that
converted to or from 18 bit unicode on the fly.

Thanks for any info,
Dan K.

From foo@bar
Subject: Re: Windows/NT, Unicode, and the Internet
To: dank@alumni.cco.caltech.edu (Daniel R. Kegel)
Date: Sun, 6 Feb 1994 20:49:15 -0800 (PST)

Dan,

>
> I'm trying to write a summary on the issue for the mailing list.
> My question to you, o Windows NT expert, is: can you deduce the language in
> use from the font?
> Is there any info in NT that associates language(s) with a font?
> That way, you could write an Internet news or mail client that converted to
> or from 18 bit unicode on the fly.

Heh, Windows NT expert. I like that.

Anyway: I take it you are saying:
- You can pick a font from the 18-bit unicode.
- Once you pick that font, you want to know what language it is.
- Better yet, based on the language the user composed the message in,
  you want to know what language the font maps to to generate 18-bit
  unicode.

If I have that right, I may have an answer.

Fonts are executables with resources, much like resource DLLs.
Starting with NT, any resource can have a language (and sublanguage)
ID associated with it. The idea is to make it easy to have
multi-lingual dialog boxes, etc. Look up EnumResourceLanguages() in
api32wh.hlp.

Anyway, if you have fonts that support this, you're in business. If
not, I dunno.

Let me know if I understood your question correctly (as well as if
the solution sounds reasonable).

Later,
Heath

From dank@alumni.cco.caltech.edu
To: Asmus Freytag <asmusf@microsoft.com>
cc: "Daniel R. Kegel" <dank@alumni.cco.caltech.edu>
Subject: Re: Windows NT and Unicode
Date: Sun, 06 Feb 1994 20:57:37 -0800
From: "Daniel R. Kegel" <dank@alumni.cco.caltech.edu>

Mr. Freytag,
a local Windows NT programmer has informed me that fonts (like all
other resources in NT) can have languages associated with them.
That should make it possible to map from RTF to Unicode extended with
language ID's.

----- end ----
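
For readers who want something concrete, here is a minimal sketch of
the "Unicode plus two bits for language" idea speculated about in the
summary at the top of this message. It is not anything ISO10646 or
Windows NT is known to do. Assumptions, all hypothetical: the plane
numbers (01 = Chinese, 02 = Korean, 03 = Japanese) are taken from that
speculation; "Han" is approximated by the range U+4E00..U+9FFF; and
(language, text) pairs stand in for RTF runs whose language was
deduced from the font. The names tag_runs and untag are made up for
illustration.

```python
# Hypothetical plane numbers, following the speculation in the summary.
PLANE_FOR_LANGUAGE = {"zh": 1, "ko": 2, "ja": 3}
LANGUAGE_FOR_PLANE = {p: lang for lang, p in PLANE_FOR_LANGUAGE.items()}

# Assumption: approximate "Han" by the CJK Unified Ideographs range.
HAN_FIRST, HAN_LAST = 0x4E00, 0x9FFF


def is_han(unit):
    """True if a 16-bit code value falls in the assumed Han range."""
    return HAN_FIRST <= unit <= HAN_LAST


def tag_runs(runs):
    """Turn (language, text) runs into a list of language-tagged values.

    Han characters get the plane for their run's language in the bits
    above bit 15; everything else stays in plane 00 unchanged.
    """
    out = []
    for language, text in runs:
        plane = PLANE_FOR_LANGUAGE.get(language, 0)
        for ch in text:
            unit = ord(ch)
            if unit > 0xFFFF:
                raise ValueError("sketch handles 16-bit characters only")
            out.append((plane << 16) | unit if is_han(unit) else unit)
    return out


def untag(values):
    """Split tagged values back into (language, text) runs.

    The language is None for untagged (plane 00) characters, so a
    display program would fall back to its default font for those.
    """
    runs = []
    for value in values:
        plane, unit = value >> 16, value & 0xFFFF
        if plane and plane not in LANGUAGE_FOR_PLANE:
            raise ValueError("unknown language plane: %d" % plane)
        language = LANGUAGE_FOR_PLANE.get(plane)
        if runs and runs[-1][0] == language:
            runs[-1] = (language, runs[-1][1] + chr(unit))
        else:
            runs.append((language, chr(unit)))
    return runs
```

A newsreader built along these lines would call untag() on incoming
text and choose a font per run, and call tag_runs() on outgoing text
after asking the system what language each font is for. Note that
only Han characters carry a tag in this sketch, so the language of
non-Han characters (kana, hangul, ASCII) does not survive a round
trip; that is harmless for font selection, since those characters are
not shared between the languages.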
Received on Sunday, 6 February 1994 21:37:40 UTC