- From: <toby_phipps@peoplesoft.com>
- Date: Fri, 21 Jul 2000 04:02:04 -0700
- To: nitin_goel@yahoo.com
- cc: www-international@w3.org
Hi Nitin,

We do just this across Solaris, HP-UX, Sequent, Compaq Tru64 and AIX. You really can't depend on the portability of wchar_t types, and will need to use your own type if you want a consistent character representation across all platforms.

The problem here is that you'll then need your own implementations of the standard C string functions and anything else that you expect to accept Unicode data. You'll also need stubs for any system calls that expect character arguments. Each stub either maps your real Unicode type to the OS's wchar_t implementation or converts it back to the OS's non-Unicode char type before calling the real function.

We solved the problem by licensing a portable Unicode library (Rosette from Basis Technology), and writing a "compatibility" library of 100 or so common string and system functions implemented via Rosette instead of via the standard C runtime library. We also needed our own 16-bit type, which we defined as an unsigned short and called WCHAR. One of the nice features of Rosette is that it came with a large set of prewritten C runtime library string functions, implemented with their code, that we could use as a base.

One other problem is string constants in your code. If you don't use the operating system's wchar_t implementation, your L"string" quoted literals won't match your Unicode type. We ended up writing a preprocessor that expanded L"string" into Unicode characters represented as \x<<byte1>> <<byte2>>. This was much more difficult than originally expected given the myriad ways quoted strings can be used (at variable initialization, in assignments, etc.), but it was possible. This preprocessor wrote out .i files, which were then passed to the real C++ preprocessor/compiler.

Good luck - it's a big job. Things would be much easier for cross-platform Unicode implementations if the C standards defined a common wchar_t type.

Toby.

--
Toby Phipps
PeopleSoft, Inc.
tphipps@peoplesoft.com
+1-925-694-9525

"Nitin Goel" <nitin_goel@yahoo.com>
Sent by: www-international-request@w3.org
07/20/2000 06:43 PM

To: www-international@w3.org
cc:
Subject: Portability of Unicode code !?

Hi everybody,

This is a desperate appeal to all you souls to help me out with a problem I face. I have a Unicode server which handles database files. I assumed that Unicode is 16-bit data (is that too bad an assumption?). Anyway, it so happens that while on NT and AIX wchar_t does translate to a 16-bit value, things are very different on SunOS and HP-UX! There wchar_t is defined as long and int respectively. And now I am stuck with a lot of code and database files.

Has anybody faced this problem before? Is there any input someone can give me regarding porting Unicode-enabled code? Can I work around this by getting a third-party Unicode library from somewhere and linking my code to it rather than to the system libraries on these platforms?

Any help/input on this would be extremely valuable.

Thank you,
Nitin

PS> Please mail me the responses as I am not subscribed to any of these mailing lists.
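For readers following the thread, here is a minimal sketch in C of the approach described in the reply: a fixed 16-bit character type, a couple of replacement string functions, and a helper that a system-call stub could use to widen data to the platform's wchar_t. Only the name WCHAR and its definition as unsigned short come from the message. The u16_* function names are hypothetical, this is not the Rosette API, and the code assumes unsigned short is 16 bits wide and that only BMP characters (no surrogate pairs) are in use.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* From the message: a 16-bit type defined as unsigned short, called WCHAR. */
typedef unsigned short WCHAR;

/* Compile-time check (assumption): fails to compile if WCHAR is not 2 bytes. */
typedef char wchar16_size_check[(sizeof(WCHAR) == 2) ? 1 : -1];

/* Hypothetical replacement for wcslen(): counts 16-bit code units. */
size_t u16_strlen(const WCHAR *s)
{
    const WCHAR *p = s;
    assert(s != NULL);
    while (*p != 0) {
        p++;
    }
    return (size_t)(p - s);
}

/* Hypothetical replacement for wcscmp(): compares code unit by code unit. */
int u16_strcmp(const WCHAR *a, const WCHAR *b)
{
    assert(a != NULL && b != NULL);
    while (*a != 0 && *a == *b) {
        a++;
        b++;
    }
    return (*a > *b) - (*a < *b);
}

/* Helper for a system-call stub: widens WCHAR data to the native wchar_t,
 * whatever its size on this platform. Copies at most dst_len - 1 units,
 * always terminates dst, and returns the number of units copied.
 * Assumption: BMP only; surrogate pairs are copied unit by unit, which is
 * not correct where wchar_t is 32 bits wide. */
size_t u16_to_native(wchar_t *dst, size_t dst_len, const WCHAR *src)
{
    size_t i = 0;
    assert(dst != NULL && src != NULL);
    if (dst_len == 0) {
        return 0;
    }
    while (i + 1 < dst_len && src[i] != 0) {
        dst[i] = (wchar_t)src[i];
        i++;
    }
    dst[i] = L'\0';
    return i;
}
```

Because L"..." literals produce wchar_t rather than WCHAR, a constant in this scheme has to be spelled out by hand or generated by a tool, for example as the array initializer { 'H', 'i', 0x4E2D, 0 }. That is the gap the literal-expanding preprocessor in the reply was written to fill. A stub for a system call would call u16_to_native on each string argument and then pass the resulting buffer to the real function.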
Received on Friday, 21 July 2000 07:02:21 UTC