- From: Allan Chau <achau@rsasecurity.com>
- Date: Wed, 07 Mar 2001 12:48:06 -0800
- To: www-international@w3.org
- Message-ID: <3AA69E86.49E2CC5A@rsasecurity.com>
We have an English-only product that uses single-byte character strings throughout the code. For our next release, we'd like to internationalize it (Unicode) and be able to store data in UTF-8 format (a requirement for data exchange). We're deciding between using UTF-8 within the code and changing our code to use wide characters. I'm wondering what experiences others have had that could help with our decision.

I'm thinking that using UTF-8 internally may mean less rewriting initially, but we'd have to check carefully for code that makes assumptions about character boundaries. Because of this, I think it would be more complicated for developers to work with UTF-8 in code.

Unicode (UTF-16) internally would be easier to manage since most characters will essentially be fixed width, but there'd be a lot of code to rewrite. Also, I've heard of problems with the wide character type (wchar_t) having different definitions depending on platform (we're running on NT and Sun Solaris). Many of our product APIs would also be affected.

Can others offer their insights or suggestions?

Thanks,
-allan
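
To make the character-boundary concern concrete, here is a minimal sketch in C of the kind of check that byte-oriented code would need once strings hold UTF-8. The function names (utf8_count_code_points, utf8_safe_prefix_len) are hypothetical and not from any library, and the sketch assumes the input is already valid UTF-8. It relies only on the fact that UTF-8 continuation bytes have the bit pattern 10xxxxxx.

```c
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
static int utf8_is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical helper: count code points in a NUL-terminated string.
 * Assumes valid UTF-8. Every byte that is not a continuation byte
 * starts a new code point, so strlen() and this count differ as soon
 * as the text contains non-ASCII characters. */
size_t utf8_count_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (!utf8_is_continuation((unsigned char)*s)) {
            ++n;
        }
    }
    return n;
}

/* Hypothetical helper: return the largest byte length <= max_bytes at
 * which the first len bytes of s can be cut without splitting a
 * multi-byte sequence. Assumes valid UTF-8. */
size_t utf8_safe_prefix_len(const char *s, size_t len, size_t max_bytes)
{
    size_t cut = max_bytes;
    if (cut >= len) {
        return len;            /* whole string fits, nothing to split */
    }
    /* If the byte at the cut point is a continuation byte, the cut
     * falls inside a sequence, so back up to that sequence's lead byte. */
    while (cut > 0 && utf8_is_continuation((unsigned char)s[cut])) {
        --cut;
    }
    return cut;
}
```

For the 6-byte string "h\xC3\xA9llo" (the word "héllo"), the count is 5 code points, and asking for a 2-byte prefix yields 1, because byte 2 sits in the middle of the two-byte sequence for "é". Code that truncates, indexes, or sizes buffers by byte offset is where this kind of adjustment would be needed.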
Received on Wednesday, 7 March 2001 15:48:10 UTC