- From: Thierry Sourbier <webmaster@i18ngurus.com>
- Date: Fri, 19 Oct 2001 15:29:24 +0200
- To: <www-international@w3.org>
Sourav,

> The problem is setContentType works fine when the string you are
> printing out is an Unicode string.

Well, all strings are assumed to be Unicode in Java; that is a feature, and the big difference from a byte stream. If you have non-Unicode strings, it means you did not read them with the proper encoding, and the problem would be better solved there.

> Where as if content type is specified through meta tag what I found even
> the non unicode string is displayed properly. I don't know how it works.

Well, it is a case where two mistakes compensate for one another :). You are relying on the default encoding for both input and output when your data is obviously in a different encoding. This works only because your default encoding is likely a single-byte one with no invalid values (e.g. CP1252), so every byte passes through unchanged. Be aware, though, that you can't safely manipulate the string in your Java code: you may corrupt or lose data, because Java no longer knows what a character is (just try doing a character count...). As you already discovered when you tried to use the setContentType API, your code will also quickly become a maintenance nightmare.

In a multilingual environment, it is therefore best to specify the encoding used for every input and output, even if in some cases (like yours) it seems to work fine when you don't.

Regards,
Thierry
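
The "two mistakes compensate" effect described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the original poster's code: the class name EncodingDemo and its helper methods are hypothetical, the data is assumed to be UTF-8, and ISO-8859-1 stands in for the single-byte default encoding because it maps every byte value to a character.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
public class EncodingDemo {

    // Mistake 1: decode the bytes with a single-byte charset
    // (assumed stand-in for the platform default).
    static String decodeWrong(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Mistake 2: encode the output with the same single-byte charset.
    static byte[] encodeWrong(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // The fix: name the real encoding of the data (assumed UTF-8 here).
    static String decodeRight(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three Japanese characters, written as escapes so the source
        // file's own encoding does not matter.
        String original = "\u65e5\u672c\u8a9e";
        byte[] input = original.getBytes(StandardCharsets.UTF_8);

        String wrong = decodeWrong(input);
        String right = decodeRight(input);

        // The bytes survive the round trip, so a browser told
        // "charset=UTF-8" by a meta tag still renders the page correctly.
        System.out.println(Arrays.equals(input, encodeWrong(wrong))); // true

        // But inside Java the wrongly decoded string is one char per byte.
        System.out.println(wrong.length()); // 9
        System.out.println(right.length()); // 3
    }
}
```

The wrongly decoded string only looks fine as long as nothing touches it: a character count, a substring, or an output path that declares its charset explicitly (as setContentType does) exposes the mismatch.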
Received on Friday, 19 October 2001 09:55:16 UTC