Inputting Unicode form Browser from Stephen Toner on 2000-08-17 (www-international@w3.org from July to September 2000)

From: Stephen Toner <Stephen.Toner@virtualaccess.com>
Date: Fri, 18 Aug 2000 08:13:37 +0900
To: www-international@w3.org
Message-Id: <4.2.0.58.J.20000818081323.02d80880@sh.w3.mag.keio.ac.jp>

Hi,
I have been trying to input characters from various languages into a form 
in my browser.  I then want to store this text as Unicode in a database.  I 
have found that if I set the charset to one for a Western language, or if I 
leave it blank, ordinary ASCII characters are read in as ASCII and characters 
such as Japanese are converted to &#xxxxx; form.  Is this Unicode?
When I set the charset to "UTF-8", characters are changed into combinations 
of strange boxes and symbols.  I thought that this was maybe the Unicode 
for multibyte characters simply being displayed as their single 
bytes.  However, some of these then aren't displayed correctly on output.
I would appreciate any advice, as the articles that I have read seem to 
contradict each other: some say that &#xxxx; is the Unicode for that 
character and others say something else.  Also, most articles seem to ignore 
the inputting aspect.
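
To make the two cases above concrete, here is a minimal Python sketch of 
what I think may be happening.  It is only an illustration of the two 
encodings, not a statement of what any particular browser does internally; 
the function names and the sample text are made up for the example, and 
ISO-8859-1 is assumed as the "western" charset.

```python
# Illustrative sketch only: function names and sample text are hypothetical.

def to_numeric_references(text: str, charset: str = "iso-8859-1") -> str:
    """Encode text in `charset`; characters the charset cannot represent
    become decimal numeric character references such as &#26085;."""
    return text.encode(charset, errors="xmlcharrefreplace").decode(charset)


def utf8_shown_as_single_bytes(text: str, wrong_charset: str = "iso-8859-1") -> str:
    """Encode text as UTF-8, then misread every byte as one character
    of `wrong_charset` (the 'strange boxes and symbols' effect)."""
    return text.encode("utf-8").decode(wrong_charset)


# "abc " followed by two Japanese characters, U+65E5 and U+672C.
sample = "abc \u65e5\u672c"

# Case 1: western charset. ASCII passes through unchanged; the Japanese
# characters turn into references holding their Unicode code point numbers.
print(to_numeric_references(sample))

# Case 2: UTF-8 bytes viewed one byte at a time. Each Japanese character
# is three bytes in UTF-8, so it shows up as three unrelated symbols.
garbled = utf8_shown_as_single_bytes(sample)
print(repr(garbled))

# The garbling is reversible as long as no byte was lost along the way.
restored = garbled.encode("iso-8859-1").decode("utf-8")
print(restored == sample)
```

If this is right, the number inside &#xxxxx; is the Unicode code point of 
the character written in decimal, while the UTF-8 case is the same code 
point stored as several bytes and then displayed with the wrong charset.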

Thanks,
Stephen

Received on Thursday, 17 August 2000 23:28:55 UTC