RE: HTTP arriving at the server

From: Thierry Sourbier <ml@i18nGurus.com>
Date: Fri, 12 Apr 2002 17:23:05 +0200
To: <www-international@w3.org>
> Given that constraint, are there Unicode
> characters 32767 and below that will arrive at the server as #12345; and
> need a different translation?

Yes, some Korean characters use code points below 32767 (e.g. Hangul
Compatibility Jamo).
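
To make the ranges concrete, here is a small Python sketch that lists the Korean blocks and checks which of them fall entirely at or below a given limit. The block boundaries are taken from the Unicode code charts; the names HANGUL_BLOCKS and blocks_below are only illustrative.

```python
# Korean-related Unicode blocks as (first, last) code points.
# Boundaries are the block ranges from the Unicode code charts.
HANGUL_BLOCKS = {
    "Hangul Jamo": (0x1100, 0x11FF),
    "Hangul Compatibility Jamo": (0x3130, 0x318F),
    "Hangul Syllables": (0xAC00, 0xD7AF),
}


def blocks_below(limit):
    """Return the names of blocks whose whole range is <= limit."""
    return [name for name, (first, last) in HANGUL_BLOCKS.items() if last <= limit]


# Print each block in decimal, the form that shows up in &#NNNNN; references.
for name, (first, last) in HANGUL_BLOCKS.items():
    print(f"{name}: {first}-{last}")
```

With a limit of 32767, only the two Jamo blocks qualify; the precomposed Hangul syllables sit above that value.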

> Does anyone know of a table somewhere (it would be big) that
> shows the translation of HTTP number values to ideographs?

You are asking for the Unicode code charts :) You'll find them at
www.unicode.org (note that the charts list code points in hex, which is more
popular than decimal).
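
Since the charts are indexed in hex while the values reaching the server are decimal, a small converter may save some lookups. This is only a sketch using Python's standard library; describe_ncr is a made-up helper name, and it accepts both the decimal and the hex form of an NCR.

```python
import re
import unicodedata


def describe_ncr(ncr):
    """Decode an NCR such as '&#12593;' or '&#x3131;'.

    Returns (chart index as 'U+XXXX', the character, its Unicode name).
    """
    match = re.fullmatch(r"&#([xX][0-9A-Fa-f]+|[0-9]+);", ncr)
    if match is None:
        raise ValueError(f"not a numeric character reference: {ncr!r}")
    body = match.group(1)
    # A leading x/X marks the hexadecimal form; otherwise it is decimal.
    code_point = int(body[1:], 16) if body[0] in "xX" else int(body)
    char = chr(code_point)
    return f"U+{code_point:04X}", char, unicodedata.name(char, "<unnamed>")
```

For example, describe_ncr("&#12593;") gives the chart index U+3131, which is in the Hangul Compatibility Jamo block.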

> Does anyone know enough about Korean to tell me what combination of
> characters to hit on the Korean keyboard to test values below 32768 or
> above 99999?

I honestly know very little about Korean. Instead of relying on the IME to
enter Korean, you could use cut and paste: create a small HTML page that uses
the NCRs of the characters you want to type (e.g.
"<html><body>&#xxxxx;&#xxxxx;&#xxxxx;</body></html>"), open it in a browser
and copy the rendered characters into your form. Or simply enter the NCRs by
hand in your form; on the server side that should look exactly as if you had
typed the Korean characters.
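
The small test page described above can also be generated rather than typed by hand. A minimal sketch in Python, where ncr_test_page is a hypothetical helper and the two sample characters (U+3131 and U+D55C) are arbitrary choices, one below 32768 and one above:

```python
def ncr_test_page(text):
    """Build an ASCII-only HTML page that spells `text` with decimal NCRs."""
    ncrs = "".join(f"&#{ord(ch)};" for ch in text)
    return f"<html><body>{ncrs}</body></html>"


# One Hangul Compatibility Jamo letter and one precomposed Hangul syllable.
page = ncr_test_page("\u3131\ud55c")
print(page)  # <html><body>&#12593;&#54620;</body></html>
```

Saving the output to a file and opening it in a browser gives characters that can be copied into the form.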

You can also find sample pages on the web and do some cut and paste, e.g.

Also, if Yoshito Umaoka's guess is correct and your API is indeed limited
to a short, you'll be limited to values up to 65535. Characters above
that range are likely to be broken into a pair of surrogate characters.
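
For reference, the surrogate split follows the standard UTF-16 arithmetic, so you can predict which two values to expect. A sketch in Python (to_surrogates is an illustrative name, and whether your API actually does this split is still a guess):

```python
def to_surrogates(code_point):
    """Split a code point above 0xFFFF into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogates")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = to_surrogates(0x20000)
print(high, low)  # 55360 56320
```

So a character at U+20000 would show up as the two 16-bit values 55360 and 56320 (0xD840 and 0xDC00) instead of one value.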


www.i18nGurus.com - The Open Internationalization Resources Directory.
Received on Friday, 12 April 2002 11:24:44 UTC
