- From: <bruce.wallman@us.pwcglobal.com>
- Date: Fri, 12 Apr 2002 10:42:16 -0400
- To: www-international-request@w3.org
- Cc: www-international@w3.org
This is not a futures question, so I am only concerned with character sets currently represented by IE. Given that constraint, are there Unicode characters 32767 and below that will arrive at the server as &#12345; and need a different translation? Obviously, anything 9999 and below would be here. Are there really any that I will see in the &#199999; range?

Does anyone know enough about Korean to tell me what combination of characters to hit on the Korean keyboard to test values below 32768 or above 99999?

Does anyone know of a table somewhere (it would be big) that shows the translation of HTTP number values to ideographs?

Regards
_____________

Hi Bruce,

>Will what I am doing work generally for all complex DBCS ideographs? Is it
>in any way 'Korean' dependent? Are there other complex DBCS patterns that I
>have not seen that require a different algorithm (for example, will I see
>some numbers that are 4 or 6 digits rather than 5 or will I see some
>numbers for which I should not subtract 65536, etc.)?

I guess the reason subtracting 65536 works well for Korean is that the VB function expects the input wide-character code point to be a signed short. If my guess is correct, the same logic should work for any character above 32767 in Unicode, not only Korean.

Of course, an NCR can have 4 or 6 digits, not only 5, although most Hangul and Asian ideograph characters fall in the 5-digit range.

In addition, many characters are newly defined outside the Unicode BMP. For example, characters added in the new Chinese standard GB18030 have code points beyond 65536. These "beyond BMP" characters can be represented by NCRs of up to 7 decimal digits, although the current MS IE might not generate such NCRs when submitting an HTML form data set. (I think MS IE generates a pair of illegal NCRs based on the Unicode high/low surrogates in this case.)
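To make the arithmetic above concrete, here is a minimal sketch in Python (not the VB code under discussion). The function names are hypothetical, and it assumes the form data contains decimal NCRs of the form &#NNNNN;. It shows why subtracting 65536 yields the signed 16-bit value for BMP code points above 32767, and how a pair of surrogate NCRs could be joined back into one "beyond BMP" character.

```python
import re

# Decimal numeric character reference, e.g. "&#54620;"
_NCR = re.compile(r"&#(\d+);")


def as_signed_short(code_point: int) -> int:
    """Return the signed 16-bit value with the same bit pattern as a BMP code point."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("only BMP code points (0..65535) fit in 16 bits")
    # Values above 32767 wrap around to negative numbers in a signed short.
    return code_point - 65536 if code_point > 32767 else code_point


def join_surrogates(text: str) -> str:
    """Combine each high/low surrogate pair into a single supplementary character."""
    out = []
    i = 0
    while i < len(text):
        high = ord(text[i])
        if 0xD800 <= high <= 0xDBFF and i + 1 < len(text):
            low = ord(text[i + 1])
            if 0xDC00 <= low <= 0xDFFF:
                combined = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
                out.append(chr(combined))
                i += 2
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


def decode_ncrs(text: str) -> str:
    """Replace decimal NCRs with characters, then repair surrogate-pair NCRs."""
    def replace(match: re.Match) -> str:
        value = int(match.group(1))
        # Leave out-of-range numbers untouched rather than guessing.
        return chr(value) if value <= 0x10FFFF else match.group(0)

    return join_surrogates(_NCR.sub(replace, text))
```

For example, the Hangul syllable U+D55C is 54620 in decimal, so as_signed_short(54620) gives -10916, while decode_ncrs("&#55360;&#56320;") and decode_ncrs("&#131072;") both give the single character U+20000.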
-Yoshito Umaoka
Received on Friday, 12 April 2002 10:41:57 UTC