- From: Martin Duerst <duerst@w3.org>
- Date: Fri, 07 Sep 2001 08:14:00 +0900
- To: <vinod@filemaker.com>, "Lenny Turetsky" <LTuretsky@salesforce.com>, "W3intl (E-mail)" <www-international@w3.org>
At 10:40 01/09/06 -0700, Vinod Balakrishnan wrote:

> >For all these, it's not too difficult. Shift-JIS uses bytes in
> >the 0x80-0x9F range, and has specific patterns. If there are
> >only very few characters outside us-ascii, it may not work,
> >but with more non-us-ascii characters, the probability
> >of success is going up very quickly.
>
>Shift-JIS represent the trailing bytes of double byte in 0x40-0xF0 range (
>only the leading byte is in high ASCII range ) . Also the Hankaku (single
>byte) Kana is represented as single byte in the high ASCII range. A Japanese
>text in Shift-JIS contains single byte kana and Kanji characters can be
>misinterpreted as Latin-1

Yes. Shift_JIS text can contain bytes in the 0x80-0x9F range, but
Latin-1 text does not (ISO-8859-1 assigns only C1 control codes there,
no printable characters). If there is such a byte, Shift_JIS cannot be
misinterpreted as Latin-1. It may be misinterpreted as windows-1252,
but that's a different story.

Regards, Martin.
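
The byte-range argument above can be sketched in a few lines of Python. This is only an illustration, not a full charset detector: the function name rules_out_latin1 is made up for this example, and it assumes that Latin-1 text never contains C1 control characters. It checks whether any byte falls in 0x80-0x9F. The samples show the three cases from the thread: kanji text trips the check, text consisting only of single-byte (hankaku) kana does not, and a windows-1252 byte trips it as well, so the check alone cannot tell Shift_JIS from windows-1252.

```python
def rules_out_latin1(data: bytes) -> bool:
    """Return True if data contains a byte in 0x80-0x9F.

    Assumption for this sketch: Latin-1 text has no C1 control
    characters, so any such byte means the data is not Latin-1.
    """
    return any(0x80 <= b <= 0x9F for b in data)


# Kanji in Shift_JIS: lead bytes such as 0x93 fall in 0x80-0x9F.
kanji = "日本語".encode("shift_jis")

# Three half-width kana as raw Shift_JIS bytes (0xA1-0xDF range);
# none of them is in 0x80-0x9F, so Latin-1 stays possible.
hankaku_kana = b"\xb1\xb2\xb3"

# Plain Latin-1 text: 0xE9 is outside 0x80-0x9F.
latin1 = "café".encode("latin-1")

# windows-1252 puts printable characters in 0x80-0x9F (the euro sign
# is 0x80), so the check fires here too.
cp1252 = "€".encode("cp1252")

for label, sample in [("kanji", kanji), ("hankaku kana", hankaku_kana),
                      ("latin-1", latin1), ("windows-1252", cp1252)]:
    print(label, rules_out_latin1(sample))
```

A real detector would go on to validate lead/trail byte sequences; this sketch covers only the single observation about the 0x80-0x9F range.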
Received on Thursday, 6 September 2001 21:50:36 UTC