RE: Problem processing Shift-JIS

At 12:42 PM -0700 9/14/01, Rick Cameron wrote:
>I've just submitted this problem at the SourceForge site.

>I believe the code in tidy.c (revision 1.35) that reads Shift-JIS characters
>has a problem. At line 1040 there appears to be the assumption that any byte
>value greater than 127 is a lead byte (i.e. the first byte of a two-byte
>character). I believe this is true of Big5 - but it is certainly not true of
>Shift-JIS. In the diagram at
>http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnintl/html
>/S24CF.asp?frame=true you can see that the values from 0xa1 through 0xdf
>represent single-byte characters.
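
For reference, here is a minimal sketch in C of the lead-byte test the report
implies. It assumes the usual Shift-JIS layout: lead bytes are 0x81-0x9F and
0xE0-0xFC, while 0xA1-0xDF are single-byte half-width katakana. The function
names are hypothetical and are not taken from tidy.c.

```c
#include <stddef.h>

/* Hypothetical helper, not a function from tidy.c.
 * Returns nonzero if c starts a two-byte Shift-JIS character.
 * Assumes the standard layout: lead bytes are 0x81-0x9F and 0xE0-0xFC.
 * Bytes 0xA1-0xDF (half-width katakana) and 0x00-0x7F are single-byte. */
static int sjis_is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical helper: counts characters in a Shift-JIS buffer.
 * A lead byte with no following byte (truncated input) and any byte
 * outside the defined ranges are each counted as one character. */
static size_t sjis_count_chars(const unsigned char *buf, size_t len)
{
    size_t i = 0, n = 0;

    while (i < len) {
        if (sjis_is_lead_byte(buf[i]) && i + 1 < len)
            i += 2;   /* two-byte character: skip lead and trail byte */
        else
            i += 1;   /* single-byte character */
        n++;
    }
    return n;
}
```

With a test of the form "c > 127", a byte such as 0xB1 would wrongly consume
the byte after it as a trail byte. The range check above treats it as a
complete character.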

Thanks for the feedback - the Big5 and Shift_JIS encoding support is
preliminary, is based on earlier code by Rick Jelliffe, and is by default
turned off in the current source (I'm not sure if Charlie turned it on for
the beta builds he released recently). I am currently looking at another bug
I discovered in the process of testing the Big5/Shift_JIS code changes. I'm
not the expert in this area, and I welcome your continued feedback.

Regards, Terry

Received on Saturday, 15 September 2001 02:59:43 UTC