Re: [whatwg/encoding] Shift_JIS decoder (#270)

> First, byte 0x5c (which is ASCII) must be changed to U+00a5; same as 0x7e to U+203E which seems to be missing from the spec. Both characters are marked as "Modified ASCII character" at https://en.wikipedia.org/wiki/Shift_JIS.

For consistency with the behavior of Windows code page 932, ASCII stays ASCII: 0x5C decodes to U+005C and 0x7E to U+007E. The Japanese fonts bundled with Windows have corresponding intentional glyph misassignments, such as a yen sign glyph for U+005C. Other fonts may not have these.
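
As a rough illustration, Python's `cp932` codec can stand in for Windows code page 932 here. That substitution is an assumption made for this sketch; it is not the decoder defined in the spec. It shows both bytes passing through as plain ASCII code points:

```python
# Assumption: Python's "cp932" codec approximates Windows code page 932.
data = bytes([0x5C, 0x7E])
decoded = data.decode("cp932")

# List the resulting code points in U+XXXX notation.
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
print(code_points)
```

Whether U+005C is then drawn as a backslash or as a yen sign depends only on the font.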

> But my main issue is with the bytes sequence 0x81 0x7C which according to https://encoding.spec.whatwg.org/index-jis0208.txt can be both decoded at either u+2211 or u+FF0C.

For the Shift_JIS sequence 0x81 0x7C and the EUC-JP sequence 0xA1 0xDD, the spec rather clearly says U+FF0D. How did you arrive at either U+2211 or U+FF0C?
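
A minimal sketch of the pointer arithmetic, written from my reading of the Shift_JIS and EUC-JP decoder steps in the spec. The function names are hypothetical, and lead-byte validation is left out for brevity. Both sequences should land on the same pointer into index jis0208:

```python
from typing import Optional


def shift_jis_pointer(lead: int, byte: int) -> Optional[int]:
    """Pointer into index jis0208 for a two-byte Shift_JIS sequence.

    Assumes `lead` is already a valid Shift_JIS lead byte.
    """
    offset = 0x40 if byte < 0x7F else 0x41
    lead_offset = 0x81 if lead < 0xA0 else 0xC1
    if 0x40 <= byte <= 0x7E or 0x80 <= byte <= 0xFC:
        return (lead - lead_offset) * 188 + byte - offset
    return None


def euc_jp_pointer(lead: int, byte: int) -> Optional[int]:
    """Pointer into index jis0208 for a two-byte EUC-JP sequence."""
    if 0xA1 <= lead <= 0xFE and 0xA1 <= byte <= 0xFE:
        return (lead - 0xA1) * 94 + byte - 0xA1
    return None


sjis = shift_jis_pointer(0x81, 0x7C)
eucjp = euc_jp_pointer(0xA1, 0xDD)
print(sjis, eucjp)

# Cross-check against Python's cp932 codec (a stand-in, not the spec decoder).
print(f"U+{ord(bytes([0x81, 0x7C]).decode('cp932')):04X}")
```

The code point can then be read off the line for that pointer in index-jis0208.txt.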

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/270#issuecomment-895005338

Received on Monday, 9 August 2021 07:17:56 UTC