Re: [whatwg/encoding] Differences between tests and specification (#169)

> Those errors are cause by the fact that the jis0208 index contains 2 or 3 pointers for those codepoints. The shift_jis encoder uses the index shift_jis pointer algorithm to guard for the codepoints with pointer in the range 8272 to 8835 but most of those codepoint's pointers are not in that ranges, and the tests expect the use of the last pointer if there are several pointers for a given codepoint, but the algorithm in the spec doesn't specify that.

I just checked this example for Shift_JIS, and what you wrote does not hold. I haven't checked the other Shift_JIS cases, but I am afraid you are mistaken there as well.

> 0x2116 should return 0xfa 0x59 but the test expects 0x87 0x82

There are two pointers for U+2116 in [index jis0208](https://encoding.spec.whatwg.org/index-jis0208.txt): 1193 and 10741. You wrote that Richard's test takes the last pointer when there are multiple pointers for a given character. However, his test takes the first pointer (1193), as the spec requires, not the last one (10741). Following the remaining steps in the spec, 1193 (the first pointer for U+2116) is converted to 0x87 0x82. '0xfa 0x59' is what you get by applying those steps to 10741 (the last pointer).
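
To make the arithmetic easy to check, here is a minimal Python sketch of the pointer-to-bytes steps as I read them in the spec's Shift_JIS encoder. The function names and the hard-coded pointer list are mine, for illustration only; the two pointers are the ones quoted above, and the excluded range 8272 to 8835 is the one mentioned in the original report.

```python
# Pointers for U+2116 in index jis0208, as quoted in this thread.
POINTERS_U2116 = [1193, 10741]


def index_shift_jis_pointer(pointers):
    """Hypothetical helper: return the first (lowest) pointer that is not
    in the excluded range 8272..8835, or None if there is none."""
    for pointer in sorted(pointers):
        if not 8272 <= pointer <= 8835:
            return pointer
    return None


def shift_jis_bytes(pointer):
    """Hypothetical helper: turn a jis0208 pointer into the two Shift_JIS
    bytes, following the lead/trail steps of the encoder."""
    lead, trail = divmod(pointer, 188)
    lead_offset = 0x81 if lead < 0x1F else 0xC1
    trail_offset = 0x40 if trail < 0x3F else 0x41
    return bytes([lead + lead_offset, trail + trail_offset])


first = index_shift_jis_pointer(POINTERS_U2116)
print(first, shift_jis_bytes(first).hex())   # 1193 8782
print(10741, shift_jis_bytes(10741).hex())   # 10741 fa59
```

With the first pointer the result matches what the test expects; 0xfa 0x59 only appears if the last pointer is used instead.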

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/169#issuecomment-455694811

Received on Friday, 18 January 2019 21:34:53 UTC