Re: [whatwg/encoding] Big5 encoding mishandles some trailing bytes, with possible XSS (#171) from Henri Sivonen on 2019-02-07 (public-webapps-github@w3.org from February 2019)

From: Henri Sivonen <notifications@github.com>
Date: Thu, 07 Feb 2019 05:30:09 -0800
To: whatwg/encoding <encoding@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <whatwg/encoding/issues/171/461419043@github.com>

Some findings about `WideCharToMultiByte` with `dwFlags` set to zero (i.e. without `WC_NO_BEST_FIT_CHARS`, so "best fit" is _not_ forbidden):

950/Big5: The cells shown as U+FFFD in the [Big5 visualization](https://encoding.spec.whatwg.org/big5.html) are filled with PUA code points in code page 950. In addition, U+0080 maps to the single byte 0x80 and U+F8F8 (PUA) maps to the single byte 0xFF.

949/EUC-KR: U+0080 maps to 0x80. Additionally, the row immediately above Hanja and the row immediately below Hanja are filled with PUA code points, but only from trail byte 0xA1 onward, i.e. only in the part of each row where the trail byte falls within the original EUC trail range. Since the trail byte is therefore always non-ASCII, this doesn't pose a security risk.

932/Shift_JIS: There are 4 PUA code points that each map to a single non-ASCII byte. We already knew about this. Since these bytes are non-ASCII, they don't pose a security risk.
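
For anyone who wants to re-check these mappings, here is a minimal sketch of a probe, not the exact tool used for the findings above. It calls `WideCharToMultiByte` through `ctypes` with the flags left at zero and then reports which output bytes fall in the ASCII range, which is the property that matters for the security argument. The helper names `wide_to_multibyte` and `ascii_positions` are made up for illustration, and the probe only does real work on Windows.

```python
import ctypes
import sys
from typing import List, Optional

# Documented Win32 flag that forbids best-fit mappings. It is deliberately
# NOT passed by default below, matching the "flags set to zero" case.
WC_NO_BEST_FIT_CHARS = 0x00000400


def wide_to_multibyte(text: str, code_page: int, flags: int = 0) -> Optional[bytes]:
    """Encode `text` via WideCharToMultiByte.

    Returns None when not running on Windows or when the call fails.
    """
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    fn = kernel32.WideCharToMultiByte
    fn.argtypes = [
        ctypes.c_uint,                 # CodePage
        ctypes.c_uint32,               # dwFlags
        ctypes.c_wchar_p,              # lpWideCharStr
        ctypes.c_int,                  # cchWideChar
        ctypes.c_char_p,               # lpMultiByteStr
        ctypes.c_int,                  # cbMultiByte
        ctypes.c_char_p,               # lpDefaultChar (None: system default)
        ctypes.POINTER(ctypes.c_int),  # lpUsedDefaultChar (None: not requested)
    ]
    fn.restype = ctypes.c_int
    n_units = len(text.encode("utf-16-le")) // 2  # length in UTF-16 code units
    buf = ctypes.create_string_buffer(4 * n_units + 4)
    written = fn(code_page, flags, text, n_units, buf, len(buf), None, None)
    return buf.raw[:written] if written > 0 else None


def ascii_positions(encoded: bytes) -> List[int]:
    """Indexes of bytes below 0x80, i.e. bytes a consumer could read as ASCII."""
    return [i for i, b in enumerate(encoded) if b < 0x80]


if __name__ == "__main__":
    # Code points taken from the findings above.
    for code_page, code_point in [(950, 0x0080), (950, 0xF8F8), (949, 0x0080)]:
        out = wide_to_multibyte(chr(code_point), code_page)
        if out is None:
            print(f"cp{code_page} U+{code_point:04X}: no result (not Windows, or call failed)")
        else:
            print(f"cp{code_page} U+{code_point:04X} -> {out.hex()}, ASCII at {ascii_positions(out)}")
```

One caveat when reading the output: with `lpDefaultChar` left as `NULL`, unmappable input is replaced by the code page's default character, which is typically `?`. So an ASCII byte in the result is not by itself evidence of a best-fit or PUA mapping.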

-- 
https://github.com/whatwg/encoding/issues/171#issuecomment-461419043

Received on Thursday, 7 February 2019 13:30:30 UTC