- From: <bugzilla@jessica.w3.org>
- Date: Fri, 06 Mar 2015 19:18:51 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=28141

--- Comment #2 from Jungshik Shin <jshin@chromium.org> ---
Another piece of information:

I was tightening Chromium's Big5 table and found that it has a lot of "holes" in the trail byte in the ASCII range. Below is what I found (all in hexadecimal).

lead: trail byte holes in the ASCII range
87: 76
89: 42 44 45 4A 4B
8A: 42 63 75
8B: 54
8D: 41
9B: 61
9F: 4E
A0: 54 57 5A 62 72

They're all in [a-zA-Z]. So, arguably, the XSS risk is lower than for 'punctuation-mark-like characters' in the ASCII range.

In the case of EUC-KR (windows-949), the trail byte in the ASCII range is limited to [a-zA-Z]. So, without the 'adding back to the stream' clause, we'd only eat up [a-zA-Z].

Unless we're sure that [a-zA-Z] is harmless when eaten up, we should keep the 'adding back to the stream if the trail is [0, 7F]' clause (in the case of ICU, perhaps the overall memory/perf impact of keeping the current spec is neutral to a small net loss; I haven't compared yet).

Anyway, it occurred to me that we might think about this, too.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
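A minimal sketch of the difference the 'adding back to the stream' clause makes, using only the holes listed above. This is a toy model, not Chromium's or ICU's decoder: `HOLES`, `decode_holes`, and the `PLACEHOLDER` output for mapped pairs are hypothetical names introduced for illustration, and every pair not listed as a hole is simply assumed to be mapped.

```python
# Toy model of the two error-handling choices discussed in the message.
# HOLES copies the table from the message: lead byte -> unmapped ASCII-range trail bytes.
HOLES = {
    0x87: {0x76},
    0x89: {0x42, 0x44, 0x45, 0x4A, 0x4B},
    0x8A: {0x42, 0x63, 0x75},
    0x8B: {0x54},
    0x8D: {0x41},
    0x9B: {0x61},
    0x9F: {0x4E},
    0xA0: {0x54, 0x57, 0x5A, 0x62, 0x72},
}

REPLACEMENT = "\ufffd"   # emitted for every decoding error
PLACEHOLDER = "\u25a1"   # assumption: stands in for any mapped two-byte character


def decode_holes(data: bytes, push_back: bool) -> str:
    """Decode `data`, modelling only the unmapped pairs in HOLES.

    push_back=True  : on an unmapped pair, the ASCII trail byte is put back
                      on the stream and decoded again as itself.
    push_back=False : the trail byte is consumed together with the lead.
    """
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        i += 1
        if lead < 0x80:                      # plain ASCII byte
            out.append(chr(lead))
            continue
        if i >= len(data):                   # lead byte at end of input
            out.append(REPLACEMENT)
            continue
        trail = data[i]
        if trail in HOLES.get(lead, ()):     # unmapped pair with an ASCII trail
            out.append(REPLACEMENT)
            if not push_back:
                i += 1                       # the letter is eaten
            continue
        out.append(PLACEHOLDER)              # assumed mapped: consume both bytes
        i += 1
    return "".join(out)


if __name__ == "__main__":
    sample = b"\x87vx"                       # lead 87, hole 76 ('v'), then 'x'
    print(repr(decode_holes(sample, push_back=True)))
    print(repr(decode_holes(sample, push_back=False)))
```

With the clause, the `v` after lead byte 87 survives as its own character; without it, the `v` disappears along with the lead byte.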
Received on Friday, 6 March 2015 19:18:53 UTC