- From: Anne van Kesteren <annevk@annevk.nl>
- Date: Mon, 10 Nov 2014 19:40:15 +0100
- To: Shawn Steele <Shawn.Steele@microsoft.com>
- Cc: "www-international@w3.org" <www-international@w3.org>
On Mon, Nov 10, 2014 at 6:48 PM, Shawn Steele <Shawn.Steele@microsoft.com> wrote:
> Ok, more bluntly, if someone notices a discrepancy between https://encoding.spec.whatwg.org/index-windows-1252.txt vs http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT, then what happens?

You mean how the latter has UNDEFINED for various bytes whereas the former requires U+FFFD? If a format references the Encoding Standard, it's clear what needs to happen. If a format is vague about encodings, that would be a problem that needs to be fixed.

> I'm not saying there is such a discrepancy, however if there isn't, then why not point to the version IANA points to in the charsets registry? If there were a discrepancy (other than I suppose undefined being marked as control), then which one gets the bug?

The charsets registry is hopelessly out of touch with reality. That has been pointed out on the relevant list, but the discussion went nowhere (as you know). Meanwhile, the "standards" it points to do not address the issues implementations face. For example, they do not define that in shift_jis 0x81 0x22 needs to become U+FFFD U+0022; without that rule you expose resources to XSS. Extensions to shift_jis or euc-kr are also not covered.

-- 
https://annevankesteren.nl/
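
To illustrate the shift_jis point, here is a minimal sketch of a decoder that emits U+FFFD for a lead byte followed by an invalid trail byte, without consuming the trail byte when it is ASCII. It is a simplification, not the Encoding Standard's algorithm: the names `decode_sketch` and `lookup_pair` are made up for this example, and Python's built-in shift_jis codec stands in for the real double-byte index, so extensions and some edge cases are not covered.

```python
REPLACEMENT = "\ufffd"


def is_lead(b: int) -> bool:
    # Lead byte ranges of a shift_jis double-byte sequence.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def is_trail(b: int) -> bool:
    # Trail byte ranges; note that 0x22 (") is NOT a valid trail byte.
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC


def lookup_pair(lead: int, trail: int):
    # Assumption: Python's codec is used as a stand-in mapping table.
    try:
        return bytes([lead, trail]).decode("shift_jis")
    except UnicodeDecodeError:
        return None


def decode_sketch(data: bytes) -> str:
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        i += 1
        if b < 0x80:
            out.append(chr(b))  # ASCII bytes pass through unchanged
        elif 0xA1 <= b <= 0xDF:
            out.append(chr(0xFF61 + b - 0xA1))  # half-width katakana
        elif is_lead(b) and i < len(data):
            trail = data[i]
            ch = lookup_pair(b, trail) if is_trail(trail) else None
            if ch is not None:
                out.append(ch)
                i += 1
            else:
                out.append(REPLACEMENT)
                # Only swallow the trail byte if it is non-ASCII; an
                # ASCII byte such as 0x22 is decoded again on its own.
                if trail >= 0x80:
                    i += 1
        else:
            out.append(REPLACEMENT)
    return "".join(out)
```

With this rule, `decode_sketch(b'<a title="\x81">')` keeps the closing quote, so the attribute still ends where the author of the markup intended. A decoder that swallowed 0x22 together with 0x81 would leave the attribute unterminated, which is the XSS exposure mentioned above.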
Received on Monday, 10 November 2014 18:40:42 UTC