Re: Encoding Standard (was: RE: Encoding API exceptions) from Anne van Kesteren on 2014-11-10 (www-international@w3.org from October to December 2014)

From: Anne van Kesteren <annevk@annevk.nl>
Date: Mon, 10 Nov 2014 20:09:13 +0100
To: Shawn Steele <Shawn.Steele@microsoft.com>
Cc: "www-international@w3.org" <www-international@w3.org>
Message-ID: <CADnb78gFSVfWhj7ongnvXE3dYAS-Sts77K4B2tN0WkX5thoUNw@mail.gmail.com>

On Mon, Nov 10, 2014 at 7:46 PM, Shawn Steele
<Shawn.Steele@microsoft.com> wrote:
> That's unfortunate, however that's a good place for encoding work.

In my experience, not at all. WHATWG and the i18n WG have been much
more receptive and helpful in addressing the problems I set out to
solve. And I communicated that set of problems to all three audiences
equally.

>> 0x81 0x22 needs to become U+FFFD U+0022 as otherwise you'll expose resources to XSS.
>
> I don't follow this exactly...  Sure, you might end up with an unexpected ", breaking the HTML, however any XSS security checks should happen (perhaps again) AFTER conversion from code pages, as code page conversion can (as this example demonstrates) mangle the input data stream.  IMO fixing code pages may be a nice security robustness type thing, but anything that gets converted should be suspect and revalidated.

Should happen, sure. They don't, however, and so browsers with
decoders that do not conform to the Encoding Standard put their users
at risk of XSS or a similar kind of injection attack. E.g., if you can
control a field of some JSON through a URL, you could load the JSON
through a <script> and get some of the user's data out by executing a
function of sorts, because the JSON is decoded in a different way from
how the server expected it to be.
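
To make the 0x81 0x22 case concrete, here is a minimal sketch of the
two behaviors. It is not the Encoding Standard's algorithm for any
particular encoding: the lead/trail byte ranges, the function names,
and the placeholder used for valid pairs are all assumptions made up
for illustration. The only point is what happens to an ASCII byte that
follows a lead byte without being a valid trail byte.

```python
REPLACEMENT = "\ufffd"
PLACEHOLDER = "\u25a1"  # stands in for a real index lookup (assumption)


def is_lead(b: int) -> bool:
    # Hypothetical lead-byte range of a toy two-byte legacy encoding.
    return 0x81 <= b <= 0xFE


def is_trail(b: int) -> bool:
    # Hypothetical trail-byte range; note that 0x22 (") is outside it.
    return 0x40 <= b <= 0xFE and b != 0x7F


def decode(data: bytes, reprocess_ascii: bool) -> str:
    """Decode with the toy encoding.

    reprocess_ascii=True: on an invalid pair whose second byte is ASCII,
    emit U+FFFD for the lead byte only and decode the ASCII byte again.
    reprocess_ascii=False: swallow both bytes into a single U+FFFD.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif not is_lead(b) or i + 1 >= len(data):
            out.append(REPLACEMENT)
            i += 1
        elif is_trail(data[i + 1]):
            out.append(PLACEHOLDER)
            i += 2
        else:
            out.append(REPLACEMENT)
            keep_next = reprocess_ascii and data[i + 1] < 0x80
            i += 1 if keep_next else 2
    return "".join(out)


# The server believes it emitted a properly quoted JSON string.
payload = b'{"a":"' + b"\x81" + b'","b":"x"}'

safe = decode(payload, reprocess_ascii=True)
greedy = decode(payload, reprocess_ascii=False)

print(safe)    # the quote after the bad lead byte survives
print(greedy)  # the quote is swallowed, so the string does not end
```

With the greedy variant the closing quote of the first field
disappears, so whatever follows is parsed in a different context from
the one the server intended. That is the gap an attacker would aim
for.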

-- 
https://annevankesteren.nl/

Received on Monday, 10 November 2014 19:09:41 UTC