RE: Encoding Standard (was: RE: Encoding API exceptions)

> The charsets registry is hopelessly out of touch with reality. That has been pointed out on the relevant list, but the discussion went nowhere (as you know).

That's unfortunate; however, that's a good place for encoding work.

> 0x81 0x22 needs to become U+FFFD U+0022 as otherwise you'll expose resources to XSS. 

I don't follow this exactly. Sure, you might end up with an unexpected " that breaks the HTML; however, any XSS security checks should happen (perhaps again) AFTER conversion from code pages, since code page conversion can (as this example demonstrates) mangle the input data stream. IMO fixing code pages may be a nice security-robustness measure, but anything that gets converted should be treated as suspect and revalidated.
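
As a rough illustration of that ordering (decode first, then validate or escape the decoded text), here is a minimal Python sketch. The choice of Python, the helper name decode_then_escape, and the use of Shift_JIS as the legacy encoding are all assumptions for the example; none of them come from this thread. Exactly what the decoder does with the byte after the bad lead byte depends on the implementation, which is the reason the escaping step runs on the decoded string rather than on the raw bytes.

```python
import html


def decode_then_escape(raw: bytes, encoding: str) -> str:
    """Hypothetical helper: convert from a legacy code page, then escape.

    The escaping is applied to the decoded text, so it sees whatever the
    decoder actually produced, including a quote that a malformed
    multi-byte sequence may have exposed.
    """
    # Malformed sequences become U+FFFD instead of raising an exception.
    text = raw.decode(encoding, errors="replace")
    # Escape &, <, > and both quote characters after the conversion.
    return html.escape(text, quote=True)


# 0x81 is a lead byte and 0x22 is an ASCII double quote.
# "shift_jis" is an assumed encoding chosen for illustration.
escaped = decode_then_escape(b'\x81\x22', "shift_jis")
print(ascii(escaped))
```

Whether the decoder keeps or swallows the 0x22, the escaped result contains no raw double quote, so the output stays safe to place inside an HTML attribute value.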

-Shawn

Received on Monday, 10 November 2014 18:46:36 UTC