Re: [encoding] iso-2022-jp encoder XSS risks (#15)

I withdraw my suggestion for encoding the byte 0x1A.
  
As for my other suggestion, I still believe modifying the HTML error mode is doable and will resolve my issues, because as far as I can tell:
  
- no encoder algorithm in the Encoding Standard other than `iso-2022-jp` treats these characters as errors,
- the HTML error mode is unique to the Encoding Standard and the documents that reference it, and
- the other error mode, "fatal", doesn't care about the code point returned in the error.

Also, step 9 returns an error on an unencodable character without resetting the encoder state. In the HTML error mode, this could result in the HTML escape being encoded while the encoder is still in JIS0208 mode rather than ASCII mode, which is another potential XSS issue.
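
To illustrate why an escape emitted in the wrong mode is a problem, here is a minimal sketch. It uses Python's built-in `iso2022_jp` codec as a stand-in decoder, which is an assumption on my part: it is not the Encoding Standard's decoder and may differ in details. The byte strings are hand-built rather than produced by any real encoder, and only the two-byte `&#` prefix of an escape is used so that the byte count stays even.

```python
# Hand-built byte sequences (hypothetical, not the output of a real encoder).
ESC_JIS0208 = b"\x1b$B"  # ESC $ B: switch to JIS X 0208
ESC_ASCII = b"\x1b(B"    # ESC ( B: switch back to ASCII
HIRAGANA_A = b'$"'       # 0x24 0x22, U+3042 in JIS X 0208

# Escape prefix written while the stream is still in JIS0208 mode.
in_jis0208_mode = ESC_JIS0208 + HIRAGANA_A + b"&#" + ESC_ASCII
# Escape prefix written after switching back to ASCII mode.
in_ascii_mode = ESC_JIS0208 + HIRAGANA_A + ESC_ASCII + b"&#"

garbled = in_jis0208_mode.decode("iso2022_jp")
intact = in_ascii_mode.decode("iso2022_jp")

print(repr(garbled))  # the "&#" bytes were read as one double-byte character
print(repr(intact))   # the "&#" prefix survives as ASCII
```

In the first case the decoder consumes `&#` as a double-byte pair, so the escape no longer reads as an escape on the receiving side.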

The change could be as follows (showing my suggested changes to the HTML error mode and the iso-2022-jp encoder):

> HTML
>    Prepend U+0026, U+0023, followed by the shortest sequence of ASCII digits representing _result_'s code point in base ten <u>(or 65533 if that code point is U+000E, U+000F, or U+001B)</u>, followed by U+003B to _input_. 
>
> [...]
>
> 13.2.2 iso-2022-jp encoder
>
> [...]
>
> **2a. If iso-2022-jp encoder state is ASCII and _code point_ is U+000E, U+000F, or U+001B, return _error_ with _code point_.**
> 4\. ... 
> &nbsp;&nbsp;&nbsp; **0. If _code point_ is U+000E, U+000F, or U+001B, return _error_ with _code point_.**
> 9\. If _pointer_ is null, **run these substeps:**
> &nbsp;&nbsp;&nbsp; **1. If iso-2022-jp encoder state is not ASCII, prepend _code point_ to _stream_, set iso-2022-jp encoder state to ASCII, and return three bytes 0x1B 0x28 0x42.**
> &nbsp;&nbsp;&nbsp; **2. Return _error_ with _code point_.**
>
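
The suggested changes can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Encoding Standard's algorithm: the Roman state and the U+2212 mapping are omitted, `TOY_JIS0208` is a hypothetical one-entry stand-in for index jis0208, and the function names are mine. The HTML error mode is modeled by pushing the escape back onto the front of the input, as the quoted text describes.

```python
from collections import deque

ESC_ASCII = bytes([0x1B, 0x28, 0x42])    # ESC ( B
ESC_JIS0208 = bytes([0x1B, 0x24, 0x42])  # ESC $ B
FORBIDDEN = {0x0E, 0x0F, 0x1B}           # U+000E, U+000F, U+001B

# Hypothetical toy index; the real index jis0208 has thousands of entries.
TOY_JIS0208 = {0x3042: bytes([0x24, 0x22])}  # HIRAGANA LETTER A


def html_error_replacement(code_point: int) -> str:
    """Proposed HTML error mode: substitute 65533 for the three code points."""
    if code_point in FORBIDDEN:
        code_point = 0xFFFD
    return "&#%d;" % code_point


def encode_html_mode(text: str) -> bytes:
    state = "ascii"
    out = bytearray()
    stream = deque(ord(ch) for ch in text)

    def report_error(cp: int) -> None:
        # HTML error mode: prepend the escape to the input, in order.
        escape = [ord(ch) for ch in html_error_replacement(cp)]
        stream.extendleft(reversed(escape))

    while stream:
        cp = stream.popleft()
        if state == "ascii" and cp in FORBIDDEN:
            report_error(cp)  # proposed step 2a
            continue
        if cp <= 0x7F:
            if state != "ascii":
                stream.appendleft(cp)
                state = "ascii"
                out += ESC_ASCII
            else:
                out.append(cp)
            continue
        pair = TOY_JIS0208.get(cp)
        if pair is None:
            if state != "ascii":
                # Proposed step 9, substep 1: reset to ASCII before erroring.
                stream.appendleft(cp)
                state = "ascii"
                out += ESC_ASCII
            else:
                report_error(cp)  # proposed step 9, substep 2
            continue
        if state != "jis0208":
            stream.appendleft(cp)
            state = "jis0208"
            out += ESC_JIS0208
            continue
        out += pair
    if state != "ascii":
        out += ESC_ASCII  # return to ASCII at end of stream
    return bytes(out)
```

With this sketch, `encode_html_mode("a\x1bb")` yields `b"a&#65533;b"`, and an unencodable character that follows JIS X 0208 text has its escape written only after the `ESC ( B` sequence.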


---
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/15#issuecomment-174733575

Received on Monday, 25 January 2016 23:47:01 UTC