[Bug 27256] revamp iso-2022-jp decoder/encoder

https://www.w3.org/Bugs/Public/show_bug.cgi?id=27256

--- Comment #4 from Anne <annevk@annevk.nl> ---
It seems we do not want to follow the RFC exactly. I found numerous mismatches
between browsers and the RFC:

* Starting with an ESC sequence is not an error in browsers.
* EOF in two-byte mode is not an error in browsers.
* EOF after an ESC sequence is not an error in browsers.

Here is an outline of how I plan to rewrite this:

* Add a Roman state
* Turn SI, SO, and invalid ESC sequences into U+FFFD (for an invalid ESC
  sequence, replace only the ESC byte)
* An invalid ESC sequence switches to the ASCII state
* An ESC sequence directly after another ESC sequence is treated as an invalid
  ESC sequence (and so triggers the ASCII state)
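
The outline above can be sketched roughly as follows. This is only an
illustration of the plan, not the final algorithm. The name decode_sketch, the
ESCAPES table, and the after_escape flag are made up for the example. It
assumes the usual escape sequences (ESC ( B, ESC ( J, ESC $ @, ESC $ B), maps
two-byte pairs through Python's euc_jp codec as a stand-in for a JIS X 0208
index, and guesses that a stray byte in two-byte mode becomes U+FFFD.

```python
ESC, SO, SI = 0x1B, 0x0E, 0x0F

# Bytes following ESC, and the (hypothetical) state name each one selects.
ESCAPES = {
    b"(B": "ascii",
    b"(J": "roman",
    b"$@": "two-byte",
    b"$B": "two-byte",
}


def decode_sketch(data: bytes) -> str:
    out = []
    state = "ascii"
    after_escape = False  # previous item was a valid ESC sequence
    i = 0
    while i < len(data):
        b = data[i]
        if b == ESC:
            target = ESCAPES.get(data[i + 1:i + 3])
            if target is not None and not after_escape:
                state = target
                after_escape = True
                i += 3
            else:
                # Invalid sequence, or a sequence directly after another one:
                # replace only the ESC byte and fall back to the ASCII state.
                out.append("\ufffd")
                state = "ascii"
                after_escape = False
                i += 1
            continue
        after_escape = False
        if b in (SO, SI) or b > 0x7F:
            out.append("\ufffd")
            i += 1
        elif state == "two-byte":
            pair = data[i:i + 2]
            if len(pair) == 2 and all(0x21 <= x <= 0x7E for x in pair):
                # Assumption: setting the high bits gives the EUC-JP form of
                # the same JIS X 0208 code point.
                euc = bytes(x | 0x80 for x in pair)
                out.append(euc.decode("euc_jp", errors="replace"))
                i += 2
            else:
                # Assumption: a stray or dangling byte becomes U+FFFD.
                out.append("\ufffd")
                i += 1
        elif state == "roman" and b == 0x5C:
            out.append("\u00a5")  # yen sign replaces backslash
            i += 1
        elif state == "roman" and b == 0x7E:
            out.append("\u203e")  # overline replaces tilde
            i += 1
        else:
            out.append(chr(b))
            i += 1
    # Reaching EOF in any state, including right after an ESC sequence,
    # is not treated as an error.
    return "".join(out)
```

For example, decode_sketch(b"\x1b$B\x1b(Jx") replaces only the second ESC and
leaves "(Jx" to be read as ASCII, while decode_sketch(b"\x1b$B") returns an
empty string without any error.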

This is also based in part on great research from a duplicate bug:
http://upokecenter.dreamhosters.com/articles/2013/04/differences-in-the-iso-2022-jp-encoding-between-browsers/

Received on Thursday, 6 November 2014 17:40:18 UTC