- From: <bugzilla@jessica.w3.org>
- Date: Mon, 13 Sep 2010 18:42:24 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9663

Simon Pieters <simonp@opera.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|NEEDSINFO                   |

--- Comment #5 from Simon Pieters <simonp@opera.com>  2010-09-13 18:42:23 ---
First try, probably isn't quite right:

Numbers are bytes in hex. "Anything but ..." includes EOF.

Stray 80-BF, or FE-FF: replace with one U+FFFD.

C0-C1 followed by 80-BF: replace the 2-byte sequence with one U+FFFD.

C0-FD followed by anything but 80-BF: replace the first byte with one U+FFFD
and reprocess the second byte.

E0-FD followed by 80-BF followed by anything but 80-BF: replace the first two
bytes with one U+FFFD and reprocess the third byte.

F0-FD followed by two 80-BF followed by anything but 80-BF: replace the first
three bytes with one U+FFFD and reprocess the fourth byte.

F0-F4 followed by three 80-BF that represent a code point above U+10FFFF:
replace all four bytes with one U+FFFD.

F5-FD followed by three 80-BF followed by anything but 80-BF: replace the
first four bytes with one U+FFFD and reprocess the fifth byte.

FC-FD followed by four 80-BF followed by anything but 80-BF: replace the first
five bytes with one U+FFFD and reprocess the sixth byte.

Overlong forms (e.g. F0 80 80 A0): replace the whole byte sequence with one
U+FFFD.

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
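
The replacement rules proposed in comment #5 can be read as one loop: take the sequence length implied by the lead byte, consume continuation bytes (80-BF) until that length is reached or a non-continuation byte (or EOF) appears, then emit either the decoded character or a single U+FFFD. The Python sketch below is one possible reading of those rules, not the text of any specification. The names decode_utf8_sketch, sequence_length and MIN_CODE_POINT are made up for illustration. Two cases the comment does not spell out are handled by assumption: complete 5- and 6-byte sequences, and encoded surrogates (U+D800 to U+DFFF), are each replaced by one U+FFFD.

```python
REPLACEMENT = "\ufffd"

# Smallest code point that genuinely needs a sequence of the given length;
# a complete sequence decoding to something lower is an overlong form.
MIN_CODE_POINT = {2: 0x80, 3: 0x800, 4: 0x10000}


def sequence_length(lead):
    """Length implied by a lead byte in C0-FD (5/6-byte legacy forms included)."""
    if lead <= 0xDF:
        return 2
    if lead <= 0xEF:
        return 3
    if lead <= 0xF7:
        return 4
    if lead <= 0xFB:
        return 5
    return 6


def decode_utf8_sketch(data):
    """Decode bytes to str, emitting U+FFFD roughly as comment #5 proposes."""
    out = []
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:
            out.append(chr(lead))  # plain ASCII
            i += 1
            continue
        if lead <= 0xBF or lead >= 0xFE:
            out.append(REPLACEMENT)  # stray 80-BF, or FE-FF
            i += 1
            continue

        length = sequence_length(lead)
        # Consume continuation bytes; stop early on anything but 80-BF or EOF.
        j = i + 1
        while j < i + length and j < n and 0x80 <= data[j] <= 0xBF:
            j += 1

        if j - i < length:
            # Truncated: one U+FFFD for the bytes consumed so far, then
            # reprocess the offending byte on the next iteration.
            out.append(REPLACEMENT)
        elif length > 4:
            # Assumption (not stated in the comment): a complete 5- or
            # 6-byte sequence becomes one U+FFFD.
            out.append(REPLACEMENT)
        else:
            cp = lead & (0xFF >> (length + 1))  # payload bits of the lead byte
            for b in data[i + 1:j]:
                cp = (cp << 6) | (b & 0x3F)
            overlong = cp < MIN_CODE_POINT[length]
            too_big = cp > 0x10FFFF
            surrogate = 0xD800 <= cp <= 0xDFFF  # assumption, see lead-in
            if overlong or too_big or surrogate:
                out.append(REPLACEMENT)
            else:
                out.append(chr(cp))
        i = j
    return "".join(out)
```

For example, decode_utf8_sketch(bytes([0xE2, 0x82, 0x41])) yields U+FFFD followed by "A": the truncated 3-byte sequence collapses to one replacement character and the 41 byte is reprocessed as ASCII. The result can differ from a decoder that replaces each maximal ill-formed subpart separately, so this should be treated as an illustration of the proposal only.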
Received on Monday, 13 September 2010 18:42:25 UTC