- From: <bugzilla@jessica.w3.org>
- Date: Mon, 13 Sep 2010 18:42:24 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9663
Simon Pieters <simonp@opera.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|NEEDSINFO                   |
--- Comment #5 from Simon Pieters <simonp@opera.com>  2010-09-13 18:42:23 ---
First attempt; this probably isn't quite right:
Numbers are bytes in hex. "Anything but ..." includes EOF.
Stray 80-BF:
FE-FF:
replace the byte with one U+FFFD.
C0-C1 followed by 80-BF:
replace the 2-byte sequence with one U+FFFD.
C0-FD followed by anything but 80-BF:
replace the first byte with one U+FFFD and reprocess the second byte.
E0-FD followed by 80-BF followed by anything but 80-BF:
replace the first two bytes with one U+FFFD and reprocess the third byte.
F0-FD followed by two 80-BF followed by anything but 80-BF:
replace the first three bytes with one U+FFFD and reprocess the fourth byte.
F0-F4 followed by three 80-BF that represent a code point above U+10FFFF:
replace all four bytes with one U+FFFD.
F5-FD followed by three 80-BF followed by anything but 80-BF:
replace the first four bytes with one U+FFFD and reprocess the fifth byte.
FC-FD followed by four 80-BF followed by anything but 80-BF:
replace the first five bytes with one U+FFFD and reprocess the sixth byte.
Overlong forms (e.g. F0 80 80 A0):
replace the whole byte sequence with one U+FFFD.
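The rules above could be sketched roughly as follows in Python. This is only an illustration under some assumptions, not a proposed spec algorithm. The names decode_with_replacement, sequence_length and MIN_CODE_POINT are made up for this example. The sketch generalizes the "followed by anything but 80-BF" rules into one rule (a lead byte plus however many 80-BF bytes were consumed becomes one U+FFFD, and the next byte is reprocessed). It also assumes that complete 5- and 6-byte sequences and encoded surrogates (U+D800 to U+DFFF) become one U+FFFD each, which the list above does not address.

```python
REPLACEMENT = "\ufffd"

# Smallest code point that actually needs a sequence of each length;
# anything lower encoded at that length is an overlong form.
MIN_CODE_POINT = {2: 0x80, 3: 0x800, 4: 0x10000, 5: 0x200000, 6: 0x4000000}


def sequence_length(lead):
    """Length announced by a lead byte (legacy 1-6 byte scheme), or 0."""
    if 0xC0 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF7:
        return 4
    if 0xF8 <= lead <= 0xFB:
        return 5
    if 0xFC <= lead <= 0xFD:
        return 6
    return 0  # 80-BF and FE-FF never start a sequence


def decode_with_replacement(data):
    """Decode bytes, emitting U+FFFD for invalid input per the rules above."""
    out = []
    i = 0
    n = len(data)
    while i < n:
        byte = data[i]
        if byte < 0x80:
            out.append(chr(byte))
            i += 1
            continue
        length = sequence_length(byte)
        if length == 0:
            # Stray 80-BF or FE-FF: one U+FFFD for this byte.
            out.append(REPLACEMENT)
            i += 1
            continue
        # Payload bits of the lead byte, then up to length-1 bytes in 80-BF.
        code_point = byte & (0x7F >> length)
        j = i + 1
        while j < i + length and j < n and 0x80 <= data[j] <= 0xBF:
            code_point = (code_point << 6) | (data[j] & 0x3F)
            j += 1
        if j < i + length:
            # Hit "anything but 80-BF" (or EOF) too early: the bytes consumed
            # so far become one U+FFFD; data[j] is reprocessed next.
            out.append(REPLACEMENT)
        elif (code_point < MIN_CODE_POINT[length]    # overlong, incl. C0-C1
              or code_point > 0x10FFFF               # above the Unicode range
              or 0xD800 <= code_point <= 0xDFFF):    # surrogate (assumption)
            out.append(REPLACEMENT)
        else:
            out.append(chr(code_point))
        i = j
    return "".join(out)
```

For example, decode_with_replacement(b"\xc2A") gives "\ufffdA" (first byte replaced, second byte reprocessed), and decode_with_replacement(b"\xf0\x80\x80\xa0") gives a single "\ufffd" for the overlong form.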
Received on Monday, 13 September 2010 18:42:25 UTC