- From: <bugzilla@jessica.w3.org>
- Date: Tue, 28 Sep 2010 07:29:38 +0000
- To: public-webapps@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=9989
Simon Pieters <simonp@opera.com> changed:
What       |Removed    |Added
----------------------------------------------------------------------------
Status     |RESOLVED   |REOPENED
Resolution |NEEDSINFO  |
Ian 'Hixie' Hickson <ian@hixie.ch> changed:
What       |Removed    |Added
----------------------------------------------------------------------------
Status     |REOPENED   |ASSIGNED
--- Comment #7 from Simon Pieters <simonp@opera.com> 2010-09-27 11:22:36 UTC ---
It seems the Bugzilla monster ate my comment. Trying again:
Here is a first try, which probably isn't quite right:
Numbers are bytes in hex. "Anything but ..." includes EOF.
Stray 80-BF:
FE-FF:
replace the byte with one U+FFFD.
C0-C1 followed by 80-BF:
replace the 2-byte sequence with one U+FFFD.
C0-FD followed by anything but 80-BF:
replace the first byte with one U+FFFD and reprocess the second byte.
E0-FD followed by 80-BF followed by anything but 80-BF:
replace the first two bytes with one U+FFFD and reprocess the third byte.
F0-FD followed by two 80-BF followed by anything but 80-BF:
replace the first three bytes with one U+FFFD and reprocess the fourth byte.
F0-F4 followed by three 80-BF that represent a code point above U+10FFFF:
replace all four bytes with one U+FFFD.
F5-FD followed by three 80-BF followed by anything but 80-BF:
replace the first four bytes with one U+FFFD and reprocess the fifth byte.
FC-FD followed by four 80-BF followed by anything but 80-BF:
replace the first five bytes with one U+FFFD and reprocess the sixth byte.
Overlong forms (e.g. F0 80 80 A0):
replace the whole byte sequence with one U+FFFD.
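For illustration only, the rules above can be sketched in Python roughly as follows. This is not from the bug or from any spec. The names decode_utf8_lenient, sequence_length and MIN_FOR_LENGTH are made up for the example. The list above does not say what to do with complete 5- and 6-byte sequences, with F5-F7 followed by three 80-BF, or with encoded surrogates (U+D800 to U+DFFF). The sketch assumes each of those becomes a single U+FFFD.

```python
REPLACEMENT = "\ufffd"

# Smallest code point that genuinely needs a sequence of each length.
# A decoded value below this bound means the sequence is an overlong form.
MIN_FOR_LENGTH = {2: 0x80, 3: 0x800, 4: 0x10000, 5: 0x200000, 6: 0x4000000}


def sequence_length(lead):
    """Return the sequence length implied by a lead byte in C0-FD."""
    if lead <= 0xDF:
        return 2
    if lead <= 0xEF:
        return 3
    if lead <= 0xF7:
        return 4
    if lead <= 0xFB:
        return 5
    return 6


def decode_utf8_lenient(data):
    """Decode bytes to str, emitting U+FFFD per the rules sketched above."""
    out = []
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:
            # Plain ASCII byte.
            out.append(chr(lead))
            i += 1
            continue
        if lead <= 0xBF or lead >= 0xFE:
            # Stray 80-BF, or FE-FF: one U+FFFD for this byte.
            out.append(REPLACEMENT)
            i += 1
            continue

        length = sequence_length(lead)
        # Keep only the payload bits of the lead byte.
        code_point = lead & (0xFF >> (length + 1))
        j = i + 1
        # Consume continuation bytes until the sequence is complete,
        # or until a byte outside 80-BF (or EOF) is hit.
        while j < i + length and j < n and 0x80 <= data[j] <= 0xBF:
            code_point = (code_point << 6) | (data[j] & 0x3F)
            j += 1

        if j < i + length:
            # Truncated: one U+FFFD for the bytes consumed so far.
            # The offending byte is reprocessed on the next iteration.
            out.append(REPLACEMENT)
        elif (
            length > 4                                # assumption: 5/6-byte forms
            or code_point < MIN_FOR_LENGTH[length]    # overlong, incl. C0-C1
            or code_point > 0x10FFFF                  # above the Unicode range
            or 0xD800 <= code_point <= 0xDFFF         # assumption: surrogates
        ):
            out.append(REPLACEMENT)
        else:
            out.append(chr(code_point))
        i = j
    return "".join(out)
```

With this sketch, decode_utf8_lenient(b"\xc3A") gives U+FFFD followed by "A", and decode_utf8_lenient(b"\xf0\x80\x80\xa0") gives a single U+FFFD, matching the "reprocess the second byte" and overlong rules respectively.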
--- Comment #8 from Ian 'Hixie' Hickson <ian@hixie.ch> 2010-09-28 07:29:37 UTC ---
Any volunteers for a Web UTF-8 spec?
I guess I'll put this in the HTML spec's infrastructure section and then refer
to it from all the other specs of relevance.
Received on Tuesday, 28 September 2010 07:42:28 UTC