[Bug 9989] Is the number of replacement characters supposed to be well-defined? If not this should be explicitly noted. If it is then more detail is required.

From: <bugzilla@jessica.w3.org>
Date: Thu, 22 Jul 2010 13:25:19 +0000
To: public-webapps@w3.org
Message-Id: <E1Obvm3-0005LS-N3@jessica.w3.org>

Simon Pieters <simonp@opera.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|NEEDSINFO                   |

--- Comment #2 from Simon Pieters <simonp@opera.com>  2010-07-22 13:25:19 ---
The spec says to replace bytes *or* sequences of bytes that are not valid UTF-8
with U+FFFD. It is thus not well-defined how many U+FFFD characters are expected
for any given sequence of bytes that is not valid UTF-8. It could be one, or as
many as there are invalid bytes, or anything in between.

(The same applies to text/html parsing.)
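
To illustrate the ambiguity, here is a minimal Python sketch of two decoding
policies that both fit the "bytes or sequences of bytes" wording. The function
name decode_with_policy and its collapse_runs flag are hypothetical and come
from neither spec; treating each undecodable byte as its own invalid unit is
likewise just one possible reading, not what any particular browser does.

```python
REPLACEMENT = "\ufffd"

def decode_with_policy(data: bytes, collapse_runs: bool) -> str:
    """Decode UTF-8, replacing invalid bytes with U+FFFD.

    collapse_runs=False: one U+FFFD per invalid byte.
    collapse_runs=True:  one U+FFFD per run of consecutive invalid bytes.
    """
    out = []
    i = 0
    in_invalid_run = False
    while i < len(data):
        # Try the shortest strictly valid sequence starting at byte i.
        for n in range(1, 5):
            try:
                ch = data[i:i + n].decode("utf-8")
            except UnicodeDecodeError:
                continue
            out.append(ch)
            i += n
            in_invalid_run = False
            break
        else:
            # No valid sequence starts here: byte i is invalid.
            if not (collapse_runs and in_invalid_run):
                out.append(REPLACEMENT)
            in_invalid_run = True
            i += 1
    return "".join(out)

# A truncated three-byte sequence (0xE2 0x82) between two ASCII letters.
data = b"A\xe2\x82B"
per_byte = decode_with_policy(data, collapse_runs=False)
per_run = decode_with_policy(data, collapse_runs=True)
print(per_byte.count(REPLACEMENT), per_run.count(REPLACEMENT))
```

For the same two invalid bytes, the first policy emits two U+FFFD and the
second emits one, so two implementations following the current wording can
produce different strings from identical input.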

Received on Thursday, 22 July 2010 13:25:21 UTC
