Re: UTC Agenda Item: Recommendations for handling ill-formed sequences

From: John Cowan <cowan@ccil.org>
Date: Fri, 11 Apr 2008 19:12:36 -0400
To: UTC <unicore@unicode.org>, "utc-chair@unicode.org" <utc-chair@unicode.org>, public-html@w3.org
Message-ID: <20080411231236.GG28132@mercury.ccil.org>

Øistein E. Andersen (quoted by Mark Davis) scripsit:

> One notable difference is that overlong sequences as well as UTF-8
> sequences representing surrogates and characters outside Unicode
> (>10FFFF) will typically map to several replacement characters according
> to your proposal, but to only one in Markus Kuhn's system

I agree that overlong sequences, surrogates, and old-10646 sequences
(those encoding values above 10FFFF) should each become a single U+FFFD.
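
To make the policy concrete, here is a minimal sketch of a decoder that
behaves this way. It is an illustration under stated assumptions, not the
UTC's or Markus Kuhn's actual algorithm: the name decode_lenient and the
FORMS table are hypothetical, and the handling of truncated sequences and
stray bytes (one U+FFFD per offending byte) is my own choice, since the
text above only covers structurally complete sequences.

```python
REPLACEMENT = "\ufffd"

# (lead mask, lead pattern, total length, smallest code point needing that length)
# The 5- and 6-byte rows are the forms from the original ISO 10646 definition.
FORMS = [
    (0x80, 0x00, 1, 0x0),
    (0xE0, 0xC0, 2, 0x80),
    (0xF0, 0xE0, 3, 0x800),
    (0xF8, 0xF0, 4, 0x10000),
    (0xFC, 0xF8, 5, 0x200000),
    (0xFE, 0xFC, 6, 0x4000000),
]


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, mapping each structurally complete but ill-formed
    sequence (overlong, surrogate, above 10FFFF) to a single U+FFFD."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        for mask, pattern, length, minimum in FORMS:
            if (lead & mask) == pattern:
                break
        else:
            # Stray continuation byte, 0xFE, or 0xFF (assumed: one U+FFFD each).
            out.append(REPLACEMENT)
            i += 1
            continue

        tail = data[i + 1 : i + length]
        if len(tail) != length - 1 or any((b & 0xC0) != 0x80 for b in tail):
            # Truncated sequence (assumed: replace the lead byte, then resync).
            out.append(REPLACEMENT)
            i += 1
            continue

        # Assemble the code point from the lead payload and 6-bit tail payloads.
        cp = lead & (~mask & 0xFF)
        for b in tail:
            cp = (cp << 6) | (b & 0x3F)

        ill_formed = cp < minimum or 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF
        out.append(REPLACEMENT if ill_formed else chr(cp))
        i += length  # the whole sequence is consumed as one unit
    return "".join(out)
```

The point of the design is the last line of the loop: an ill-formed but
complete sequence is consumed whole, so it yields one replacement
character rather than one per byte.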

The first thing you learn in a lawin' family    John Cowan
is that there ain't no definite answers         cowan@ccil.org
to anything.  --Calpurnia in To Kill A Mockingbird
Received on Friday, 11 April 2008 23:13:10 UTC
