Re: 12. Are C1 controls and Unicode non-characters disallowed?

On Tue, Sep 11, 2012 at 4:28 AM, John Cowan <cowan@mercury.ccil.org> wrote:

>
> We can say "U+FFFE and U+FFFF are banned in documents because that's
> what XML says."
>
> Or we can say "Unicode non-characters are banned in documents."


I think these both correspond to reasonable positions.

The first position amounts to saying that we think XML made a mistake in
disallowing U+FFFE and U+FFFF; it should have allowed all code points
except surrogates.  Therefore we shouldn't make the situation any worse by
increasing the number of disallowed code points. MicroXML disallows U+FFFE
and U+FFFF only because XML did.

The second position says that XML was right to disallow U+FFFE and U+FFFF.
Since XML was designed, Unicode has designated further code points as
noncharacters, in the same category as U+FFFE and U+FFFF. We should now fix
MicroXML so that it is consistent with the current version of Unicode (and
with all future versions, because the Unicode stability policy guarantees
that the set of noncharacters will not change).
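
To make the difference between the two positions concrete, here is a
minimal sketch in Python. The function names are hypothetical and chosen
only for illustration. It covers just the code points under discussion and
ignores every other restriction (surrogates, control characters, and so
on). The noncharacter set used is the one Unicode defines: U+FDD0..U+FDEF
plus the last two code points of each of the 17 planes.

```python
def banned_under_first_position(cp: int) -> bool:
    """Position 1: only the two code points XML itself excludes."""
    return cp in (0xFFFE, 0xFFFF)


def banned_under_second_position(cp: int) -> bool:
    """Position 2: every Unicode noncharacter.

    Noncharacters are U+FDD0..U+FDEF and any code point whose low
    16 bits are FFFE or FFFF (the last two code points of each plane).
    """
    in_fdd0_block = 0xFDD0 <= cp <= 0xFDEF
    ends_plane = (cp & 0xFFFE) == 0xFFFE
    return in_fdd0_block or ends_plane


# Enumerate both sets over the whole code space U+0000..U+10FFFF.
first = [cp for cp in range(0x110000) if banned_under_first_position(cp)]
second = [cp for cp in range(0x110000) if banned_under_second_position(cp)]

# Code points that only the second position would newly disallow.
newly_banned = sorted(set(second) - set(first))
```

Under these assumptions the first position bans 2 code points and the
second bans 66, so adopting the second position would disallow 64 more
code points than XML does.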

James

Received on Tuesday, 11 September 2012 01:29:26 UTC