What about characters outside of those expressible in XML?

From: Tony Lavinio <alavinio@progress.com>
Date: Sun, 19 Feb 2012 17:39:09 -0500
Message-ID: <4F417A0D.901@progress.com>
To: "public-xml-er@w3.org" <public-xml-er@w3.org>
We deal quite a bit with XML created by the equivalent of old-fashioned "print" statements.

The two biggest problems we see are unescaped content (primarily & vs. &amp;)
and characters that are not expressible in XML (primarily nulls, but often other control characters).

I can imagine several ways to deal with these. In the former case, turning a plain & into &amp;
is easy, but it is less obvious what to do when the content looks like &&amp;.
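
As a rough illustration of the ampersand case, here is a minimal Python sketch of one possible heuristic: escape any & that does not begin something shaped like an entity or character reference. The function name escape_bare_ampersands and the pattern are my own assumptions for illustration, not an agreed fixup rule.

```python
import re

# An '&' is treated as "bare" unless it starts something shaped like a
# reference: &name;  &#123;  or  &#x1F;
# (The name pattern is a simplified ASCII approximation of an XML Name.)
_BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z_:][A-Za-z0-9_.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)

def escape_bare_ampersands(text: str) -> str:
    """Replace each bare '&' with '&amp;', leaving reference-shaped text alone."""
    return _BARE_AMP.sub("&amp;", text)

print(escape_bare_ampersands("fish & chips"))  # fish &amp; chips
print(escape_bare_ampersands("&&amp;"))        # &amp;&amp;
```

Under this heuristic &&amp; becomes &amp;&amp;, which is only one reading of what the author meant. It also leaves text such as AT&T; untouched because it is shaped like an entity reference, so the ambiguity does not go away.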

In the latter case, should nulls be dropped? Turned into <?unicode \u0000?>? Would we need
to define different /types/ of fixup modes, depending on how the user wants errors to be
handled?
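
To make the idea of fixup modes concrete, here is a minimal Python sketch assuming three hypothetical modes: drop the character, replace it with the processing instruction suggested above, or refuse the input. The names Fixup and fix_illegal_chars are invented for illustration; the character ranges follow the Char production of XML 1.0.

```python
import re
from enum import Enum

# Anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL = re.compile("[^\t\n\r\x20-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]")

class Fixup(Enum):      # hypothetical mode names
    DROP = "drop"       # silently remove the character
    PI = "pi"           # replace with <?unicode \uXXXX?>
    STRICT = "strict"   # raise instead of repairing

def fix_illegal_chars(text: str, mode: Fixup = Fixup.DROP) -> str:
    """Apply the chosen fixup mode to characters XML 1.0 cannot express."""
    if mode is Fixup.DROP:
        return _ILLEGAL.sub("", text)
    if mode is Fixup.PI:
        return _ILLEGAL.sub(
            lambda m: "<?unicode \\u%04X?>" % ord(m.group()), text
        )
    bad = _ILLEGAL.search(text)
    if bad:
        raise ValueError(
            "illegal character U+%04X at offset %d" % (ord(bad.group()), bad.start())
        )
    return text

print(fix_illegal_chars("a\x00b"))            # ab
print(fix_illegal_chars("a\x00b", Fixup.PI))  # a<?unicode \u0000?>b
```

The processing-instruction mode only makes sense where a PI is allowed, so it would not work inside an attribute value or a tag name; a real design would need a fallback there.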

--
TONY LAVINIO
PROGRESS SOFTWARE CORPORATION
14 Oak Park  |   Bedford, MA 01730-1414  |  USA
WWW.PROGRESS.COM
Received on Sunday, 19 February 2012 22:39:37 GMT