What about characters outside of those expressible in XML?

We deal quite a bit with XML created by the equivalent of old-fashioned "print" statements.

The two biggest problems we see are unescaped content -- primarily & vs. &amp; --
and characters that are not expressible in XML -- primarily nulls, but often other
control characters.

I can imagine several ways to deal with these. In the former case, turning a plain & into &amp;
is easy, but not so easy when the content looks like &&.
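
To make the easy case concrete, here is a rough sketch in Python, assuming a regex-based fixup is acceptable. The name escape_bare_ampersands is made up for illustration and is not part of any proposal. It escapes only an & that does not already start something shaped like an entity or character reference:

```python
import re

# An "&" that is NOT followed by something shaped like a reference:
#   &name;   named entity
#   &#123;   decimal character reference
#   &#x1F;   hexadecimal character reference
# The name pattern is a simplified ASCII approximation of an XML Name.
_BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z_:][A-Za-z0-9_.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text: str) -> str:
    """Replace each bare '&' with '&amp;', leaving reference-shaped text alone."""
    return _BARE_AMP.sub("&amp;", text)


print(escape_bare_ampersands("fish & chips"))      # fish &amp; chips
print(escape_bare_ampersands("fish &amp; chips"))  # unchanged
```

This is only a heuristic. It cannot tell whether the producer meant an existing &amp; as markup or as literal text, and it lets through anything that merely looks like a reference (for example an undeclared &foo;), which is exactly where the hard cases come from.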

In the latter case, should nulls be dropped? Turned into <?unicode \u0000?>? Would we need
to define different /types/ of fixup modes, depending on how the user wants errors to be
handled?
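
One way to picture such modes, purely as a sketch: the function name fix_illegal_chars and the mode names "drop", "replace", "pi" and "error" below are hypothetical, and the character class follows the Char production of XML 1.0. The "pi" mode emits the <?unicode \u0000?> form floated above.

```python
import re

# Anything outside the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def fix_illegal_chars(text: str, mode: str = "drop") -> str:
    """Apply one hypothetical fixup mode to characters XML 1.0 cannot express."""
    if mode == "drop":
        # Silently remove the offending characters.
        return _ILLEGAL.sub("", text)
    if mode == "replace":
        # Substitute U+FFFD REPLACEMENT CHARACTER.
        return _ILLEGAL.sub("\uFFFD", text)
    if mode == "pi":
        # Record the code point in a processing instruction.
        return _ILLEGAL.sub(
            lambda m: "<?unicode \\u%04X?>" % ord(m.group()), text
        )
    if mode == "error":
        # Refuse to fix up; report the first offending character.
        m = _ILLEGAL.search(text)
        if m:
            raise ValueError(
                "illegal character U+%04X at offset %d" % (ord(m.group()), m.start())
            )
        return text
    raise ValueError("unknown fixup mode: %r" % mode)


print(fix_illegal_chars("a\x00b", "drop"))  # ab
print(fix_illegal_chars("a\x00b", "pi"))    # a<?unicode \u0000?>b
```

Note that the "pi" mode is only safe in element content; a processing instruction inside an attribute value would itself be an error, so a real fixup would probably need a different fallback there.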

--
TONY LAVINIO
PROGRESS SOFTWARE CORPORATION
14 Oak Park  |   Bedford, MA 01730-1414  |  USA
WWW.PROGRESS.COM

Received on Sunday, 19 February 2012 22:39:37 UTC