W3C home > Mailing lists > Public > www-dom@w3.org > July to September 2001


From: Joseph Kesselman <keshlam@us.ibm.com>
Date: Fri, 27 Jul 2001 09:14:43 -0400
To: "Julian F. Reschke" <julian.reschke@greenbytes.de>
Cc: <www-dom@w3.org>
Message-ID: <OF33C80292.678C6D01-ON85256A96.0047B387@pok.ibm.com>

>I fear lots of serializers will just ignore this problem, producing
>non-well-formed XML.

That's a quality-of-implementation issue. Serializers have to examine every
character anyway, because some will need escaping (&amp;, or characters not
directly supported in the target encoding). Adding a legal-character check
here is less expensive -- but admittedly less diagnostically useful -- than
doing so on the input side of the equation.

If you need a robust serializer, either demand robustness from the authors
of the one you're using or investigate others. If you know from first
principles that your DOM will contain only legal XML text, you may prefer
to sacrifice some robustness in exchange for greater performance. Which is
better is up to you; the DOM leaves you free to select either option.

When the DOM defines its own serializer API -- also coming in DOM Level 3
-- we'll have to make a decision on what we check when and how we report
anything we don't like.  That will _probably_ lean toward the robustness
side of the equation, since we're developing a basic utility. Of course
you'll still be able to use stand-alone serializers, so if you don't like
our choice of trade-offs you'll be able to pick another.

Joe Kesselman  / IBM Research
Received on Friday, 27 July 2001 09:15:21 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 20 October 2015 10:46:08 UTC