RE: CDATA Section nodes in DOM.

> This also gives a potential problem when you are performing XPath
> queries on your document,

No, it just means you need to build a think layer of processing over the
DOM to adapt it to XPath's needs. Existing XPath implementations have been
doing that successfully for quite some time.

An alternative is to preprocess your document, replacing CDATASection nodes
with Text nodes and calling normalize()... but I beleive you'll find that
doing this "on the fly" is more efficient, unless the preprocessing occurs
as a (nonstandard) parser feature.

The proposed DOM Level 3 "whole text" operations and XPath API will
simplify this somewhat. I don't remember offhand whether the Level 3
load/save operations have the option of discarding CDATASection boundaries;
if not, that might be worth considering.

> and it does not comply with the Infoset
> specification. (Yes, I know that the specification is quite fresh :-)

The Infoset does not forbid having the document carry additional
information, which is what the CDATASection does. (Nor, unfortunately, does
it forbid leaving out data; as currently written, the Infoset is a set of
suggestions saying "If you want to present this information, this is how we
think you ought to present it... but if you do something else that's fine
too."





______________________________________
Joe Kesselman  / IBM Research

Received on Monday, 29 October 2001 09:31:32 UTC