I want CData, but Dom gave me an entity! from Clay McCoy on 2001-03-16 (www-dom@w3.org from January to March 2001)

From: Clay McCoy <clay@swordmicro.com>
Date: Fri, 16 Mar 2001 10:57:55 -0600
To: www-dom@w3.org
Message-ID: <3AB245C2.A99FC87D@swordmicro.com>

I am writing software where I use the W3C DOM parser on some XML and then put it
into some container classes for ease of use in the program.  Then the program
can simply use these classes to manipulate the data.  Once the program is done I
reconstruct a DOM document from the container classes and write this out to a
new modified XML document.  I am new to this and if there is a more streamlined
approach then I would be excited to hear about it.
    Well, the program that I described works quite well except for a few
quirks.  The main one involves entities, specifically the '&'.  When the parser
would encounter one of these it would break.  To fix this, the program that
generates and sends me the XML added CDATA sections around every text field where
&'s might be encountered.  This may not have been the best solution and I would
love to hear of better ones.  This did fix the problem with the data coming in.
It is now parsed into the DOM, and from there into the container classes just
fine, even if a & is encountered.  But when I send the information back out,
from the containers to a DOM, and from the DOM to an XML document, the '&' is
written as an "&amp;" entity reference.  It looks like the text is broken up into
three children in the DOM.  The three children are the text before the &, the &
itself represented as some strange character string, and the text after the &.
Other programs that look at this XML later expect only one child and don't see
the rest of the children.  Therefore they only see the text up to the &.  What
should I do?  Should I write the code to search for extra children somehow and
compile them all into one line of text?  This seems like a lot of work to do
in every case, and I am not sure how to implement it.  Or is there some way to
specify that it is written out with CDATA sections around it?  This seems like it
would be easiest.
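
Both ideas in the paragraph above (compiling the extra children into one line
of text, and writing the value back out inside a CDATA section) might look
roughly like the following in Java.  This is only a minimal sketch, assuming a
JAXP parser and the standard org.w3c.dom interfaces; the class TextCollector
and its methods collectText, setTextAsCData, and parse are hypothetical names
invented for illustration, not part of any library.  Whether a particular
serializer keeps the CDATA section when writing the document out is also an
assumption that would need checking.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class TextCollector {

    // Joins the text of all children of a node, whether the parser delivered
    // it as plain text, CDATA sections, or entity reference nodes.
    public static String collectText(Node parent) {
        StringBuilder sb = new StringBuilder();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            switch (c.getNodeType()) {
                case Node.TEXT_NODE:
                case Node.CDATA_SECTION_NODE:
                    sb.append(c.getNodeValue());
                    break;
                case Node.ENTITY_REFERENCE_NODE:
                    // The expansion of the entity is held in its own children.
                    sb.append(collectText(c));
                    break;
                default:
                    break; // skip elements, comments, and so on
            }
        }
        return sb.toString();
    }

    // Replaces every child of the element with a single CDATA section.
    public static void setTextAsCData(Element element, String text) {
        while (element.hasChildNodes()) {
            element.removeChild(element.getFirstChild());
        }
        Document doc = element.getOwnerDocument();
        element.appendChild(doc.createCDATASection(text));
    }

    // Parses an XML string into a DOM document (checked exceptions wrapped).
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

With this sketch, calling collectText on the root element of
"<name>Smith <![CDATA[&]]> Sons</name>" gives the single string
"Smith & Sons", however many children the parser produced.  When the pieces
are all plain Text nodes, Node.normalize() from DOM Level 2 merges adjacent
ones in place, which may be enough on its own.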
    Something else that I have noticed when looking at the DOM while debugging
the program is that there are a lot of extra children that I have to skip over to
get to the data that I actually want.  What are those nodes there for, and what
is the best way to deal with them?
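
If the extra children described above turn out to be Text nodes holding only
the line breaks and indentation between elements (an assumption, since the
actual document is not shown here), one way to skip over them is to keep only
the element children.  The sketch below assumes a JAXP parser; the class
ElementChildren and its methods childElements and parse are hypothetical names
used only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class ElementChildren {

    // Returns only the element children of a node, skipping whitespace
    // text nodes, comments, and any other non-element children.
    public static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) c);
            }
        }
        return result;
    }

    // Parses an XML string into a DOM document (checked exceptions wrapped).
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

For an indented document such as an order element with two item elements on
their own lines, getChildNodes() on the root reports the whitespace runs
between the items as well, while childElements returns just the two items.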

Received on Friday, 16 March 2001 11:49:04 UTC