Re: How should text be normalized for tests? from Fred L. Drake, Jr. on 2002-03-25 (www-dom-ts@w3.org from March 2002)

From: Fred L. Drake, Jr. <fdrake@acm.org>
Date: Mon, 25 Mar 2002 11:04:12 -0500
To: "Joseph Kesselman" <keshlam@us.ibm.com>
Cc: "Michael B. Allen" <mballen@erols.com>, www-dom-ts@w3.org
Message-ID: <15519.19068.851921.674395@grendel.zope.com>

Joseph Kesselman writes:
 > That doesn't comply with the expectations which the DOM places upon XML
 > processors. See the description of Text nodes:
...
 > Your choices would seem to be to either (a) accept that Expat is going to
 > fail compliance tests which are based on the above assertion (which
 > requires that your test results explain why this isn't a "first made
 > available" situation), or (b) write your test driver so it calls normalize
 > () before running the tests (ditto) or (c) fix Expat's DOM builder.

I must have missed the first message about this.  As Expat's
(somewhat) active maintainer, I feel compelled to chime in.

The basic Expat parser does not provide a DOM implementation at all,
so I think identifying problems with it is difficult.  Expat is much
closer to a SAX parser than anything else and should be thought of in
those terms.

There are several DOM implementations which have been built on top of
Expat for various languages; I've built one in Python, and have built
a builder for another (also in Python).  Both of these have provided
normallized text nodes.  The only reason I can think of which would
prevent this would be if the text nodes are exposed to client code
before being finished; in that case it makes more sense to start a new
node to avoid surprising the client code by modifying an
already-exposed node.  For synchronous-loading DOM implementations
(such as mine), this should not be an issue.

The Expat-based DOM-builders for Python (I know of three) are all open
source, so anyone writing an Expat-based DOM builder is free to look
at them for techniques if they like.  I'd also be willing to
correspond with anyone to help out if I can, though my availability to
write code would be minimal at best.

  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation

Received on Monday, 25 March 2002 11:07:44 UTC