Re: "Empty" Text Nodes from Arkin on 1999-03-01 (www-dom@w3.org from January to March 1999)

From: Arkin <arkin@trendline.co.il>
Date: Mon, 01 Mar 1999 15:07:23 -0500
To: David Brownell <db@Eng.Sun.COM>
CC: www-dom@w3.org
Message-ID: <36DAF37B.BD3F0CDA@trendline.co.il>

David Brownell wrote:
> > > The XML spec states that "An XML processor must always pass all
> > > characters in a document that are not markup through to the
> > > application".
> > >
> > > My question: on which level the DOM is aligned?
> > > Is it attached to a XML parser, or is it more
> > > attached to a XML application?
> 
> I'd say it's clear that as written, DOM is attached to the application
> rather than to the "XML Processor" (not parser!) level.

That's perfectly clear. The DOM does not attempt to cover the behavior
of processors. But we're talking about a real-life situation where you
either use some parser, any parser (e.g. through the SAX API), or get
the document from some third-party library. You expect some common
behavior, so you can write it once and run it everywhere.
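
To make the processor-level behavior concrete, here is a minimal sketch
in Python (chosen only for brevity; the thread itself names no
language). The class name WhitespaceRecorder and the sample document are
hypothetical. It assumes Python's bundled, non-validating SAX parser,
which would be expected to report the whitespace between elements
through characters() rather than ignorableWhitespace(), since without a
DTD it cannot know which whitespace is ignorable.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class WhitespaceRecorder(ContentHandler):
    """Hypothetical handler that records every character event it receives."""

    def __init__(self):
        super().__init__()
        self.text_chunks = []        # delivered through characters()
        self.ignorable_chunks = []   # delivered through ignorableWhitespace()

    def characters(self, content):
        self.text_chunks.append(content)

    def ignorableWhitespace(self, whitespace):
        self.ignorable_chunks.append(whitespace)


# Sample document: indentation and newlines sit between the elements.
SAMPLE = b"<list>\n  <item>a</item>\n</list>"

handler = WhitespaceRecorder()
xml.sax.parseString(SAMPLE, handler)

# Chunks that consist of nothing but whitespace still reach the application.
whitespace_only = [chunk for chunk in handler.text_chunks if not chunk.strip()]
print(len(whitespace_only) > 0)
```

The application sees the whitespace either way; what differs between
parsers is the callback it arrives through, and that is the part an
application cannot rely on.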


> > I think this is one point where the XML/DOM specification totally blew
> > it off. The above specification works well if your application is
> > responsible for constructing the document tree from a parsed document.
> 
> Presumably you mean the "DOM Level 1 Core" API spec, not a combination
> of the XML and DOM specifications.

No, I meant the DOM Level 1 Core in combination with the XML Version 1.0
specification. When I write code I make sure it conforms to both
specifications, and I assume others do as well. The DOM spec relies on
an understanding of, and acquaintance with, the XML spec. You can use
the DOM alone if all you do is manage an in-memory tree, but once you
get into input/output and human-readable files, you have to resort to
the XML spec. We are, after all, talking about input/output.


> The features the SAX 1.0 API exposes are almost identical to those
> which an "XML Processor" must support, as defined in the XML 1.0
> specification.  (It doesn't identify any external general entities which
> were ignored, however -- assuming perhaps that they all were read.)

A typical application will use the SAX API to invoke a parser and
retrieve a full-blown document tree, not define its own DOM document
builder. Therefore, a typical application will not get the opportunity
to accept or reject whitespace. Therefore, a typical application might
break when used with a different SAX parser than the one it was
developed with.
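
As an illustration of how an application can guard against this, the
sketch below uses Python's xml.dom.minidom as a stand-in for "whatever
tree the library hands you". The helper strip_whitespace_nodes and the
sample document are hypothetical, not part of any DOM specification. The
helper normalizes the tree after the fact by removing text nodes that
contain only whitespace. This is lossy for mixed content, where such
whitespace may be significant, so treat it as a workaround under that
assumption rather than a general answer.

```python
from xml.dom import Node, minidom

# Sample document with indentation between the child elements.
SAMPLE = "<list>\n  <item>a</item>\n  <item>b</item>\n</list>"


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy, because removeChild mutates childNodes.
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)


doc = minidom.parseString(SAMPLE)
root = doc.documentElement

# With this builder the whitespace runs show up as text nodes.
count_before = len(root.childNodes)

strip_whitespace_nodes(root)

# Only the two <item> elements should remain as children.
count_after = len(root.childNodes)
names_after = [child.nodeName for child in root.childNodes]
print(count_before, count_after, names_after)
```

Code that indexes children by position (for example, "the second child
is the second item") works after the cleanup regardless of whether the
builder kept the whitespace, which is the portability the application
wanted in the first place.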


> The problem has been noted.  It matters critically for almost all
> languages except JavaScript, where one can't really implement a DOM
> oneself, and your execution environment (e.g. web browser) will hand
> one to you.

It would be beneficial if the DOM API were extended to cover the SAX
API, which seems to be commonly accepted and used. That would certainly
clear up many issues. It would also be beneficial if the DOM API could
recommend acceptable default behavior, which might differ between
browser and server-side environments, but should be consistent across
all browsers.


> > I think the W3C has the obligation to strictly clarify this point in the
> > specification, so that: a) all parsers behave consistently,
> 
> Make that "All DOM implementations behave consistently" and I'd be
> far more inclined to agree!  It gets back to that testing/conformance
> issue:  the DOM spec isn't testable without making certain assumptions
> which are not supportable within scope of the spec.

Agreed.

> 
> - Dave
Received on Monday, 1 March 1999 15:13:29 UTC