Re: RFC: White Space Handling In XML Parsing from Arkin on 1999-05-22 (www-dom@w3.org from April to June 1999)

From: Arkin <arkin@trendline.co.il>
Date: Fri, 21 May 1999 20:28:11 -0400
To: Paul Grosso <pgrosso@arbortext.com>
CC: www-dom@w3.org
Message-ID: <3745FA1B.3A671138@trendline.co.il>

> I still think your use of the word "format" to refer to the source
> document is confusing--even to yourself.  Because it's making you
> think that those spaces, in some sense, "don't count" because they
> are "only there for formatting" and "formatting" isn't really part of
> the document content.
> 
> You're wrong about that.  The input is the input, spaces in data content
> of a document have nothing to do with "formatting," and those spaces are
> really there.

Here's a simple test that I use to decide if something is "format" or
"meaningful" to a DOM application.

Exhibit A, XML document

  <book-list>
    <book>Moby Dick</book>
    <book>Bible</book>
  </book-list>

Exhibit B, DOM generation

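  // Build the same two-book list through DOM calls alone.
  bookList = doc.createElement( "book-list" );
  doc.appendChild( bookList );  // attach as the document's root element

  // First book
  book = doc.createElement( "book" );
  title = doc.createTextNode( "Moby Dick" );
  book.appendChild( title );
  bookList.appendChild( book );

  // Second book
  book = doc.createElement( "book" );
  title = doc.createTextNode( "Bible" );
  book.appendChild( title );
  bookList.appendChild( book );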

Now, if Exhibit A and Exhibit B produce the exact same DOM document tree,
then they are equivalent. Anything extra in the parsed document tree is,
for the purposes of the application, redundant. There is no reason why
Exhibit A should produce a DOM structure that is different from the one
Exhibit B produces.
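
For anyone who wants to try the test, here is a minimal sketch. It assumes
Python's xml.dom.minidom as a stand-in DOM implementation, which is not
what this thread was written against, and the helper names
(build_exhibit_b, describe_children, strip_whitespace_only_text) are
hypothetical names made up for illustration. It parses Exhibit A, builds
Exhibit B, and compares the children of the two root elements, before and
after dropping whitespace-only text nodes.

```python
from xml.dom import minidom, Node

# Exhibit A as it would appear in a file, indentation included.
EXHIBIT_A = """<book-list>
  <book>Moby Dick</book>
  <book>Bible</book>
</book-list>"""


def build_exhibit_b():
    """Build the same book list through DOM calls only (Exhibit B)."""
    doc = minidom.Document()
    book_list = doc.createElement("book-list")
    doc.appendChild(book_list)
    for name in ("Moby Dick", "Bible"):
        book = doc.createElement("book")
        book.appendChild(doc.createTextNode(name))
        book_list.appendChild(book)
    return doc


def describe_children(element):
    """Summarize the direct children as (kind, value) pairs."""
    summary = []
    for child in element.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            summary.append(("text", child.data))
        elif child.nodeType == Node.ELEMENT_NODE:
            summary.append(("element", child.tagName))
    return summary


def strip_whitespace_only_text(element):
    """Remove text nodes that contain nothing but white space, recursively."""
    for child in list(element.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            element.removeChild(child)
        elif child.nodeType == Node.ELEMENT_NODE:
            strip_whitespace_only_text(child)


parsed_root = minidom.parseString(EXHIBIT_A).documentElement
built_root = build_exhibit_b().documentElement

# Snapshot of the parsed tree before any cleanup.
parsed_before = describe_children(parsed_root)
built_children = describe_children(built_root)

# Drop the indentation-only text nodes and look again.
strip_whitespace_only_text(parsed_root)
parsed_after = describe_children(parsed_root)
```

With this parser the parsed root starts out with the two book elements
plus the whitespace-only text nodes that come from the indentation, while
the built root has only the two book elements. The two summaries match
only after the whitespace-only nodes are stripped, which is exactly the
difference I am calling "format".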

As for your solution, I don't see the DOM being changed that easily to
get around the problem, nor do I see developers rushing to learn the new
workarounds. But that's just personal opinion.

Arkin

Received on Saturday, 22 May 1999 09:17:01 UTC