Re: Special group of functions for NON-ELEMENT nodes in document from Ray Whitmer on 2006-09-27 (www-dom@w3.org from July to September 2006)

From: Ray Whitmer <ray@xmission.com>
Date: Wed, 27 Sep 2006 08:51:23 -0600 (MDT)
To: Matt Dockerty <news@nistrum.co.uk>
cc: Ray David Whitmer <ray@jhax.net>, www-dom@w3.org
Message-ID: <Pine.LNX.4.64.0609270822400.21081@xmission.xmission.com>

On Tue, 26 Sep 2006, Matt Dockerty wrote:

> On 26 Sep 2006, at 12:13, Ray David Whitmer wrote:
> Agreed. I think the slack has to be tightened because users can currently 
> (inadvertently or otherwise) write non-portable, yet valid, DOM programs.

It would be nice for some, but it is not likely to happen.

> normalizeDocument or normalize() seemed like the perfect answer but calling 
> it crashes Internet Explorer 6 immediately meaning its use on real Web pages 
> is out, until IE6 is out of the picture anyway.

In case you wondered, crashing is not the documented behavior.  Until IE6 is
out of the picture, how would a different standard solve anything?  Perhaps
other vendors should emulate IE6 right down to the crashes just so we can
all get along?

> Taking something common and hierarchical like a treeview or a tiered menu 
> system and doing processing with DOM methods would be so much more useful 
> than any ID based approach. Currently, the only way to do this is by using a 
> suite of helper functions, much like old-school DHTML. I'm disappointed that, 
> in practice, a standards-based approach has brought no consistency to this 
> important area.

I think you are over-eager to dismiss things.  IDs can be hierarchically
structured and accessed.  There may be times when it is impractical to place
IDs into a document, but it is not such a bad solution.
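
As a rough illustration of what hierarchically structured IDs could look
like, here is a minimal sketch using Python's xml.dom.minidom.  The dotted
naming scheme ("menu.file.open"), the sample markup, and the helper names
index_by_id and children_of are all assumptions made for the example; nothing
in the DOM specification prescribes them.

```python
from xml.dom import minidom

# Hypothetical menu markup whose IDs encode the hierarchy with dots.
SOURCE = """<ul id="menu">
  <li id="menu.file">File
    <ul>
      <li id="menu.file.open">Open</li>
      <li id="menu.file.save">Save</li>
    </ul>
  </li>
  <li id="menu.edit">Edit</li>
</ul>"""


def index_by_id(doc):
    """Map every id attribute in the document to its element."""
    return {
        el.getAttribute("id"): el
        for el in doc.getElementsByTagName("*")
        if el.hasAttribute("id")
    }


def children_of(index, parent_id):
    """Return the IDs exactly one level below parent_id, sorted."""
    prefix = parent_id + "."
    return sorted(
        i for i in index
        if i.startswith(prefix) and "." not in i[len(prefix):]
    )


doc = minidom.parseString(SOURCE)
index = index_by_id(doc)
print(children_of(index, "menu"))
print(children_of(index, "menu.file"))
```

The lookup never touches firstChild or nextSibling, so it does not care
whether a given implementation keeps or discards whitespace text nodes.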

> They would curse a DOM which serialized differently from another DOM 
> certainly, which is the current situation.

So, as with the crashing normalizeDocument, you would rather have it produce
a document all on one line with no whitespace left in element content than
keep the whitespace around?  Sorry, I do not agree with you.  Not that IE 6
even implements the serialization standard in the first place.

> Despite the fact that firstChild returning an empty whitespace node makes no 
> sense for browser application developers, I can see the use for this node in 
> pre-formatted node content, pretty-printers, IDEs etc. If there was only one 
> true way to implement the DOM, I would say 'with whitespace' would have to be 
> it, but that now leaves a legacy of browsers that don't work like that.

If Firefox returned an empty node, it would be violating the spec, but an empty 
node contains no whitespace, so how would it be called a whitespace node in the 
first place?  You have confused things.
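
To make the distinction concrete, here is a minimal sketch using Python's
xml.dom.minidom, which happens to be an implementation that keeps whitespace
text nodes.  The sample markup and the helper name first_element_child are
assumptions for illustration only.  The node returned by firstChild is a text
node whose data is the whitespace itself, so it is not empty.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Indented markup: whitespace sits between <ul> and the first <li>.
doc = minidom.parseString("<ul>\n  <li>one</li>\n  <li>two</li>\n</ul>")
root = doc.documentElement

# firstChild sees all nodes, so it returns the whitespace text node.
first = root.firstChild
print(first.nodeType == Node.TEXT_NODE, repr(first.data))


def first_element_child(node):
    """Skip non-element children and return the first element, or None."""
    child = node.firstChild
    while child is not None and child.nodeType != Node.ELEMENT_NODE:
        child = child.nextSibling
    return child


print(first_element_child(root).tagName)
```

A helper like this behaves the same whether or not the implementation
discards whitespace, which is about the best a portable program can do under
the current wording.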

>> There is really nothing to debate about the existing standard.  It is
>> what it is and the accessors see all nodes, not just elements.
>
> Where does it state that please? All I could find in the spec was a statement 
> that these nodes 'may' be retained. I think that a 'must' or a 'should' would 
> have prevented this inconsistency.

Yes, clearing it up would have initially prevented this confusion.  But that
does nothing to help your argument about today's specification.  The methods
return Nodes, not Elements as was being suggested.  There has never been
anything in the specification requiring implementors to understand the DTD
or schema of the document they are parsing.  That is why XML was invented, so
that it could be processed without knowing a DTD.  XML is a major focus of
the DOM.  Generating an HTML DOM requires somewhat greater knowledge of
certain HTML tags, but HTML has always been a target for tags that vendors
or users just threw in.  Knowing whether a tag has element content or mixed
content is extremely difficult when there is no governing schema that is
strictly enforced.  Thus, the only things the specification could say are:

1.  Vendors are forbidden from discarding any whitespace,
2.  Vendors may choose whether to discard whitespace,
3.  Vendors must have a good schema in all cases before parsing HTML.

1. would have been a good choice, but 2. is what the browser vendors were
willing to agree to.  If the vendors will not agree to a specification, it
doesn't matter how much you wish it were agreed to.

It is what it is, and tightening it in that way now would violate the trust 
of all who tried to be compatible with it.  You need further agreements 
between vendors.

Ray Whitmer
Received on Wednesday, 27 September 2006 14:51:42 UTC