Re: Comments and questions on text node interface from Ray Whitmer on 2002-03-14 (www-dom@w3.org from January to March 2002)

From: Ray Whitmer <rayw@netscape.com>
Date: Thu, 14 Mar 2002 08:39:05 -0500
To: Jonas Sicking <sicking@bigfoot.com>
CC: www-dom@w3.org
Message-ID: <3C90A7F9.3080501@netscape.com>
Jonas Sicking wrote:

>>>I have to ask what is the intended use for replaceWholeText? I can
>>>
>defenetly
>
>>>see the use getting WholeText since it fits better with XPath. But
>>>
>chainging
>
>>>it feels very awkward since it could morph big parts of tree. I think a
>>>better way of editing a tree using an "XPath/infoset model" would be to
>>>
>have
>
>>>users normalize the tree first and then use the normal interfaces.
>>>
>>>/ Jonas Sicking
>>>
>>Implementations which do not support entity / entity reference
>>constructs could clearly normalize such that there is no need for
>>wholeText / replaceWholeText, but entities and entity references have
>>not been deprecated.
>>
>
>Sorry, for a moment there i thought that normalize would expand entity
>references. So i guess that my answer is that we need a normalize-like
>method that expands entity references as well as merging textnodes.
>
I think this makes an assumption that it is OK to force the user to 
expand them -- that it is not a valid model to preserve the entity 
references while doing XPath processing.  That is an easy statement for 
someone using a DOM which doesn't put them in in the first place, but if 
they are there for a reason, it is not obvious why they are suddenly not 
needed just because the user wanted to use XPath.  If they are really 
not needed, then the answer would be deprecation, and save everyone lots 
of trouble.

>>Are you asking to deprecate EntityReference nodes
>>(those with an expanded hierarchy beneath them -- unexpanded entities
>>must still be supported for infoset compliance, not to mention correct
>>display of XHTML which requires special display of unexpanded entity
>>references)?
>>
>
>Not at all, all I'm asking for is some way to expand them, both the ones
>containg textnodes and the ones containg other node-types. (Well.. I do
>think that it would be nice if entityreferences were depecated, but that's
>an entierly different discussion.)
>
>But this brings up an interesting question: what happens with wholeText and
>XPath when there are unexpanded entityreferences in a tree? Should they
>consider text on either side as separate "text-sections", or should the
>entity be ignored and the text "merged"?
>
In real use cases, forcing the user to expand them before he uses XPath 
seems very much like deprecating them.  It assumes there is no need for 
the user to keep the entity reference nodes in the tree during and after 
XPath processing.

The specification says quite specifically, I believe, what happens with 
wholeText and replaceWholeText.  It does the right thing and merges 
across entity reference boundaries.

>>Or would you declare that XPath just doesn't work on DOM
>>implementations which use them?  That might not be a fair treatment
>>without officially deprecating them.  It would seem like harsh treatment
>>after the fact to say to implementations that chose to support them that
>>XPath functionality depends upon them not supporting them -- without at
>>least an official deprecation.
>>
>>Also, should users really be required to normalize before using the
>>XPath, or is it reasonable to use XPath on an unnormalized tree?
>> Normalization may not be a trivial requirement if XPath is being used
>>repeatedly as part of a process of mutating a document.  Most built-in
>>functions of DOM do not requires the user to normalize before calling.
>>
>
>XPath never changes the tree so it has no need for a replaceWholeText
>method. Similary I'm not objecting to the wholeText property since that is
>needed for clients using XPath.
>
>The reason that I don't like replaceWholeText is that it basically changes
>underlying data through a view. Adding views to "abstrahize" data into a
>more convinient form is fine, however using that view to change the
>underlying data often leads to complex implementations and non-obvious or
>un-expected behaviour.
>
There are lots of things that XPath does not do because, unfortunately, 
the XPath specification has divorced itself as much as possible from DOM 
and document mutation.  I can assure you, however, that significant use 
cases exist for XPath in DOM finding parts of the document, including 
text, so that it can be modified, updated, deleted, etc.  Were that not 
the case, there would be no need for a snapshot or an invalidating 
iterator -- just make what happens under mutation implementation 
specific since no one should ber doing it -- but the presumption is 
false.  We spent time on it because DOMis very concerned about enabling 
applications to modify the document and users will want to do that.

For example, find all occurrances of the word "foo" in emphasized text 
and change them to "bar".  Without both wholeText and replaceWholeText, 
it would be quite difficult to do a simple thing like that, and arguably 
XPath doesn't help much because of all the other processing and manual 
navigation you have to do anyway.  But with it,  I request all the 
logical text pieces that are emphasized, look for occurances of "foo", 
if there are any substitute "bar" and I don't have to worry about what 
happens if the word "foo" is half in one text node and half in the other 
or if some part of the word is part of an entity reference.  There are 
certain cases where replacement is impossible  (raises 
NO_MODIFICATION_ALLOWED error) but those are all automatically detected.

Ray Whitmer
rayw@netscape.com
Received on Thursday, 14 March 2002 11:39:32 UTC