Re: Tree construction: Coalescing text nodes

On Nov 13, 2009, at 11:34, Geoffrey Sneddon wrote:

>>> As reported in <http://www.w3.org/Bugs/Public/show_bug.cgi?id=8239>, it is actually impossible for html5lib to implement the spec in all cases, as it is possible to use it with tree models that have no concept of adjacent text nodes (such as ones that are widely used in Python like ElementTree).
>> The failure of putting adjacent text nodes in ElementTree probably isn't interop-sensitive even if text nodes in browsers that support scripting may be. On the flip side, the V.nu streaming SAX mode splits text nodes into arbitrary calls to the characters() callback (in accordance with the SAX API contract).
> 
> But is splitting arbitrarily even conforming?

Taking the position that SAX splitting text is non-conforming would be as silly as taking the position that ElementTree must show non-coalesced text. Per the SAX API contract, it's up to the consumer to coalesce, so in that sense, SAX makes it impossible to communicate mandatory non-coalescing.
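
To illustrate consumer-side coalescing, here is a minimal sketch in Python using the standard library's xml.sax.handler.ContentHandler. The class name CoalescingHandler, the events list, and the simulate_split_delivery helper are made up for this example. The callbacks are invoked by hand to stand in for a producer that splits a text run arbitrarily; this is not how V.nu or html5lib is implemented.

```python
from xml.sax.handler import ContentHandler


class CoalescingHandler(ContentHandler):
    """Buffers characters() chunks and records one string per text run."""

    def __init__(self):
        super().__init__()
        self._buf = []      # chunks of the text run currently being collected
        self.events = []    # (kind, value) tuples seen by the consumer

    def _flush(self):
        # Emit the buffered chunks as a single text event, if there are any.
        if self._buf:
            self.events.append(("text", "".join(self._buf)))
            self._buf = []

    def characters(self, content):
        # The producer may call this any number of times for one text run.
        self._buf.append(content)

    def startElement(self, name, attrs):
        self._flush()
        self.events.append(("start", name))

    def endElement(self, name):
        self._flush()
        self.events.append(("end", name))


def simulate_split_delivery(handler):
    # Hypothetical producer: one logical text run delivered in three pieces.
    handler.startElement("p", {})
    handler.characters("Hel")
    handler.characters("lo, ")
    handler.characters("world")
    handler.endElement("p")


handler = CoalescingHandler()
simulate_split_delivery(handler)
```

After this runs, handler.events holds a single text event for the paragraph. The consumer cannot tell where the producer split the run, which is why SAX cannot carry a required discontinuity.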

Normative text node discontinuities are interop-sensitive only for UAs that expose the DOM API to content-supplied JavaScript. It would be interesting to know which constraints in that area are actually hard Web compat constraints.

On Nov 13, 2009, at 12:06, Geoffrey Sneddon wrote:

> However, I think that such implementations are probably more important in terms of the structure of the DOM created (because they are more likely to support scripting), and as such it seems silly to have anything apart from a single text node in all cases, especially when such implementations can likely have a single text node backed by multiple strings internally.

It isn't necessarily silly to stop short of requiring browsers to coalesce in all cases. Would you make parser-inserted text nodes coalesce into script-created text nodes, or into older parser-created text nodes (ones other than the most recently inserted) that a script has moved around?

Part of the issue here is how much it costs in performance to always inspect the DOM for a pre-existing text node sibling (as opposed to the parser trying to coalesce only in cases where the parser itself knows it may have left an adjacent text node in the tree).
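
The two strategies can be contrasted with a rough sketch in Python. Node, append_text_inspecting, TrackingParser, and text_runs are hypothetical names over a toy tree, not the API of any browser or of html5lib. The tracking variant also ignores scripts that mutate the tree between two parser appends.

```python
class Node:
    def __init__(self, kind, data=""):
        self.kind = kind        # "element" or "text"
        self.data = data        # tag name for elements, characters for text
        self.children = []


def append_text_inspecting(parent, data):
    """Always inspect the tree: merge into the last child if it is text."""
    last = parent.children[-1] if parent.children else None
    if last is not None and last.kind == "text":
        last.data += data
    else:
        parent.children.append(Node("text", data))


class TrackingParser:
    """Merges only into a text node this parser knows it inserted last."""

    def __init__(self):
        self._parent = None     # parent of the remembered text node
        self._text = None       # text node the parser inserted most recently

    def append_element(self, parent, name):
        # Inserting an element ends the current text run.
        self._parent = self._text = None
        element = Node("element", name)
        parent.children.append(element)
        return element

    def append_text(self, parent, data):
        if self._text is not None and self._parent is parent:
            self._text.data += data          # no look at the tree needed
        else:
            self._text = Node("text", data)
            self._parent = parent
            parent.children.append(self._text)


def text_runs(parent):
    return [child.data for child in parent.children if child.kind == "text"]


# In both cases a script has already added a text node the parser did not create.
a = Node("element", "div")
a.children.append(Node("text", "script "))
append_text_inspecting(a, "parser")
append_text_inspecting(a, " more")

b = Node("element", "div")
b.children.append(Node("text", "script "))
parser = TrackingParser()
parser.append_text(b, "parser")
parser.append_text(b, " more")
```

With inspection, text_runs(a) ends up as one node that includes the script-created text. With tracking, text_runs(b) keeps the script-created node separate and coalesces only the parser's own two appends.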

(I don't know if the perf cost matters.)

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Friday, 13 November 2009 11:58:54 UTC