Re: Tree construction: Coalescing text nodes

On 11/11/09 7:12 AM, Geoffrey Sneddon wrote:
> However, given an implementation like that of all three ports of
> html5lib (Python, PHP, Ruby), the following would equally lead to O(n^2)
> behaviour given the above conditions (i.e., a immutable string type like
> that of Python):
>
> a</x>a</x>a</x>a</x>a</x>a</x>a</x>a</x>a</x>a</x>a</x>a</x>a etc.

There seems to be an assumption here that a Text node has to store all 
its text as a single string in the underlying implementation language, 
right.  That doesn't seem like a hard requirement to me, quite apart 
from the other points you raised.

> In short, I think no change to the spec can actually fix the possibility
> of an O(n^2) algorithm being used in an implementation: implementations,
> as ever, have to take their own care to make sure they are not
> vulnerable to any possible DOS attack.

Absolutely.

> Furthermore, I don't believe this to be the only way to solve the
> problem in the table case: a better solution would be to just always
> concatenate text nodes by creating a list of strings and concatenating
> them when the next non-text non-table node is inserted, or when the
> table is moved off the stack of open elements if a table follows. This
> would solve the problem for all of the above issues, not just for the
> table foster-parenting issue.

Yep.

-Boris

Received on Wednesday, 11 November 2009 15:15:24 UTC