Re: Trimming spaces and dropping empty paragraphs.

* Lee Passey wrote:
>Bjoern Hoehrmann wrote:
>
>> This might be considered a bug, Tidy should produce a canonical version
>> of the document (equal settings => equal result, no matter how often you
>> apply these rules) and here it doesn't. I vote for fixing it, your
>> example and the result after cleaning it two times render the same in
>> current browsers.
>
>After pursuing several false paths, I think I have come up with a very small
>change which will solve most, if not all, of these problems.  The apparent
>theory of operation is that if spaces in a text node are trimmed to the point
>where the node no longer contains any text, the node should be removed from
>the tree.  This removal occurs in the parser.c in the function
>TrimTrailingSpace().  However, the test of whether to remove the node is
>inside the conditional (last->end > last->start), so if the node enters the
>function already empty it will not be removed.  This situation can occur, for
>example, when you have an empty space, bracketed by inline tags such as <em>,
>inside a block, such as paragraphs, e.g:
>
><p><em> </em></p>
>
>In this case TrimInitialSpace() has incremented node->start before
>TrimTrailingSpace is called, so the node is now empty, but has not been
>removed.  When the resulting text is printed it appears as:
>
><p><em></em></p>
>
>the space has been removed, but the tags are intact.  Running this through
>tidy a second time causes the (now) empty paragraph to be removed.
>
>The simple fix to this is to split the conditional statement into a test for
>a text node and a test for content, and then placing the test for removing
>the node inside the first block but outside the second.  Here are the diffs
>to implement the fix:
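
For anyone reading this without the diffs at hand, the change described above can be sketched roughly as follows. This is not the actual parser.c code: SketchNode, its fields, and both function names are hypothetical stand-ins, and the text is modelled as a half-open range [start, end) into a buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a text node; not Tidy's Node. */
typedef struct {
    bool   is_text;    /* true for text nodes */
    size_t start;      /* index of first character in the buffer */
    size_t end;        /* one past the last character */
    bool   discarded;  /* set when the node would be removed from the tree */
} SketchNode;

/* Shape of the problem: the emptiness check sits inside the
   "has content" test, so a node that arrives empty is never discarded. */
void trim_trailing_nested(const char *buf, SketchNode *last)
{
    assert(buf != NULL && last != NULL);
    if (last->is_text && last->end > last->start) {
        if (buf[last->end - 1] == ' ')
            last->end--;
        if (last->start == last->end)
            last->discarded = true;
    }
}

/* Shape of the fix: test for a text node first, trim only if there is
   content, and run the emptiness check outside the content test. */
void trim_trailing_split(const char *buf, SketchNode *last)
{
    assert(buf != NULL && last != NULL);
    if (last->is_text) {
        if (last->end > last->start) {
            if (buf[last->end - 1] == ' ')
                last->end--;
        }
        if (last->start == last->end)
            last->discarded = true;
    }
}
```

With a node whose start has already been advanced past its only space (start == end on entry), the nested version leaves the node in place, while the split version discards it on the first pass.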

Has this been implemented? Tidy no longer shows this behaviour.
However, while trying to fix this bug, I noticed that in parser.c
TrimSpaces() calls are typically followed by a TrimEmptyElement() call,
but there are cases where this does not happen. This looks like a bug;
it would be a good idea to move the TrimEmptyElement() call into
TrimSpaces() and delete all the separate calls.
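
To make the proposal concrete, here is a minimal sketch of the idea, assuming a much simpler data model than Tidy's. SketchElement and the sketch_* functions are hypothetical and do not mirror the real TrimSpaces() or TrimEmptyElement() signatures; the point is only that the cleanup step lives in one place, so no call site can forget it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element holding one run of text as [start, end). */
typedef struct {
    const char *text;
    size_t      start;
    size_t      end;
    bool        removed;  /* set when the element would be dropped */
} SketchElement;

/* Trim leading and trailing spaces only; no cleanup. */
void sketch_trim_spaces(SketchElement *e)
{
    assert(e != NULL && e->text != NULL);
    while (e->start < e->end && e->text[e->start] == ' ')
        e->start++;
    while (e->end > e->start && e->text[e->end - 1] == ' ')
        e->end--;
}

/* Drop the element if nothing is left in it. */
void sketch_trim_empty_element(SketchElement *e)
{
    assert(e != NULL);
    if (e->start == e->end)
        e->removed = true;
}

/* Proposed shape: trimming always ends with the empty-element check,
   so callers make one call instead of remembering two. */
void sketch_trim_spaces_and_prune(SketchElement *e)
{
    sketch_trim_spaces(e);
    sketch_trim_empty_element(e);
}
```

A caller that uses only sketch_trim_spaces() on an all-space element leaves an empty element behind, which is the kind of omission the separate calls allow; the combined function removes it in the same pass.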

Comments?

Regards.

Received on Friday, 12 April 2002 03:52:46 UTC