Re: Empty paragraphs

From: Lee Passey <lee@novonyx.com>
Date: Tue, 04 Sep 2001 09:13:30 -0600
Message-ID: <3B94EF9A.213114C2@novonyx.com>
To: html-tidy@w3.org


Lee Passey wrote:
> 
> (2) in TrimSpaces(), no check is made for text nodes which have trimmed
> into oblivion.
> 
> I presume newer versions of tidy should include these fixes, so I am
> including here the diffs from the 8-2000 version that I used to
> accomplish this.
> 

I made a slight mistake here; the 8-2000 version of tidy _does_ delete
text nodes which have been trimmed into oblivion _if_ they are not
attached to a td or th node.  (I presume there are good reasons for
maintaining zero length text nodes for those tags, although for the life
of me I can't figure out what they would be).

In any case, here are the revised diffs for parser.c which fix the
mis-trimmed &nbsp; entity.


289d288
<             /*!  NOTE:  &nbsp; is utf-8 encoded as two bytes  */
295,299d293
<                 if (   (unsigned char)(lexer->lexbuf[last->end - 1]) == 0xc2
<                     && c == 0xa0)
<                 {
<                     last->end -= 1;
<                 }
304,308d297
<                 if (   (unsigned char)(lexer->lexbuf[last->end - 1]) == 0xc2
<                     && c == 0xa0)
<                 {
<                     last->end -= 1;
<                 }
390d378
<     {
392,394d379
<         if (text->start == text->end)
<             TrimEmptyElement( lexer, text );
<     }
399d383
<     {
401d384
<     }
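
For anyone reading the diffs without the source at hand, here is a
minimal stand-alone sketch of the idea behind the &nbsp; change.  It is
not tidy's code: the function name trim_trailing_space and its
buffer/offset parameters are made up for illustration, and the bounds
check is my own addition.  It only assumes what the diff itself shows,
namely that a no-break space sits in the buffer as the two bytes 0xc2
0xa0, so dropping just the final byte leaves a stray 0xc2 behind.

```c
#include <stddef.h>

/* Hypothetical helper (not part of tidy): given the text buf[start..end),
   return the new end offset after removing one trailing space.
   A UTF-8 encoded no-break space is the byte pair 0xC2 0xA0, so when the
   last byte is 0xA0 the preceding 0xC2 lead byte is removed as well. */
static size_t trim_trailing_space(const unsigned char *buf,
                                  size_t start, size_t end)
{
    unsigned char c;

    if (end <= start)
        return end;                 /* nothing to trim */

    c = buf[end - 1];

    if (c == ' ' || c == 0xA0)
    {
        end -= 1;                   /* drop the space or the 0xA0 byte */

        /* also drop the 0xC2 lead byte of a two-byte no-break space */
        if (c == 0xA0 && end > start && buf[end - 1] == 0xC2)
            end -= 1;
    }

    return end;
}
```

With the three bytes 'a', 0xc2, 0xa0 and offsets 0 and 3, this returns
1, so the whole no-break space is gone rather than half of it.  A caller
would then compare the returned end against start to see whether the
text has been trimmed away entirely.
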
Received on Tuesday, 4 September 2001 11:11:22 GMT
