
HTML Tidy bug (general) + Mac Tidy (BBTidy) suggestion

From: Stanton McCandlish <mech@eff.org>
Date: Sat, 29 Apr 2000 23:57:26 -0400 (EDT)
Message-Id: <v04220809b53127c6d9db@[204.253.162.21]>
To: html-tidy@w3.org, teague@mailandnews.com

I'm using the BBTidy version (MacOS, BBEdit plug-in).

Running it on this snippet of code, from a larger page:


******** [BEGIN SNIPPET] *********

<BR>

   <!-- _-_-_-_-_-_-_-_ BOTTOM JUNK AREA _-_-_-_-_-_-_-_  -->

<SPAN CLASS="SM">
   <P ALIGN="LEFT"><SPAN CLASS="subhead-grn">EFF's Hot Topics:</SPAN><BR>
		<A HREF="/pub/Censorship">Censorship &amp; Free
Expression</A>&nbsp;&nbsp;-&nbsp;&nbsp;
		<A
HREF="/pub/Censorship/Ratings_filters_labelling">Content
filtering</A>&nbsp;&nbsp;-&nbsp;&nbsp;
		<A
HREF="/pub/Intellectual_property/Video/">DVDs</A>&nbsp;&nbsp;-&nbsp;&nbsp;
		<A
HREF="/pub/Privacy/Crypto">Encryption</A>&nbsp;&nbsp;-&nbsp;&nbsp;
		<A HREF="/pub/Privacy/Surveillance">Digital
Surveillance</A>&nbsp;&nbsp;-&nbsp;&nbsp;
		<A HREF="/pub/Privacy/Medical">Medical
Privacy</A>&nbsp;&nbsp;-&nbsp;&nbsp;
		<A HREF="/pub/Intellectual_property">Online Copyright
&amp; Fair Use</A>&nbsp;&nbsp;-&nbsp;&nbsp;
		<A HREF="/pub/Infrastructure/DNS_control">DNS &amp;
human rights</A>&nbsp;&nbsp;-&nbsp;&nbsp;
		<A HREF="/pub/Spam_cybersquatting_abuse/Spam">Junk
e-mail (spam)</A>&nbsp;&nbsp;-&nbsp;&nbsp;
		<A
HREF="/pub/Spam_cybersquatting_abuse/Cybersquatting">Cybersquatting</A></P>
   <P ALIGN="LEFT"><SPAN CLASS="subhead-grn">EFF's Affiliations and
Coalitions:</SPAN><BR>
    <A HREF="http://www.gilc.org">Global Internet Liberty Campaign
(GILC)</A><BR>
    <A HREF="http://www.ifea.net">Internet Free Expression Alliance
(IFEA)</A><BR>
    <A HREF="http://www.dfc.org">Digital Future Coalition (DFC)</A><BR>
    <A HREF="http://www.freeexpression.org">Free Expression Network
(FEN)</A><BR>
    <A HREF="http://www.truste.org">TRUSTe Privacy Policy Certification
Program</A>
</P>
   </SPAN>


<!-- _-_-_-_-_-_-_-_ END BOTTOM JUNK AREA _-_-_-_-_-_-_-_  -->

<BR>


******** [END SNIPPET] *********


you get these errors:


******** [BEGIN ERRORS] *********

BBTidy (vers 13th January 2000) Parsing "index.html"
line 290 column 1 - Warning: inserting missing 'title' element
 [the above isn't relevant here - the real full doc. does have TITLE]
line 295 column 5 - Warning: missing </span> before <p>
line 295 column 5 - Warning: trimming empty <span>
line 313 column 4 - Warning: discarding unexpected </span>

******** [END ERRORS] *********


I've managed to generate similar errors with other text+markup in the form:

<SPAN CLASS="x"><P><SPAN CLASS="y">text</SPAN></P></SPAN>

(That very example yields the same errors. Tidy doesn't realize that
the first span is enclosing the <P>...</P>, and doesn't recognize that
the second span has any attributes at all.)
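
To hunt down other places on a page with the same nesting, something
like the following might do. It is only a rough sketch, not how Tidy
itself works: it uses Python's standard html.parser module, and the
INLINE, BLOCK and VOID sets, the NestingChecker class and the
find_block_in_inline function are all names and groupings assumed
here for illustration (a small subset of elements, not the full
lists). It reports each block-level start tag found while an inline
element is still open.

```python
from html.parser import HTMLParser

# Assumed, deliberately incomplete element groupings for illustration.
INLINE = {"span", "a", "b", "i", "em", "strong", "font"}
BLOCK = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6",
         "ul", "ol", "table", "blockquote", "pre"}
VOID = {"br", "img", "hr", "input", "meta", "link"}


class NestingChecker(HTMLParser):
    """Records block-level elements opened inside an inline element."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open elements, outermost first
        self.problems = []   # (line, block_tag, enclosing_inline_tag)

    def handle_starttag(self, tag, attrs):
        # html.parser hands us tag names already lowercased.
        if tag in BLOCK:
            inline_parents = [t for t in self.stack if t in INLINE]
            if inline_parents:
                line, _column = self.getpos()
                self.problems.append((line, tag, inline_parents[-1]))
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one;
        # stray end tags are ignored.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def find_block_in_inline(html):
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    sample = '<SPAN CLASS="x"><P><SPAN CLASS="y">text</SPAN></P></SPAN>'
    for line, block, inline in find_block_in_inline(sample):
        print("line %d: <%s> opened inside <%s>" % (line, block, inline))
```

Run on the one-line example above, it flags the <P> as opened inside
the outer <SPAN>; with the outer element changed to a <DIV> it
reports nothing.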

I suppose one can argue that <SPAN>...</SPAN> is only intended to be
used within block elements and never surrounding them, though I don't
see that specified anywhere in the HTML 4.01 spec. (I'd have to argue
against it if it is in there: the element is widely used, and widely
interpreted by agents, as having no such restriction. Besides,
<DIV>...</DIV> typically has undesirable spacing effects, with a
linebreak before and after as if it were an <H#>...</H#>, which in
many cases leaves no real option but <SPAN>...</SPAN> for applying
CLASS information w/o otherwise altering appearances.)

All that aside, Tidy's failure to notice attributes on the second span
does seem to be a bug. :)

Despite that, it's a way-nifty program, and helped me resolve a number
of minor errors, and do <SCRIPT...>...</SCRIPT> definitions better
(using TYPE instead of LANGUAGE; I'd never noticed that before).  So,
thanks. :)

PS: Regarding the BBEdit implementation: it'd be great if BBTidy were
clever enough to realize when it is operating on a selection rather
than the whole document, and not try to add missing <TITLE>s or other
stuff.  Otherwise the ability to load the changes directly into the
original document isn't very useful.


--
Stanton McCandlish      mech@eff.org       http://www.eff.org/~mech
Communications Coordinator & Webmaster, Electronic Frontier Foundation
voice: +1 415 436 9333 x105   fax: +1 415 436 9993   ICQ: 16631335
PGPfone: 204.253.162.21  ICQ Pager: http://wwp.mirabilis.com/16631335#pager
Received on Sunday, 30 April 2000 10:43:30 GMT
