Re: Tidy problems

From: <html-tidy@war-of-the-worlds.org>
Date: Mon, 8 May 2000 11:30:41 -0500
Message-Id: <p04310100b53c8e8736be@[216.229.13.10]>
To: html-tidy@w3.org
Daniel Persson <danpe271@student.liu.se> wrote:

>3) A line like (which of course is not correct HTML, but anyway):
>  <b><font>bold</b><br>plain<br></font>
>  Gives the result:
>  <b><font>bold</font><br>plain<br></b>

Which appears to be a swapping of the end tags to match the order of the
start tags without consideration of the actual effects of the original
order.

>  Instead of interpreting it as Netscape, something like:
>  <font><b>bold</b><br>plain<br></font>

Which is a switching of the opening tags.  In practice, one can't just
blindly swap tags as Tidy does unless the two tags being swapped are
adjacent to each other.  Otherwise you clearly alter the effects of the
tags.

I'd expect:

   <b><font>bold</b><br>plain<br></font>

to change to:

   <b><font>bold</font></b><font><br>plain<br></font>

which introduces a new </font> where it should have been closed and a new
<font> to reopen it after the bold closes.  Then a second efficiency pass
could recognize the </font><font> sequence (with intervening tags/content
that aren't affected by font tags) and unify the fonts, changing it to:

   <font><b>bold</b><br>plain</font><br>

which reverses the nesting.  (<br> isn't affected by <font> unless it is
paired with another <br> to produce a blank line, so the trailing one
slips out; it is also valid to compress adjacent <br>s to one, or possibly
to convert them to paragraphs.)  Basically, make sure the tags are closed
in the proper order (reopening as needed to preserve intent), and then
reorder the lot to find the best alternative markup.
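
To make the first pass concrete, here is a rough sketch in Python of the
"close in proper order, reopen as needed" step.  It is not how Tidy works
internally; the function name fix_nesting, the regex tokenizer, and the
short VOID list are all assumptions made for illustration, and it ignores
comments, block-level elements, and the second (merging) pass entirely.

```python
import re

# Assumption: a tiny, partial list of elements that never take an end tag.
VOID = {"br", "img", "hr"}

# Very rough tokenizer: a start/end tag, or a run of text.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>|([^<]+)")

def fix_nesting(html):
    """Close misnested tags in stack order and reopen them afterwards."""
    out = []
    stack = []  # open elements as (name, original start-tag text)
    for m in TOKEN.finditer(html):
        closing, name, _attrs, text = m.groups()
        if text is not None:
            out.append(text)
            continue
        name = name.lower()
        if name in VOID:
            out.append(m.group(0))
        elif not closing:
            stack.append((name, m.group(0)))
            out.append(m.group(0))
        else:
            names = [n for n, _ in stack]
            if name not in names:
                continue  # stray end tag with no open element: drop it
            # Index of the innermost open element with this name.
            idx = len(names) - 1 - names[::-1].index(name)
            reopened = stack[idx + 1:]
            # Close everything opened after it, innermost first.
            for n, _ in reversed(reopened):
                out.append(f"</{n}>")
            out.append(f"</{name}>")
            del stack[idx:]
            # Reopen those elements in their original order.
            for n, tag in reopened:
                out.append(tag)
                stack.append((n, tag))
    # Close anything still open at end of input.
    for n, _ in reversed(stack):
        out.append(f"</{n}>")
    return "".join(out)

print(fix_nesting("<b><font>bold</b><br>plain<br></font>"))
# prints <b><font>bold</font></b><font><br>plain<br></font>
```

Reopening with the original start-tag text keeps any attributes on the
reopened <font>, which is what preserving intent requires.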

All bets are off if inheriting styles are applied, but then in practice,
using style requires valid markup.  (Actually, for application of style,
Netscape treats all closing tags as equivalent without regard to matching
correct end tag with start tag.  Or at least 4.x does.  Haven't tried
6.0beta.)

>4) Unfinished tags, causes the next tag to be interpreted as text,
>   instead of as in netscape, correcting the tag. An example:
>   <img src="link"<br>
>   Gives the result:
>   <img src="link">br&gt
>   Instead of, as interpreted by Netscape:
>   <img src=link"><br>

Actually, wouldn't Netscape give you:

    <img src="link">

without any text following it?

>Some functionality that I would like to see in Tidy:
>
>* An "Ugly print" option. Skipping all linebreaks and blanks,
>  making the resulting file as compact as possible.

Idon'tthinkyoureallymeantskippingALLblanks,didyou?:-)

The most you can do is convert each newline to a space, then compress
consecutive spaces to single spaces, to avoid altering the output.
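
As a minimal sketch of that rule in Python (the function name compact is
made up for illustration, and it deliberately ignores <pre> blocks and
quoted attribute values, where whitespace must be left alone):

```python
import re

def compact(text):
    """Turn each newline into a space, then squeeze runs of spaces to one."""
    text = re.sub(r"\r\n?|\n", " ", text)  # CRLF, CR, or LF -> one space
    return re.sub(r" +", " ", text)        # collapse consecutive spaces

print(repr(compact("<p>one\n   two   three</p>\n")))
# prints '<p>one two three</p> '
```

Words stay separated by at least one space, so the rendered output should
not change outside the cases excluded above.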
-- 
         ,=<#)-=#  <http://www.war-of-the-worlds.org/>
    ,_--//--_,
 _-~_-(####)-_~-_  "Did you see that Parkins boy's body in the tunnels?" "Just
(#>_--'~--~`--_<#)  the photos.  Worst thing I've ever seen; kid had no face."
Received on Monday, 8 May 2000 12:30:53 GMT