- From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
- Date: Wed, 13 Feb 2002 13:22:49 +1300 (NZDT)
- To: bfowler@ewitness.co.uk, html-tidy@w3.org, ok@atlas.otago.ac.nz
bfowler@ewitness.co.uk (ewitness - Ben Fowler) quotes the example
<body>
<li>1st list item
<li>2nd list item
being mapped to
<body>
<ul>
<li>1st list item</li>
<li>2nd list item</li>
</ul>
>That rule being adopted, Tidy could never repair anything at all.
The docs give ten examples, including the <ul> container element
that I mentioned earlier
This is, on the contrary, an excellent example of Tidy making a guess
about the correction, a guess which cannot be relied on in general.
(A) It's true that <ul> will fit here, but so will <ol>.
The heuristic is "this page probably looked OK in some browser;
it would not have had numbers, so the author probably didn't want
numbers." But it's only a heuristic; if someone types HTML manually
and runs Tidy on it _before_ viewing it in a browser, it is quite
likely to be the wrong change.
It's _still_ a good starting point for manual completion of the repair
even if it _is_ wrong.
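To make point (A) concrete, here is a minimal sketch in Python. It is not Tidy's code: the function name wrap_orphan_items and its container parameter are made up for illustration. It only shows that the same orphaned items can be wrapped equally well in either container, so the choice between them has to come from a guess.

```python
def wrap_orphan_items(items, container="ul"):
    """Wrap bare <li> items in a list container (hypothetical helper).

    Both "ul" and "ol" give well-formed markup; nothing in the input
    says which one the author meant, so the default is only a guess.
    """
    if container not in ("ul", "ol"):
        raise ValueError("container must be 'ul' or 'ol'")
    lines = [f"<{container}>"]
    lines += [f"<li>{text}</li>" for text in items]
    lines.append(f"</{container}>")
    return "\n".join(lines)


items = ["1st list item", "2nd list item"]
as_bullets = wrap_orphan_items(items)        # the repair quoted above
as_numbers = wrap_orphan_items(items, "ol")  # just as well-formed
```

Either result is a reasonable starting point; only the author can say which one was intended.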
(B) Let's generalise the example a little bit:
<ul>
<li>One item
<li>Another item.
<!-- the </ul> was supposed to be here -->
This is supposed to be a new paragraph.
<p>And so is this.
Tidy will convert this to
<ul>
<li>One item</li>
<li>Another item.
<!-- the </ul> was supposed to be here -->
This is supposed to be a new paragraph.</li>
<p>And so is this.</p>
instead of to
<ul>
<li>One item</li>
<li>Another item.
<!-- the </ul> was supposed to be here --></li>
</ul>
This is supposed to be a new paragraph.
<p>And so is this.</p>
Any time that an element (such as <ul>) can be followed by material
that would be allowed inside it (possibly inside some nest of
descendant elements that have omissible end-tags, like <li>), it is
impossible for Tidy to be sure where to put the end-tags. Placing
them as far to the right as possible is a good rule, and the result
is a good starting-point for manual correction, but it is only a
heuristic: not only can it go wrong, it does go wrong.
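The ambiguity can be sketched with a toy model in Python. This is an illustration under simplifying assumptions, not Tidy's algorithm: the function candidate_closings is hypothetical, and it assumes every chunk after the last item's text is something that would be allowed either inside the <li> or after the </ul>. It enumerates every well-formed reading of a list whose end-tag is missing.

```python
def candidate_closings(list_items, trailing):
    """Return each well-formed reading of a <ul> with a missing end-tag.

    list_items: the text of each <li>, in order
    trailing:   chunks found after the last item's text; each one is
                assumed legal both inside the <li> and after </ul>
    """
    readings = []
    for k in range(len(trailing) + 1):
        # The first k trailing chunks stay in the last <li>;
        # the remaining chunks follow the closing </ul>.
        inside, outside = trailing[:k], trailing[k:]
        items = list(list_items)
        items[-1] = " ".join([items[-1]] + inside)
        body = "".join(f"<li>{text}</li>" for text in items)
        readings.append("<ul>" + body + "</ul>" + "".join(outside))
    return readings


readings = candidate_closings(
    ["One item", "Another item."],
    ["This is supposed to be a new paragraph.", "<p>And so is this.</p>"],
)
earliest = readings[0]   # end-tags as far left as possible
latest = readings[-1]    # end-tags as far right as possible
```

In this model there is one reading per split point, all equally well-formed. The "as far right as possible" rule picks the last of them, whereas the author of the example above meant the first.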
The <td nowrap>... example is very similar to (B): a missing right
bracket, and something following the place where the right bracket
should have been that would have been legal before the right bracket.
There is no perfect rule for where to place the ">", but the rule that
has been proposed is quite as good as the rule for restoring missing
</ul> end-tags.
Received on Tuesday, 12 February 2002 19:22:54 UTC