Repairing incorrect tag minimisation (was Re: Tags lacking a terminating '>' are spotted)

At 1:24 pm +1300 7/2/02, Richard A. O'Keefe wrote:
>"Fred Bone" <Fred.Bone@dial.pipex.com> wrote:
>	OK, you tell me where to put the > in this:
>
>	<td nowrap is a deprecated attribute for table data cell elements ...
>
>A simple rule works reasonably well:
>     Start with <tag
>     while the next tokens are
>	<word> =   (any word)
>     OR  <word>     (only if word is a known single-enumeration-value
>		    attribute name)
>	process next attribute
>     If /> process empty tag, else
>     if > process start tag, else
>     insert > and process start tag.
>
>The example would become
>
>	<td nowrap> is a deprecated attribute for table data cell elements ...
>		  ^
>I _think_ the complete list of known single-enumeration-value attributes is
>     checked, compact, declare, defer, disabled, ismap, multiple, nohref,
>     noresize, noshade, nowrap, readonly, selected.
>It's not a long list.  If someone wants to add words to this list for the
>sake of their own extra tags, they should declare them explicitly.
>
>Consider the following example:
>     <html><body>
>     <table>
>     <tr>
>     <td nowrap  is a deprecated attribute for table data cell elements
>     <tr>
>     <td compact is not deprecated at all
>     </table>
>     </body></html>
>In Netscape 4.7, that displays as a blank page.
>Adopting this little rule, we get
>     <html><body>
>     <table>
>     <tr>
>     <td nowrap>  is a deprecated attribute for table data cell elements
>     <tr>
>     <td compact> is not deprecated at all
>     </table>
>     </body></html>
>which is admittedly missing two words of intended text from the display,
>but that's a *huge* improvement over missing _everything_!

I can't find anything to snip.
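
For reference, here is roughly how I read the quoted rule, as a minimal
Python sketch. This is not Tidy's code: the names repair_start_tag and
BOOLEAN_ATTRS are made up for illustration, the attribute list is the
one quoted above, and the regular expressions are a simplification of
real attribute syntax.

```python
import re

# Single-enumeration-value ("boolean") attributes, as listed in the quoted message.
BOOLEAN_ATTRS = {
    "checked", "compact", "declare", "defer", "disabled", "ismap",
    "multiple", "nohref", "noresize", "noshade", "nowrap", "readonly",
    "selected",
}

_TAG_OPEN = re.compile(r"<([A-Za-z][A-Za-z0-9]*)")
# word = value, where the value is quoted or a bare token
_ATTR_WITH_VALUE = re.compile(
    r"""\s+([A-Za-z][\w:.-]*)\s*=\s*("[^"]*"|'[^']*'|[^\s<>]+)""")
_BARE_WORD = re.compile(r"\s+([A-Za-z][\w:.-]*)")


def repair_start_tag(text, pos=0):
    """Return (repaired_tag, end) for the start tag beginning at text[pos].

    `end` is the index in `text` just past the input that was consumed.
    """
    m = _TAG_OPEN.match(text, pos)
    if not m:
        raise ValueError("no start tag at position %d" % pos)
    i = m.end()
    while True:
        m = _ATTR_WITH_VALUE.match(text, i)
        if m:                                   # word = value: any word is accepted
            i = m.end()
            continue
        m = _BARE_WORD.match(text, i)
        if m and m.group(1).lower() in BOOLEAN_ATTRS:
            i = m.end()                         # bare word: only known boolean names
            continue
        break
    j = i
    while j < len(text) and text[j].isspace():  # skip whitespace before the terminator
        j += 1
    if text.startswith("/>", j):
        return text[pos:i] + "/>", j + 2        # empty tag
    if text.startswith(">", j):
        return text[pos:i] + ">", j + 1         # ordinary start tag
    return text[pos:i] + ">", i                 # terminator missing: insert one
```

Calling repair_start_tag("<td nowrap is a deprecated attribute") gives
back "<td nowrap>" and leaves " is a deprecated attribute" as cell text,
which is exactly the repair I object to below.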

Both Mozilla and iCab display the same as your browser does.

(Of course, Tidy should also close the <td> and <tr> elements.)

I have an objection to the process that generates <td compact>
from the input you gave.

IMHO, Tidy should adopt a "first do no harm" rule. In
other words, it should not attempt a partial repair of
markup whose intent it cannot determine. Here it is
impossible to tell whether the author intended the attribute

	<td nowrap="nowrap">

or cell text beginning

	nowrap is a deprecated attribute for table data cell elements

so Tidy should do nothing.

I appreciate that there is a case for changing to <td>, and
expecting the author to re-check the page in an HTML editor
or browser, but I personally would not do it.

The best that Tidy could do is log the problem. (Or perhaps
add a comment).
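
A log-only pass could be as small as the following Python sketch. Again,
this is only an illustration under my own assumptions:
find_unterminated_tags is a made-up name, and the pattern ignores
comments, scripts and quoted attribute values containing '<' or '>'.

```python
import re

# A '<' plus a letter, then text containing neither '<' nor '>', that runs
# into the next '<' or the end of input without ever meeting a '>'.
_UNTERMINATED = re.compile(r"<[A-Za-z][^<>]*(?=<|\Z)")


def find_unterminated_tags(html):
    """Yield (line_number, snippet) for each start tag lacking a '>'."""
    for m in _UNTERMINATED.finditer(html):
        line = html.count("\n", 0, m.start()) + 1   # 1-based line number
        yield line, m.group(0).splitlines()[0][:40]


if __name__ == "__main__":
    sample = "<table>\n<tr>\n<td nowrap  is a deprecated attribute\n</table>\n"
    for line, snippet in find_unterminated_tags(sample):
        print("line %d: start tag lacks '>': %s" % (line, snippet))
```

The input is never modified; the author gets a line number and decides
what was meant.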

Ben.

Received on Thursday, 7 February 2002 08:10:31 UTC