- From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
- Date: Mon, 26 Jun 2000 17:39:22 +1200 (NZST)
- To: html-tidy@w3.org, html-tidy@war-of-the-worlds.org
I wrote about <CODE>...<CODE>...</CODE>...</CODE>

gjames@pop.war-of-the-worlds.org (Unverified) wrote:

> Otherwise, the nesting is presentationally meaningless and should be cleaned up.

Well, no. A style sheet could perfectly well say that CODE inside CODE is formatted differently from CODE that is not inside CODE. It doesn't require a CLASS attribute to do this.

    code      { color: red;  white-space: pre; font-family: monospace; }
    code code { color: blue; white-space: pre; font-family: monospace; }

That applies to things like <SAMP><SAMP> and <VAR><VAR> as well. They *were* presentationally redundant (not meaningless) before the introduction of CSS, but *aren't* now.

> Tidy's method of presuming that the second <CODE> should be </CODE> though is a very poor solution. IMO, the cleanup should wait until after all other parsing is done, and then the interior tags can be eliminated as a pair.

Exactly. But even then, not without explicit permission.

> Tidy doesn't just make invalid markup valid. It also cleans up useless yet valid markup. I think this is very appropriate and addresses some types of bad markup still found in the wild, mostly empty tags created by so-called WYSIWYG editors.

True. I see a lot of empty <P> elements, and <TABLE>s inside <FONT>s, and really unbelievable rubbish, from big-name products.

> You can argue that it's still perfectly valid, but if Tidy became paranoid and avoided making any changes to valid markup, next you'll want it to leave certain invalid markup alone because the browsers still do what you mean (already dealt with such a request), and eventually you'll end up with Tidy doing absolutely nothing to anything. Once that happens Tidy ends up being just a glorified cat.

This is a "slippery slope" argument. Such arguments can be valid, but there is a huge difference between syntactically legal HTML and syntactically illegal HTML, which makes the claim "next you'll want it to leave certain invalid markup alone" unwarranted.
In this case, we are talking about a construction which is not just legal but *harmless*. Now HTML Tidy has this "feature" because there are HTML generators out there that get their brackets wrong; sometimes this correction is *essential*.

I suppose that there is no disagreement that
 - a warning is appropriate for technically legal "lint"
 - this kind of stuff is sometimes a mistake so should SOMETIMES be fixed
 - HTML Tidy's configuration files are a Good Thing

The question is
 - *when* should the correction be made
 - should it be made at all if the start- and end-tags are in fact correctly balanced?
 - what form should the correction take? (is there any disagreement that HTML Tidy's treatment of *this* example is undesirable?)
 - should the alteration of legal nested %fonts and %phrases be enabled by default?

What you answer may well depend on what the HTML you most commonly have to clean up looks like. This particular example is automatically generated by someone else's program which I'd rather not have to try to fix, especially as it technically isn't broken.
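To make the policy discussed above concrete, here is a minimal sketch in Python of "warn always, alter only with explicit permission, and only when the tags are balanced". It is not how HTML Tidy is implemented. The names `PHRASE_TAGS`, `VOID_TAGS`, `NestedPhraseCleaner`, `clean` and the `unwrap_nested` switch are all hypothetical, and the tag sets are illustrative rather than the real %phrase list. Only the standard `html.parser` module is used.

```python
from html.parser import HTMLParser

# Hypothetical tag sets for illustration; not Tidy's actual lists.
PHRASE_TAGS = {"code", "samp", "var"}
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class NestedPhraseCleaner(HTMLParser):
    """Warn about same-name phrase nesting; unwrap it only on request."""

    def __init__(self, unwrap_nested=False):
        super().__init__(convert_charrefs=False)
        self.unwrap_nested = unwrap_nested  # explicit permission, off by default
        self.stack = []          # (tag, suppressed) for each open element
        self.out = []            # pieces of the output document
        self.warnings = []
        self.mismatched = False  # saw an end tag that did not match the stack

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:     # no end tag expected, so do not track it
            self.out.append(self.get_starttag_text())
            return
        nested = tag in PHRASE_TAGS and any(t == tag for t, _ in self.stack)
        if nested:
            self.warnings.append(f"<{tag}> nested inside <{tag}>")
        # An inner tag carrying attributes is kept: dropping it loses data.
        suppressed = nested and self.unwrap_nested and not attrs
        self.stack.append((tag, suppressed))
        if not suppressed:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>, copied as is

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1][0] == tag:
            _, suppressed = self.stack.pop()
            if suppressed:
                return           # drop the end tag together with its start tag
        else:
            self.mismatched = True
        self.out.append(f"</{tag}>")  # note: end tags come out in lower case

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def _run(html, unwrap_nested):
    parser = NestedPhraseCleaner(unwrap_nested)
    parser.feed(html)
    parser.close()
    return parser


def clean(html, unwrap_nested=False):
    """Return (output, warnings); unwrap only balanced input, only if asked."""
    parser = _run(html, unwrap_nested)
    if unwrap_nested and (parser.mismatched or parser.stack):
        # Tags are not balanced, so fall back to the warn-only pass.
        parser = _run(html, False)
        parser.warnings.append("tags not balanced; nothing unwrapped")
    return "".join(parser.out), parser.warnings
```

With the default settings, `clean("<code>a<code>b</code>c</code>")` returns the input unchanged plus one warning. Passing `unwrap_nested=True` removes the interior start- and end-tag as a pair. Whether that switch should default to on is exactly the open question in the list above; the sketch takes the conservative answer.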
Received on Monday, 26 June 2000 01:39:28 UTC