- From: Matthew Brealey <thelawnet@yahoo.com>
- Date: Fri, 11 Feb 2000 02:38:55 -0800 (PST)
- To: www-style <www-style@w3.org>
[Sorry if I've sent this twice - my messages haven't been showing up, I suspect because of some sort of spam filter (my outgoing and incoming addresses are different), so I'm sending it by Yahoo! Mail]

ONE: With clean: yes, Tidy destroys the following example:

  <ol>
    <div class=example>
      This has a class that will give this element a border; older browsers will at least get the indentation.
    </div>
  </ol>

turning it into:

  <div class="c1">
    This has a class that will give this element a border; older browsers will at least get the indentation.
  </div>

With clean: no, this is the (better) result:

  <div style="margin-left: 2em">
    <div class="example">
      This has a class that will give this element a border; older browsers will at least get the indentation.
    </div>
  </div>

However, even this is unsatisfactory, because an author who tidies a page with Tidy will be surprised to find it destroyed in Internet Explorer 3.0 (though not 3.01, 3.02 or 3.03), as used by approximately 3% of the WWW. This browser treats ems as pixels (as indeed do all 3.0? versions), which is not in itself catastrophic, because the page will simply get no indentation. The real problem is that it interprets all margins as relative to the left of the canvas. Thus:

  Default margin
  -------------|Normal Body will be rendered here
  --|Two pixels in - where tidied OL and UL elements will be placed

This problem is even worse when the author has specified a larger-than-normal margin on BODY, because the effect is that much more dramatic.
The worst of it is that those who deliberately shun CSS and who abuse the OL and UL elements will be the most affected, because they are less likely to have a valid <div style="margin-left: whatever"> (or a class) for indentation. (They shun it with very good reason: who wouldn't sacrifice the advantages of maintainability at the altar of pages that actually work? If your pages crash, are made unreadable or don't work due to the abominable support for CSS in browsers, you lose business; if you forgo the secondary advantage of maintainability that CSS provides, you only lose a few minutes of your time.) I would like to see an option to leave the (invalid) OL and UL elements as they are.

TWO: Another thing that needs fixing is that Tidy doesn't check that STYLE or LINK rel="stylesheet" declares the type attribute.

THREE: It is also arguable that Tidy should give a warning if META http-equiv="Content-Style-Type" content="text/css" is missing when the style attribute is used; however, one author in a million might actually send it as a true HTTP header, and for them such a warning would be erroneous.

FOUR: Furthermore, something that really annoys me is that when I accidentally omit a closing tag, Tidy assumes that I didn't know that the element couldn't span block elements and therefore inserts literally hundreds of (for example) <b>s and </b>s. Thus, where I have

  <p> Some <b>bold text - accidentally missed off the '</b>' </p>

it thinks that I wanted the boldness to span the whole of the rest of the document, and as a result I have to remove manually the hundreds of tags it adds. (On one particularly extreme occasion it converted one of my (very complex and difficult to recreate) test pages into a whole load of <PRE> elements, and because of restrictions on the elements that can occur within <PRE> applied my style declarations to <BR>, thereby destroying several hours of work.)
I believe this 'feature' will occur with every block and inline element except for P, DIV, OL, UL, LI, DL, DD and DT. I do wish Tidy wouldn't assume that its users are idiots who don't know that inline elements can't span block ones and that PRE can't enclose other block elements. It should simply add the missing </b> without emboldening the rest of the document, or trigger an error and terminate. (Or even offer a boolean option, idiot: idiot: yes treats the user like an idiot; idiot: no simply adds the closing tag in its correct location.)

FIVE: As a matter of principle I abhor PRE, since it frequently results in nasty scroll bars, which really annoy viewers of pages. However, when I have a code listing it is tiresome to have to put a <br> after every line and put in non-breaking spaces for indentation. What I would like is for Tidy to do the task that bores me so: convert PRE into a P element, replacing the carriage returns in the PRE with <br> and the spaces with non-breaking spaces.

The option to do this should, IMO, be set via a convert-pre option in Tidy's config file. It should take the values yes, no or <identifier>, where <identifier> is an HTML class, as in <p class="identifier">. The class is important, since I like my code to have less leading than my P elements, so it is useful for doing things like P.code {line-height: 1.2} to reduce my normal P {line-height: 1.5}. If the convert-pre option were simply made a boolean, then yes should result in <p class="pre">.

SIX: Further to this, on the subject of <pre>, Tidy destroys the CSS white-space: pre declaration. This is because it does not recognise that in

  <P style="white-space: pre"> Some pre formatted text </p>

the white space is meaningful.
To cope with this, I would suggest that if an ID or class begins with 'pre', the element should be treated as preformatted; e.g., <P id="precode"> should be treated as preformatted. (And no, one shouldn't use the PRE element for preformatted text, because it gives no indication as to the content of the element - BLOCKCODE and BLOCKSAMP would be good, PRE = bad.)

SEVEN: Action is a required attribute for FORM; Tidy does not point this out.

EIGHT: Tidy breaks the STYLE element - it converts non-ASCII characters (or characters outside whatever encoding the document is declared to be in) such as á to character references, but they aren't valid there. Correct would be to express them as a (hexadecimal) CSS escape; e.g., \E1 in this case. However, no browser (except the nascent Mozilla 5) supports these escapes, and there is rarely a need to convert the characters at all, since the CSS nmchar macro allows {nonascii}, where {nonascii} is [\200-\377] (Octal codes; the 377 should be read as 4177777 (Octal) (because of Flex limitations)).

NINE: Similarly ID - Tidy converts characters like á to character references, but as ID is based on the NAME token, character references are not valid there.

TEN: It would be useful for Tidy to check the validity of class and ID tokens; for example, many people use invalid classes such as class="1invalididentifier" or "-invalid" - these will (correctly) be ignored by many browsers. Similarly for ID - check that it is a valid SGML NAME token.

ELEVEN: Why doesn't clean: yes remove BASEFONT?

TWELVE: Tidy's conversion of FONT size="n" is inappropriate. For example, <font size=1> is mapped to font-size: 70%. This is inappropriate because font-size: 70% depends on the font-size of the ancestor, which Tidy does not know, whereas <font size=1> is the same wherever it is used in a document.
Similarly <font size="+n"> - this is relative to the current <font size>, and not to the CSS font-size; for example, <div style="font-size: 100000px"><font size="+2"> means 2 more than the current <font size> - the font-size declaration has no effect.

Perhaps most seriously, some <p><font size></p>s are mapped to H2, H3, H1. This is inappropriate because it is not clean layout - it will wreck any nice heading order (i.e., going from h1 to h2 to h3, etc. according to the structure of the document) that the author has. Furthermore, H1 has far more margin than P. In addition, headings vary radically in font size and formatting; for example, in my style sheets I have

  H1 {margin-left: -4%; margin-top: morethanp; line-height: lessthanp; margin-bottom: lessthanp; font-size: radicallydifferentfromnormal}

so having the P element mapped to the probably/possibly structurally inappropriate H? element is undesirable.

THIRTEEN: There is a danger that the classes chosen will be redefined. For example:

  <p> <font face="Tahoma"> This is text </font> </p>

If that were to be tidied, the result would be:

  <p class="c1"> This is text
  .c1 {font-family: Tahoma}

If at a later date the file were to be re-tidied with this added:

  <p> <font face="Verdana"> This is text </font> </p>

the result would be:

  <p class="c1"> This is text
  .c1 {font-family: Verdana}

which would redefine the first example - Tidy does not check whether the classes it uses have been used before - it starts from c1 each time it tidies.

FOURTEEN: If hide-endtags: yes and clean: yes, Tidy drops highly meaningful BODY tags; for example:

  <body bgcolor=red>

or

  <body style="background: red">

with clean: yes is mapped to:

  <style type="text/css"> body.c1 {background: red} </style>

but without any BODY for the class to refer to! In this case this is appropriate:

  <style type="text/css"> body {background: red} </style>

because the BODY is unique to the document.

FIFTEEN: Why doesn't clean: yes clean TABLE backgrounds?
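The check asked for in THIRTEEN could look roughly like the following. This is only a minimal sketch in Python, not Tidy's own code: the function names used_class_names and fresh_class_names are hypothetical, and the regex scan is a deliberate simplification (it only sees class attributes and selectors that end in a class name directly before the opening brace).

```python
import re
from itertools import count


def used_class_names(document):
    """Collect class names already present in the markup and style rules.

    Simplified on purpose: looks at class attributes (quoted or unquoted)
    and at selectors such as .c1 { or p.c1 { only.
    """
    names = set()
    attr = re.compile(r'class\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s>]+))', re.I)
    for match in attr.finditer(document):
        value = match.group(1) or match.group(2) or match.group(3) or ""
        names.update(value.split())
    # A class selector immediately followed by the rule's opening brace.
    names.update(re.findall(r'\.([A-Za-z][A-Za-z0-9-]*)\s*\{', document))
    return names


def fresh_class_names(document, prefix="c"):
    """Yield c1, c2, ... while skipping every name the document already uses."""
    used = used_class_names(document)
    for n in count(1):
        name = f"{prefix}{n}"
        if name not in used:
            used.add(name)
            yield name


# The previously tidied Tahoma example already uses c1, so the Verdana
# paragraph would be given the next free name instead of redefining c1.
tidied = '<p class="c1"> This is text .c1 {font-family: Tahoma}'
print(next(fresh_class_names(tidied)))  # prints c2
```

Scanning the existing document first is the whole point: the numbering no longer restarts blindly at c1 on each run, so earlier rules are left alone.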
SIXTEEN: Tidy is destroying my pages - it is adding type="text/javascript" to my SCRIPT language="jscript1.2" element. Although type is required, if you add it Internet Explorer will ignore the (yes, I know, deprecated) language attribute, which causes serious problems - there should be an option to stop Tidy doing this.

=====
----------------------------------------------------------
From Matthew Brealey (http://members.tripod.co.uk/lawnet (for law) or http://members.tripod.co.uk/lawnet/WEBFRAME.HTM (for CSS))
Received on Friday, 11 February 2000 05:38:56 UTC