- From: Sebastian Lange <lange@cyperfection.de>
- Date: Thu, 13 Jul 2000 18:21:36 +0200
- To: html-tidy@w3.org
Here are, in no specific order, eleven thoughts about the current release of tidy and possible improvements for future releases. This list is certainly not complete; please feel free to add your comments to it!

1) <TEXTAREA> should be recognized by "indent: auto", otherwise you'll get a bunch of spaces in the tidied textarea.

2) Having multiple <BASE> tags in a document puts them ALL into the header, instead of keeping just one and discarding the rest. (Which one should be chosen? The first? The last? Any other priorities?)

3) XHTML Strict requires <input> to be a child of one of "p", "h1", "h2", "h3", "h4", "h5", "h6", "div", "pre", "address", "fieldset", "ins", "del".

4) <form> is required to have an "action" attribute, but tidy doesn't even warn about the lack thereof.

5) Warnings about missing elements are omitted for: <!DOCTYPE>, <HTML>, <HEAD>, <BODY>... Is there a reason for this behavior?

6) When <TITLE> is missing, Tidy inserts a blank <TITLE></TITLE> pair. Is this desirable? Or should it maybe rather insert something more or less meaningful à la <TITLE>Document edited by Tidy</TITLE>?

7) When "uppercase-tags: yes" is set, the doctype should be <!DOCTYPE HTML PUBLIC>, otherwise it should stay <!DOCTYPE html PUBLIC>. (What matters is whether it's written <HTML> or <html> in the document... after all, that is only an author's preference.) Or is there a reason for Tidy to always put "html" in lowercase?

8) As for <IMG ALT>, it should be possible to set a configuration directive for <TABLE SUMMARY>... maybe with a similar accessibility warning. I would like to be able to process files automatically without being bothered by warnings, and then, in a second run, search for the given string and replace it with something more meaningful.

9) line 113 column 1 - Warning: html doctype doesn't match content

   Now this is a seriously interesting message, but it doesn't help me much. (The specified doctype was HTML 4.01 Transitional.) Would it be possible to extend this warning so that Tidy gives information about WHY the doctype does not match? If you want to reproduce this, try tidying http://www.roche.de/ with "doctype: loose" set, and then RE-RUN the output through tidy again.

10) When '-f "errors.txt"' or '--error-file errors.txt' is given, errors should indeed be written to errors.txt, and not to STDERR.

11) When '--doctype "-//IETF/DTD HTML 2.0//EN"' is given, this doctype should indeed be used, and '--doctype "auto"' should not be implied, as currently happens.

I think that's it for now; please feel free to add your observations.

One word to Dave: Dave, I have great respect for you. Tidy is probably the most exciting tool that I have found regarding HTML. If you see a better way to contribute to its development than discussing thoughts like the aforementioned on this list, please tell me. Maybe someone with a thorough understanding of tidy's source code could post a little introduction to it?

-- 
Sebastian Lange
http://www.sl-chat.de/
Maybe the first chat site that validates as HTML 4.0 even though user input may contain HTML codes.
Courtesy of Dave Raggett's HTML Tidy: http://www.w3.org/People/Raggett/tidy/
Received on Thursday, 13 July 2000 12:25:25 UTC