- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Fri, 13 Apr 2001 19:30:54 +0200
- To: html-tidy@w3.org
- Cc: dsr@w3.org
Hi Dave, Hi List, As [1] states: "You can also incorporate Tidy as part of a larger program, for instance in HTML editors or HTML transformation tools used for import filters, or for when you want to customize Web content to get the best out of different kinds of browsers. [...]", Software Developers are free or even encouraged to use HTML Tidy in their applications. This is a great thing in general, but the current codebase makes it hard to do so, incorporating Tidy into other applications requires hacking the provided sources. I'm at most a rookie C programmer, but I think it might be a good idea to split Tidy into three modules: * parser and lexer parsing and cleaning the source, building in-memory DOM tree * pretty-printer takes the DOM tree and puts it into a "prettier" form * command line utility parse configuration files, parse input from the command line, printing the prettier version, etc. I think several people, including myself, use HTML Tidy just to get a clean parse tree, e.g. for using XML tools with HTML documents, they only need the first module. HTML Editors who incorporate HTML Tidy often provide the option to beautify the written and/or produced code, they additionally use the second module. The third module is of no interest for other people, so the library should consist only of the first two modules. Just like it is done with Expat (James Clarks XML parser), users and developers can just get the current library and link it into their programs, instead of hacking the source again and again and redistributing it. Along with this m12n several other changes could and possibly should be made, like * event-handlers for warnings and errors * killing global configuration variables, introduce a parser and a pretty-printer object that store relevant informations or store that information in the lexer object I like to hear comments by other list members regarding these changes. I don't know what the current state of HTML Tidy is, the current codebase is however nearly a year old, the pending page is very longish, there are several additional bugs reported to this list and some additional feature requests have been made. So I like to ask, whether there are volunteers as requested in [2] and how you think development and maintainship should look like in the future. Why isn't HTML Tidy in the W3C CVS repository? [1] http://www.w3.org/People/Raggett/tidy/ [2] http://www.w3.org/People/Raggett/tidy/pending.html best regards, -- Björn Höhrmann > mailto:bjoern@hoehrmann.de > http://www.bjoernsworld.de am Badedeich 7 # Telefon: +49(0)4667/981028 # http://bjoern.hoehrmann.de 25899 Dagebüll < PGP Pub. KeyID: 0xA4357E78 < http://www.learn.to/quote/
Received on Friday, 13 April 2001 13:29:25 UTC