- From: Reitzel, Charlie <CReitzel@arrakisplanet.com>
- Date: Mon, 7 May 2001 13:54:24 -0400
- To: "'dsr@w3.org'" <dsr@w3.org>
- Cc: "'Bjoern Hoehrmann'" <derhoermi@gmx.net>, "'Andy Quick'" <ac.quick@sympatico.ca>, "'Sebastian Lange'" <info@sl-chat.de>, 'André Blavier' <ablavier@wanadoo.fr>, "'html-tidy@w3.org'" <html-tidy@w3.org>
Hi Dave, Any plans for these types of changes or even a maintenance release for all the patches since last summer? I didn't see a response to Bjoern's message on the list. Tidy has been a tremendous help, and I'm looking to do more work with it. Expat might be a good organizational model to follow. I'd also like to contribute more updates to the Word 2000 conversion (I mailed in one bug fix patch to the mailing list). I'd also like to get all the other patches that people have mailed in. A Source Forge project would make it much easier to collect bug reports and patches and apply them. How do other developers of Tidy add-ons feel about this issue? Thanks for any info, Charlie -----Original Message----- From: Bjoern Hoehrmann [mailto:derhoermi@gmx.net] Sent: Friday, April 13, 2001 1:31 PM To: html-tidy@w3.org Cc: dsr@w3.org Subject: Making HTML Tidy a supported library Hi Dave, Hi List, As [1] states: "You can also incorporate Tidy as part of a larger program, for instance in HTML editors or HTML transformation tools used for import filters, or for when you want to customize Web content to get the best out of different kinds of browsers. [...]", Software Developers are free or even encouraged to use HTML Tidy in their applications. This is a great thing in general, but the current codebase makes it hard to do so, incorporating Tidy into other applications requires hacking the provided sources. I'm at most a rookie C programmer, but I think it might be a good idea to split Tidy into three modules: * parser and lexer parsing and cleaning the source, building in-memory DOM tree * pretty-printer takes the DOM tree and puts it into a "prettier" form * command line utility parse configuration files, parse input from the command line, printing the prettier version, etc. I think several people, including myself, use HTML Tidy just to get a clean parse tree, e.g. for using XML tools with HTML documents, they only need the first module. HTML Editors who incorporate HTML Tidy often provide the option to beautify the written and/or produced code, they additionally use the second module. The third module is of no interest for other people, so the library should consist only of the first two modules. Just like it is done with Expat (James Clarks XML parser), users and developers can just get the current library and link it into their programs, instead of hacking the source again and again and redistributing it. Along with this m12n several other changes could and possibly should be made, like * event-handlers for warnings and errors * killing global configuration variables, introduce a parser and a pretty-printer object that store relevant informations or store that information in the lexer object I like to hear comments by other list members regarding these changes. I don't know what the current state of HTML Tidy is, the current codebase is however nearly a year old, the pending page is very longish, there are several additional bugs reported to this list and some additional feature requests have been made. So I like to ask, whether there are volunteers as requested in [2] and how you think development and maintainship should look like in the future. Why isn't HTML Tidy in the W3C CVS repository? [1] http://www.w3.org/People/Raggett/tidy/ [2] http://www.w3.org/People/Raggett/tidy/pending.html best regards, -- Björn Höhrmann > mailto:bjoern@hoehrmann.de > http://www.bjoernsworld.de am Badedeich 7 # Telefon: +49(0)4667/981028 # http://bjoern.hoehrmann.de 25899 Dagebüll < PGP Pub. KeyID: 0xA4357E78 < http://www.learn.to/quote/
Received on Monday, 7 May 2001 17:08:53 UTC