W3C home > Mailing lists > Public > html-tidy@w3.org > January to March 2001

Announce: TclTidy (using Tidy as a shared lib/dll)

From: Scott Redman <redman@tivo.com>
Date: Wed, 24 Jan 2001 10:26:02 -0800
Message-ID: <3A6F1E3A.D92714A9@tivo.com>
To: html-tidy@w3c.org
This message is to announce the availability of TclTidy,
making Tidy a loadable Tcl extension (shared library/DLL).

Please note, most of the changes are not specific to Tcl,
but could be used to make a loadable extension for any
scripting language or to use Tidy as a C/C++ shared library.

You can find TclTidy sources in the TclXML project on 
SourceForge (the sources include the full set of Tidy
source as well):

    http://sourceforge.net/projects/tclxml

Ajuba Solutions, my former employer, sold an XML B2B product
based around Tcl.  We found a need to convert HTML to XHTML
and decided to use Tidy.  We quickly found that using "exec"
to launch tidy over and over again wasn't going to scale.
We decided to make Tidy a loadable extension that would reset
itself after each call and allow us to call the code many
times.  The work that I did to make Tidy a loadable extension
was agreed to be placed back in Open Source, and the license
is the same MIT license that Dave has on the original Tidy.
Ajuba was acquired on November 1, 2000 by Interwoven and it
took several weeks to get the code on to Source Forge (due
to problems with SF and their recent ISP change).

The changes involved were primarily to allow Tidy to reset
its state between calls (reentrant, not in the threads sense,
though), to make the code reasonably thread-safe by moving
global variables into local state, making the code play
well when loaded as a shared library by preventing symbol
name collisions, and changing the i/o layer to allow a
string buffer to be used instead of stdio.  The code is
now C++, primarily to use namespaces to prevent symbol
name collisions.  There is a very thin layer of Tcl extension
code to make it work in Tcl.  The code is based on the 
August 4th release of Tidy, and includes all of the Tidy 
sources (in C++).

There is a long list of things I want to eventually do
to the code:

1. Merge my changes with the Tidy sources, if possible,
   to prevent code drift/rot.
2. Separate TclTidy from Tidy, using different repositories
   (fits nicely with #1).  The tidy executable would load 
   the libtidy.so and use it, as would TclTidy.
3. Make the code really thread-safe.
4. Check for and fix memory leaks.
5. more things I can't remember right now...

Please note, after leaving Interwoven to work for TiVo,
I don't have as much time to devote to Tidy work.  I am
going to try to make time for the various Open Source
projects that I'm involved with, but it may be a few
weeks/months before I can fully devote time to this
effort.

-- Scott Redman
-- Member, Technical Staff
-- TiVo, Inc.
Received on Wednesday, 24 January 2001 13:27:04 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:45 GMT