Re: Apache module to validate everything

From: Nick Kew <nick@webthing.com>
Date: Thu, 20 Dec 2007 11:34:38 +0000
To: www-validator@w3.org
Message-ID: <20071220113438.38482ff9@grimnir>

On Thu, 20 Dec 2007 12:11:15 +0100
Sierk Bornemann <sierkb@gmx.de> wrote:

> Hi Will!
> On 19.12.2007 at 03:35, Will Entriken wrote:
> >
> > I would like an Apache module to validate every HTML/XHTML page my
> > server serves. If the page doesn't validate, it should dump the
> > page's source to disk and email me.
> >
> > Does this exist?


There are modules that will validate an upload, and reject it if
it doesn't validate against a selected DTD.

There is mod_validator, a module that works in a similar way to the
W3C validator.

There are modules that make on-the-fly fixups to documents being
served, such as mod_proxy_html version 3.  But that still falls
short of full validation.  Checking that there are no duplicate IDs
or dangling IDREFs, for example, would require an entire document tree
to be parsed into memory, which is a much bigger overhead than you
want to incur on every request.  (Incurring it for every *upload* is
a different matter, which is why such modules do exist.)
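
To illustrate the ID/IDREF point, here is a minimal Python sketch (not
code from any of the modules mentioned).  It assumes the only IDREF
attributes of interest are "for" on label and "headers" on td/th; the
names IdChecker and check_ids are made up for this example.  It only
tracks the ID and reference sets rather than a full tree, but the
verdict still cannot be given until the last byte of the document has
been seen, so nothing can be decided while the page is streaming out.

```python
from html.parser import HTMLParser


class IdChecker(HTMLParser):
    """Hypothetical helper: collects ID values and IDREF-style references."""

    def __init__(self):
        super().__init__()
        self.seen_ids = set()
        self.duplicate_ids = set()
        self.references = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        for name, value in attrs:
            if value is None:
                continue
            if name == "id":
                if value in self.seen_ids:
                    self.duplicate_ids.add(value)
                self.seen_ids.add(value)
            elif name == "for" and tag == "label":
                # Assumption: treat label/for as a single IDREF.
                self.references.add(value)
            elif name == "headers" and tag in ("td", "th"):
                # Assumption: headers is a space-separated list of IDREFs.
                self.references.update(value.split())


def check_ids(html_text):
    """Return (duplicate IDs, dangling references) for a whole document."""
    checker = IdChecker()
    checker.feed(html_text)
    checker.close()
    # A reference may point forward, so this can only run at end of input.
    dangling = checker.references - checker.seen_ids
    return sorted(checker.duplicate_ids), sorted(dangling)
```

A forward reference such as a label that precedes its input is valid,
which is why the dangling check has to wait for the end of the
document.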

> You might be interested in mod_tidy, http://mod-tidy.sourceforge.net/.
> mod_tidy is a TidyLib based DSO module for the Apache HTTP Server  
> Version 2.

That's arguably the worst of both worlds: it doesn't validate, but it
does incur the full processing overhead of parsing to a DOM-equivalent.

Nick Kew

Received on Thursday, 20 December 2007 11:34:53 UTC

