Re: Apache module to validate everything

On Thu, 20 Dec 2007 12:11:15 +0100
Sierk Bornemann <sierkb@gmx.de> wrote:

> Hi Will!
> 
> On 19.12.2007 at 03:35, Will Entriken wrote:
> 
> >
> > I would like an Apache module to validate every HTML/XHTML page my
> > server serves. If the page doesn't validate, it should dump the
> > page's source to disk and email me.
> >
> > Does this exist?

No.

There are modules that will validate an upload, and reject it if
it doesn't validate against a selected DTD.

There is mod_validator, a module that works in a similar way to the
W3C validator.

There are modules that make on-the-fly fixups to documents being
served: for example, mod_proxy_html version 3.  But that still falls
short of full validation.  Checking that there are no duplicate IDs
or dangling IDREFs, for instance, would require an entire document
tree to be parsed into memory, which is a much bigger overhead than
you want to incur on every request (incurring it for every *upload*
is a different matter, which is why such modules do exist).
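
To illustrate the kind of check meant here, the following is a minimal
sketch in Python, not an Apache module and not how any of the modules
named above are implemented.  It assumes well-formed XHTML with no
undeclared entities.  The names check_ids, IDREF_ATTRS and IDREFS_ATTRS
are made up for this example, and the attribute lists are only an
illustrative subset of what a DTD would declare.  Note that it loads
the whole document into memory before it can report anything.

```python
import xml.etree.ElementTree as ET

# Hypothetical, illustrative subset of reference attributes (not the full DTD list).
IDREF_ATTRS = ("for",)        # single IDREF, e.g. <label for="...">
IDREFS_ATTRS = ("headers",)   # whitespace-separated IDREFS, e.g. <td headers="...">


def check_ids(xhtml_text):
    """Return (duplicate_ids, dangling_refs) for a well-formed XHTML string."""
    # fromstring() builds the complete element tree in memory.
    root = ET.fromstring(xhtml_text)

    seen = set()        # every id value encountered so far
    duplicates = []     # id values that occur more than once
    refs = []           # every IDREF value, in document order

    for elem in root.iter():
        elem_id = elem.get("id")
        if elem_id is not None:
            if elem_id in seen:
                duplicates.append(elem_id)
            seen.add(elem_id)
        for name in IDREF_ATTRS:
            value = elem.get(name)
            if value is not None:
                refs.append(value)
        for name in IDREFS_ATTRS:
            refs.extend((elem.get(name) or "").split())

    # A reference may point forwards, so dangling references can only
    # be decided once the whole document has been seen.
    dangling = [ref for ref in refs if ref not in seen]
    return duplicates, dangling
```

Running something like this on every response is the per-request cost
in question; running it once per upload is a far easier case to
justify.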

> You might be interested in mod_tidy, http://mod-tidy.sourceforge.net/.
> mod_tidy is a TidyLib-based DSO module for the Apache HTTP Server
> Version 2.

That's close to the worst of both worlds: it doesn't validate, yet it
still incurs the full processing overhead of parsing to a DOM-equivalent.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/

Received on Thursday, 20 December 2007 11:34:53 UTC