Re: Random thoughts on modularization of validator(s)

On Tue, 20 Jul 2004, olivier Thereaux wrote:

> Now for the obvious: all our tools (including the link checker) are
> built around three basic operations: retrieval of content, parsing and
> checking, presentation of results. All of this wrapped in an interface,
> usually showing access to the first operation and providing some output
> for the last.

You're sounding like my presentation at ApacheCon last year:-)

Executive summary: Page Valet and AccessValet perform different
checking but have common components in retrieval-of-content and
presentation of results.  So it's modularised, with the tools
sharing retrieval and presentation modules.
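
A rough sketch of that shape, in Python purely for illustration.  This
is not the Valet code; every name below is made up, retrieval is
stubbed out, and the two checkers are toy stand-ins for "different
checking" sharing the same front and back ends:

```python
from dataclasses import dataclass
from typing import Callable, List
import xml.etree.ElementTree as ET


@dataclass
class Document:
    source: str  # where the content came from (URL, upload name, ...)
    body: str    # the retrieved content


# A checker takes a Document and returns a list of problem messages.
Checker = Callable[[Document], List[str]]


def retrieve(source: str, body: str) -> Document:
    """Shared retrieval module (stubbed: the body is passed in directly)."""
    return Document(source=source, body=body)


def check_well_formed(doc: Document) -> List[str]:
    """Checker for one tool: is the document well-formed XML?"""
    try:
        ET.fromstring(doc.body)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    return []


def check_img_alt(doc: Document) -> List[str]:
    """Toy accessibility checker: every img element needs an alt attribute."""
    try:
        root = ET.fromstring(doc.body)
    except ET.ParseError:
        return []  # well-formedness problems are left to the other checker
    return ["img without alt" for el in root.iter("img") if "alt" not in el.attrib]


def present(doc: Document, messages: List[str]) -> str:
    """Shared presentation module: plain-text report."""
    header = f"{doc.source}: {len(messages)} problem(s)"
    return "\n".join([header] + [f"  - {m}" for m in messages])


def run_tool(source: str, body: str, checkers: List[Checker]) -> str:
    """A 'tool' is just a choice of checkers between the shared modules."""
    doc = retrieve(source, body)
    messages = [m for check in checkers for m in check(doc)]
    return present(doc, messages)
```

Two different tools are then two different checker lists passed to
run_tool(); retrieve() and present() are written once.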

> Is it fair to introduce the idea of a "global" multi-format validation
> service when discussing the modularization and future arch of the
> Markup Validator?

Indeedie.  That's exactly what became a very attractive option with the
release of Apache 2.0 that supports such modularisation nicely.

> The fact that introducing requirements for a "generic conformance
> checking tool" into the modularization of one specific tool will slow
> the latter is valid, yet probably just on a mid-term basis.

You mean it will slow development?  I think it's ... no, let's not
go on the record with that.

> In a rather simplified view of things, I see three main questions :
> 1- how to tie the blocks together?
> 2- why only one way to tie the blocks together?
> 3- should the process of parsing/validation be purely iterative?
>
> 1- I am relatively programming-language agnostic, and I see that we are
> likely to want to use different technologies at the parser level. Tying
> it all together with SOAP seems more attractive than sticking with one
> API, concerns of performance notwithstanding. But I might be completely
> wrong on this.

Would it be politically feasible to gather all such services under
validator.w3.org?  If so, we can handle that in Apache, by proxying
backends using a different technology (such as jigsaw).

> 2- Would it make sense to have the blocks accessible through different
> kinds of interfaces/protocols? Would it be a waste of time to do that?
> Then there is the question of which one(s) we favour for the main
> service(s).

Does anyone _really_ want SOAP?

We have the browser-oriented protocols well-covered (get URL, file
upload), though they could be further refined.  A very simple
service-implementation would be to accept HTTP PUT of a document
and validate it.  Choices for the user (client software) to be
negotiated in custom HTTP headers.
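
To make that concrete, here is a minimal sketch in Python (standard
library only).  It is an illustration, not a proposal for the real
service: the header name X-Validate-Level and its single supported
value are invented for the example, and the only "validation" done is
an XML well-formedness check.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import xml.etree.ElementTree as ET

# Hypothetical custom header carrying the client's choice of check.
OPTION_HEADER = "X-Validate-Level"


def validate(body: bytes, level: str):
    """Return (HTTP status, plain-text report) for a PUT document."""
    if level != "well-formed":
        # Only one level exists in this sketch; reject anything else.
        return 400, f"unsupported level: {level}\n"
    try:
        ET.fromstring(body)
    except ET.ParseError as err:
        return 200, f"not well-formed: {err}\n"
    return 200, "well-formed\n"


class ValidatorHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        # The document to check is the request body.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # The client's options arrive in a custom request header.
        level = self.headers.get(OPTION_HEADER, "well-formed")
        status, report = validate(body, level)
        payload = report.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000) -> None:
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ValidatorHandler).serve_forever()
```

A client would PUT the document to any path on that server and read
the report from the response body; no SOAP envelope required.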

> 3-  I see the point of an iterative parsing and validation process,
> such as well-formedness->validity->attributes->appC, yet the trend

+1 to working in that direction

> seems to point towards something more complicated [insert vague
> ramblings on dividing and processing parallel validation, using trendy
> acronyms e.g. NRL, DSDL - I am yet to investigate these fields
> further.]. Should we aim at doing that too?

Huh?

-- 
Nick Kew

Received on Tuesday, 20 July 2004 07:36:18 UTC