Re: UniCORN book of specifications from olivier Thereaux on 2006-05-17 (public-qa-dev@w3.org from May 2006)

From: olivier Thereaux <ot@w3.org>
Date: Wed, 17 May 2006 10:29:19 +0900
To: QA Dev <public-qa-dev@w3.org>
Message-Id: <aa1cd84c22f190ec3fe74ed951fd2c02@w3.org>
Hi,

On May 15, 2006, at 17:20, Jean-Guilhem Rouel wrote:
>  Damien and I have written a document available at 
> http://www.w3.org/QA/2006/obs_framework/ about the specifications of 
> the micro-observer framework.
> It contains a description of the framework requirements, but also use 
> cases and questions more or less technical about specific points.

Here are a few notes from a first pass at the document.

[[
The aim of our internship is to create a "universal validator" that 
will be able to validate and check multiple things in a document 
through a single Web interface.
]]

As you said in your "questions", the term validator here is not 
appropriate. I think what you are building is an observation framework 
for Web documents. Universal is, well, nice in the acronym, but perhaps 
a bit too much here ;) And validation will only be a tiny part of the 
tasks the framework will allow its modules to perform.

also s/check multiple things/perform multiple observations and checks/ 
perhaps?

Looking at http://www.w3.org/QA/2006/obs_framework/#result I see that 
you are proposing to reuse a template syntax similar to that of the CSS 
validator. I am not very familiar with that syntax: is it a "standard" 
templating syntax for Java, or something that was invented for the CSS 
validator? If the latter, it may be a good idea (or a waste of time, 
yours to decide) to look at other template syntaxes.

Regardless of the syntax choices, which are not really important anyway 
as long as it is documented somewhere, I think that there will be some 
work in defining how context is/will be passed, how looping will be 
done, etc.

For instance, I understand that something like <!-- #error_line --> 
will be replaced by the value of the current error's line. But 
"current" has to be well defined, especially if we loop over a number 
of errors. Which leads me to wonder why <!-- #errors --> will not be 
replaced by a value, but by a loop.

In that sense, the templating syntax of e.g. HTML::Template, while 
clunky too in some aspects, has the advantage of dissociating looping, 
logic (if, then, else) and variable substitution. Poke around 
http://dev.w3.org/cvsweb/validator/share/templates/en_US/ for examples.
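
To make the distinction concrete, here is a minimal sketch in Python of 
a renderer where looping and variable substitution are separate 
constructs. The tag names are borrowed from HTML::Template (TMPL_LOOP, 
TMPL_VAR), but this is not its implementation: the render() and 
substitute() functions and the sample data are made up for 
illustration, and nested loops and if/else are left out.

```python
import re

# Assumed tag shapes, modelled on HTML::Template; no nesting supported.
LOOP_RE = re.compile(
    r'<TMPL_LOOP NAME="(\w+)">(.*?)</TMPL_LOOP>', re.DOTALL)
VAR_RE = re.compile(r'<TMPL_VAR NAME="(\w+)">')


def substitute(text, context):
    """Replace each TMPL_VAR with its value from context (empty if missing)."""
    return VAR_RE.sub(lambda m: str(context.get(m.group(1), "")), text)


def render(template, context):
    """Expand loops first, then substitute the remaining variables.

    Inside a loop body, "current" is unambiguous: it is the dict of the
    item being iterated over, with the outer context as a fallback.
    """
    def expand_loop(match):
        name, body = match.group(1), match.group(2)
        rows = context.get(name, [])
        return "".join(substitute(body, {**context, **row}) for row in rows)

    return substitute(LOOP_RE.sub(expand_loop, template), context)


template = (
    'Errors in <TMPL_VAR NAME="uri">:\n'
    '<TMPL_LOOP NAME="errors">'
    'line <TMPL_VAR NAME="line">: <TMPL_VAR NAME="message">\n'
    '</TMPL_LOOP>'
)
output = render(template, {
    "uri": "http://example.org/",
    "errors": [
        {"line": 3, "message": "unclosed element"},
        {"line": 9, "message": "unknown attribute"},
    ],
})
print(output)
```

The point of the split is that "errors" names a list and is only ever 
looped over, while "line" and "message" are plain values resolved 
against the item currently being iterated.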


On output formats, other potential candidates: send mail, or simply 
plain text (for command line usage). I think the way you envision the 
output, with the central module gathering and passing observers' output 
through templates, is the way to go. I'm not sure if I showed you or 
told you about the "log validator", which has a lot of similarities 
with what you are working on: a modular architecture, several 
observers, and several possible output methods. The big difference is 
that the logvalidator works on a list of documents (usually, a log file 
from the web server) and not on one, so the focus is different. The 
reason why I think of the logvalidator now is that after a few 
releases, I found myself limited in what the tool outputs, because the 
output from each "observation" module was not rich and structured 
enough. For your framework, making sure that the output of each 
observation is as structured as possible (i.e. avoiding loose text) 
will be key.
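
As an illustration of what "structured" could mean here, a small 
Python sketch follows. The Observation record and its field names 
(observer, severity, message, line, column) are assumptions of mine, 
not something taken from your document; the idea is only that text is 
derived from the record at output time, never stored in place of it.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Observation:
    """One finding reported by one observer (hypothetical field set)."""
    observer: str                 # module that produced it, e.g. "markup"
    severity: str                 # "error", "warning" or "info"
    message: str
    line: Optional[int] = None    # None when the finding is document-wide
    column: Optional[int] = None


def as_text(obs):
    """Plain-text rendering, e.g. for command-line or mail output."""
    where = f"line {obs.line}" if obs.line is not None else "document"
    return f"[{obs.observer}] {obs.severity} at {where}: {obs.message}"


observations = [
    Observation("markup", "error", "unclosed element <p>", line=12, column=4),
    Observation("css", "warning", "unknown property 'colour'", line=3),
]

# The same records feed several outputs without re-parsing any text.
text_report = "\n".join(as_text(o) for o in observations)
error_count = sum(1 for o in observations if o.severity == "error")
records = [asdict(o) for o in observations]  # ready for a template or XML
print(text_report)
```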


I was at first puzzled by the use of the word "actors" in the use 
cases. That's because for me, the actors are really the various 
observers, the central module, etc. To make it less confusing, I would 
perhaps call them "users" - framework maintainers being super users, 
computers being automated users, etc.


One note I thought of when reading the last use case about 
"incompatible check": should the framework have a silent recovery 
mechanism when asked to send a document through an irrelevant 
component/observer? I suspect that there will be "default" observers in 
the interface (markup, css), so it would be nice if one could send a 
css file without getting an error message about the file not having 
markup to check.

This is related to one of the first questions, "who will parse the 
document". I think one answer to that is that the central module will 
always know the mime type of the document being checked, and based on 
that, it will be able to either:

- dispatch to observers it knows will be relevant (e.g. type is SVG -> 
send to XML well-formedness checker, SVG conformance checker, CSS too 
perhaps, etc.)
- try to pre-parse documents for which it has a basic knowledge, and 
then dispatch (e.g. HTML)
- consider the type to be outside of its limits (e.g. some gif image - 
no relevant observer exists)
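
The three cases above, together with the silent recovery mentioned 
earlier, could be sketched like this in Python. Everything named here 
is hypothetical (the registry contents, the observer names, the plan() 
function); it only shows the central module deciding from the mime 
type, and dropping irrelevant observers instead of raising an error.

```python
# Hypothetical registry: mime type -> observers relevant for that type.
OBSERVERS_BY_TYPE = {
    "image/svg+xml": ["xml-wellformedness", "svg-conformance", "css"],
    "text/html": ["markup", "css"],
    "text/css": ["css"],
}

# Types the central module knows well enough to pre-parse before dispatch.
PREPARSED_TYPES = {"text/html"}


def plan(mime_type, requested=None):
    """Decide which observers to call for a document of the given type.

    requested is the list of observers selected in the interface, or
    None to mean "everything relevant". Requested observers that do not
    apply to this type are recorded as skipped, not reported as errors.
    """
    relevant = OBSERVERS_BY_TYPE.get(mime_type, [])
    if requested is None:
        selected, skipped = list(relevant), []
    else:
        selected = [name for name in requested if name in relevant]
        skipped = [name for name in requested if name not in relevant]
    return {
        "preparse": mime_type in PREPARSED_TYPES,
        "observers": selected,
        "skipped": skipped,
    }


# A css file sent with the default "markup" and "css" observers ticked:
css_plan = plan("text/css", ["markup", "css"])
# A type outside the framework's limits: nothing to run.
gif_plan = plan("image/gif")
print(css_plan, gif_plan)
```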

now quickly going through other questions...

[[ How observers should interact with each others and with UniCORN? ]]

I think a centralized solution would be much simpler. It doesn't mean 
that there cannot be some complex sequence. For example, we could think 
that observer B will only be called if observer A has returned a 
certain type of observation, on either the whole document or a specific 
fragment. But making observers independently communicate would 
certainly make things too complicated.
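
A rough Python sketch of such a centrally driven sequence, under my own 
assumptions: observers are plain callables returning a list of 
observation types, and the central module holds rules of the form 
"if A reported X, then also run B". The names run_central, the rule 
tuples and the two toy observers are invented for the example.

```python
def run_central(document, observers, initial, rules):
    """Run observers from one central loop; observers never call each other.

    observers: name -> callable(document) returning a list of
               observation types (strings).
    initial:   names of the observers to start with.
    rules:     (source, observation_type, follow_up) tuples: when source
               reports observation_type, follow_up is scheduled.
    """
    results = {}
    queue = list(initial)
    while queue:
        name = queue.pop(0)
        if name in results:      # each observer runs at most once
            continue
        results[name] = observers[name](document)
        for source, kind, follow_up in rules:
            if source == name and kind in results[name]:
                queue.append(follow_up)
    return results


# Toy observers, for illustration only.
observers = {
    "xml-wellformedness":
        lambda doc: ["well-formed"] if doc.startswith("<") else ["not-well-formed"],
    "svg-conformance": lambda doc: ["checked"],
}
rules = [("xml-wellformedness", "well-formed", "svg-conformance")]

ok = run_central("<svg/>", observers, ["xml-wellformedness"], rules)
bad = run_central("oops", observers, ["xml-wellformedness"], rules)
print(list(ok), list(bad))
```

Here the SVG conformance observer is only called when the 
well-formedness observer reported "well-formed", and all of that 
knowledge lives in the rules held by the central module.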

Yves may have some insight on this, with Web Services in mind.

[[ Which implementation for the framework? ]]

fortran.

[[ Use of WSDL/WADL ]]

My (limited) knowledge of WSDL tells me that it should work to define 
the service contract between the central module and observers. Do talk 
to Yves about it, he will certainly know documents to point you to in 
order to understand WSDL better for your needs.



All for now... Does that provide you with material to make progress?
cheers,
-- 
olivier
Received on Wednesday, 17 May 2006 01:29:32 UTC