- From: olivier Thereaux <ot@w3.org>
- Date: Wed, 3 Jan 2007 14:36:34 -0500
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: Karl Dubost <karl@w3.org>, www-validator Community <www-validator@w3.org>
Hello Henri,

I have seen your assessment of the validator formats, and find it interesting, partly because it echoes some of the mild gripes I have with these formats (I did not come up with them, FWIW), partly because I disagree with you but find your analysis interesting nevertheless.

On Dec 18, 2006, at 05:13 , Henri Sivonen wrote:
> On Dec 18, 2006, at 08:42, Karl Dubost wrote:
>> Could you explain what part is complex?
>> Or what makes their complexity?
>
> * Messages are grouped by type (error, warning, misc) instead of
> just lumping them together in the order the messages were generated
> in the validation process. (The grouping is redundant and requires
> buffering.)

This reminds me of a lot of discussions I've read and had on the subject. I am afraid there are two schools here that will never be reconciled... One (and I would place myself there) prefers to treat problems sequentially, regardless of their importance. The other would rather fix errors first, then warnings. The former will prefer the output of the markup validator, the latter, that of the CSS validator.

You have a good point that the grouping requires buffering, which is not the most efficient approach. But I think I would like the parser/validator to count the errors anyway, so buffering is not a problem from this perspective.

> * The message groups have double containers (errors and errorList).

That is indeed a bit daft, but I understand it's been made this way to allow for the count of errors, warnings, etc. to be presented to the user.

> * For each message type, the generator of the messages has to
> count the messages and indicate the count before the messages. (The
> message counts are redundant data and generating them requires
> buffering.)

Having seen how convenient it is for the user to know the number of errors without having to count them, I disagree with you here.
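To make the buffering trade-off concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the Message type, the two emitter functions, the sample messages and the plain-text output lines); it does not reproduce the actual output of either validator, only the difference between emitting messages in order and emitting them grouped with a count up front.

```python
from collections import defaultdict
from typing import Iterable, Iterator, NamedTuple


class Message(NamedTuple):
    """Hypothetical validation message, for illustration only."""
    kind: str  # "error" or "warning"
    line: int
    text: str


def emit_streaming(messages: Iterable[Message]) -> Iterator[str]:
    """Emit each message as soon as it is produced; nothing is held back."""
    for m in messages:
        yield f"{m.kind} line {m.line}: {m.text}"


def emit_grouped(messages: Iterable[Message]) -> Iterator[str]:
    """Group by kind and put the count first; the whole input is buffered."""
    groups = defaultdict(list)
    for m in messages:  # no output can be written until the input is exhausted
        groups[m.kind].append(m)
    for kind in ("error", "warning"):
        items = groups.get(kind, [])
        yield f"{kind} count: {len(items)}"
        for m in items:
            yield f"{kind} line {m.line}: {m.text}"


# Hypothetical messages, in the order a validator might generate them.
found = [
    Message("warning", 3, "missing DOCTYPE"),
    Message("error", 7, "unclosed element"),
    Message("error", 12, "duplicate id"),
]

for out in emit_grouped(found):
    print(out)
```

The streaming emitter can write its first line before validation has finished, while the grouped emitter cannot write even the error count until the last message has been seen. That is the cost Henri describes, and the count line is the convenience I am defending.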
When it comes to counting or sorting or other such processing, there will always be a tension because neither the producer nor the user of a format wants to spend the CPU cycles. In such a context, isn't it a good idea that the producer, not the user, pays this price if the producer really wants the tool to be used?

> * The formats echo information that the client already knows such
> as the URI of the validator, the URI of the input or in the case of
> the Unicorn format, the date.
>
> * The formats have unnecessary telescoping envelope elements. (A
> SOAP 1.2 format message ends with </m:markupvalidationresponse></
> env:Body></env:Envelope>, where </env:Body></env:Envelope> is just
> cruft.)
>
> * The SOAP format has SOAP namespace cruft. The Unicorn format has
> XSD cruft.

I think these are reasonably cheap, especially if the benefit is being processable by more engines (SOAP-enabled ones, schema-based parsers, etc.).

> * The formats represent line and column numbers as text content of
> elements as opposed to attributes.

I can't parse this. Please explain.

> * The formats require a boolean pass/fail proclamation near the
> start of the format. (This is redundant and requires buffering.)

Ditto counting.

> EARL, which I initially missed, also has problems:

I'll skip this, but you may want to talk to the group that makes EARL:
http://www.w3.org/WAI/ER/

Thanks,
--
olivier
Received on Wednesday, 3 January 2007 19:36:31 UTC