Re: JSON output for Markup Validator

On Jul 14, 2008, at 14:37, olivier Thereaux wrote:

> On 14-Jul-08, at 6:03 AM, Henri Sivonen wrote:
>> * json_output.tmpl doesn't use the forward-compatible message  
>> typing of the Validator.nu format. json_output.tmpl uses
>> "type": "warning",
>> for warnings whereas the Validator.nu format uses
>> "type"   : "info",
>> "subtype": "warning",
>> in order to keep the semantics of "type" frozen for forward  
>> compatibility while allowing extensibility in "subtype". It would  
>> be nice if the W3C Validator adopted the forward-compatible type/ 
>> subtype scheme.
>
> No strong opinion either way, but I rather fail to see why using  
> type="error,warning,info" would hamper extensibility. I have  
> honestly no idea what a "forward-compatible" format is, which may  
> explain my puzzlement here. Could you detail the rationale?

The forward-compatible processing rules for the format make it  
possible to determine the validation outcome (valid/invalid/ 
indeterminate) by looking at the frozen supertypes of the messages.  
Message subtypes can be added without breaking clients that implement  
the processing model as specified.
http://wiki.whatwg.org/wiki/Validator.nu_JSON_Output#Determining_Outcome

There are fundamentally three types of messages:
  1) Those that indicate a problem outside the document (e.g. out of  
memory). The presence of such a message makes the validation outcome  
indeterminate (and the next steps don't count).
  2) Errors in the document. The presence of such a message makes the  
document invalid.
  3) Informative messages that do not affect the validation outcome.

Warnings do not affect the validation outcome. Hence, they are a  
subclass of type 3 (informative) messages.
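
To make the processing model concrete, here is a minimal sketch of a
client that determines the outcome from the frozen types alone. It
assumes the three types are spelled "non-document-error", "error" and
"info", and that the messages sit in a top-level "messages" array; the
function name and the sample message are hypothetical. Please check the
wiki page above for the normative rules.

```python
import json


def validation_outcome(messages):
    """Classify a run by looking only at the frozen "type" of each message.

    Assumed type names: "non-document-error", "error", "info".
    "subtype" is deliberately never inspected, so new subtypes cannot
    change the result.
    """
    types = {m.get("type") for m in messages}
    # 1) A problem outside the document makes the outcome indeterminate,
    #    regardless of anything else that was reported.
    if "non-document-error" in types:
        return "indeterminate"
    # 2) Otherwise, any error in the document makes it invalid.
    if "error" in types:
        return "invalid"
    # 3) Only informative messages (including warnings) remain.
    return "valid"


# Hypothetical sample: a warning is an "info" message with a subtype.
sample = json.loads(
    '{"messages": [{"type": "info", "subtype": "warning",'
    ' "message": "hypothetical warning text"}]}'
)
print(validation_outcome(sample["messages"]))
```

Because the sketch never reads "subtype", a message with a subtype
invented next year is classified exactly like one emitted today.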

The only thing that makes this a bit ugly is that warnings are so  
common and this scheme uses a subtype for a common case. However,  
changing the scheme at this point would mean thawing a part of the  
format that was supposed to be frozen by design for forward  
compatibility, thereby reneging on the forward-compatibility promise.

>> * The "explanation" and "messageid" keys aren't part of the format  
>> spec, but I'd be happy to add them as optional with a note that  
>> they are emitted by the W3C Validator.
>
> Any opinion on what format to adopt there? Would escaped HTML (as it  
> is now) make sense?

The XML, HTML and XHTML outputs of Validator.nu contain an  
"elaboration", which is similar to the "explanation" in the W3C  
Validator. I omitted the feature from the JSON format, because XHTML  
fragments aren't a natural fit for JSON.

If you want to make explanations available in JSON output despite the  
fragments not being a natural fit for JSON, I suppose putting HTML  
source code fragments in a JSON string is the least bad alternative.  
For completeness, it would be nice to specify how to parse them (e.g.  
by saying that they are parsed using the HTML5 parsing algorithm in  
the fragment mode with "div" as the context element).
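
As a rough illustration of what "HTML source in a JSON string" means
here, the sketch below round-trips such a message through a JSON
encoder. The message text, the markup, the class name and the id value
are all made up; only the "explanation" and "messageid" key names come
from this thread. It does not parse the fragment, since that would need
an HTML5 parser.

```python
import json

# Hypothetical message; the explanation is plain HTML source, not
# HTML-escaped. The JSON encoder takes care of escaping the quotes.
message = {
    "type": "error",
    "message": "hypothetical message text",
    "messageid": "64",  # kept as a string, as discussed below
    "explanation": '<p class="helpwanted">Hypothetical '
                   "<code>explanation</code> markup.</p>",
}

encoded = json.dumps(message)
decoded = json.loads(encoded)

# The consumer gets back exactly the HTML source that was put in and
# can hand it to an HTML fragment parser.
print(decoded["explanation"] == message["explanation"])
```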

>> Currently, json_output.tmpl outputs "messageid" as a string even  
>> though the string always contains a formatted number.
>
> it can also be a string.
>
>> I think it would make sense to keep it that way, since if  
>> Validator.nu adds message ids in the future, the ids will likely be  
>> strings--not numbers.
>
> indeed.

OK.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Thursday, 17 July 2008 11:32:07 UTC