Re: JSON output for Markup Validator

Hi,

On Jul 14, 2008, at 06:29, Karl Dubost wrote:

> I decided to look at the [documentation][3] of validator.nu and its  
> JSON output. And I have created a [JSON template][5] and modified  
> accordingly the [check program][4].

Nice!

> Please review, and suggest improvements.
>
>
> [1]: http://dev.w3.org/cvsweb/validator/
> [2]: http://qa-dev.w3.org/wmvs/HEAD/check?output=json&uri=http%3A%2F%2Fyahoo.com
> [3]: http://wiki.whatwg.org/wiki/Validator.nu_JSON_Output
> [4]: http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check
> [5]: http://dev.w3.org/cvsweb/validator/share/templates/en_US/json_ouput.tmpl


Some observations:

  * json_ouput.tmpl doesn't use the forward-compatible message typing
of the Validator.nu format. For warnings, json_ouput.tmpl emits

    "type": "warning",

whereas the Validator.nu format uses

    "type"   : "info",
    "subtype": "warning",

This keeps the semantics of "type" frozen for forward compatibility
while allowing extensibility in "subtype". It would be nice if the
W3C Validator adopted the forward-compatible type/subtype scheme.

  * It seems that "extract" contains an HTML-escaped snippet intended  
for inclusion as part of the HTML output and marks the point of  
interest with <strong title=\"Position where error was detected.\">  
and </strong>. The Validator.nu format assumes that the extract is not  
HTML-escaped and the point of interest is communicated using the  
"hiliteStart" and "hiliteLength" entries.

  * json_ouput.tmpl uses "firstColumn" instead of "lastColumn". I could
not check whether this matters, because I was unable to provoke an
error position longer than 1 character.

  * It seems that 0-based column counts are emitted, although the  
format uses 1-based column counts. That is, 1 should be added to the  
column.

  * json_ouput.tmpl emits the numeric values as JSON strings as opposed  
to JSON numbers as specced for the Validator.nu format.

  * The "explanation" and "messageid" keys aren't part of the format  
spec, but I'd be happy to add them as optional with a note that they  
are emitted by the W3C Validator. Currently, json_ouput.tmpl outputs  
"messageid" as a string even though the string always contains a  
formatted number. I think it would make sense to keep it that way,  
since if Validator.nu adds message ids in the future, the ids will  
likely be strings--not numbers.
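
To make the observations above concrete, here is a rough Python sketch of
how a message in the current W3C shape could be converted into the
Validator.nu shape. It is only an illustration, not code from either
validator: the function and table names are made up, the input keys
"lastLine" and "message" are assumed, mapping "firstColumn" onto
"lastColumn" assumes a one-character position, and the highlight offsets
are counted in UTF-16 code units starting at 0, which should be checked
against the format documentation [3].

```python
import html
import re

# Marker that the current output wraps around the point of interest.
HILITE_RE = re.compile(r"<strong\b[^>]*>(.*?)</strong>", re.DOTALL)

# Hypothetical table: flat "type" -> (type, subtype) pair.
TYPE_MAP = {
    "error": ("error", None),
    "warning": ("info", "warning"),
    "info": ("info", None),
}


def utf16_len(text):
    """Length of text in UTF-16 code units (assumed unit for offsets)."""
    return len(text.encode("utf-16-le")) // 2


def split_extract(escaped):
    """Return (plain_extract, hilite_start, hilite_length).

    The input is HTML-escaped and may contain one <strong> marker.
    Start and length are None when no marker is present.
    """
    match = HILITE_RE.search(escaped)
    if match is None:
        return html.unescape(escaped), None, None
    before = html.unescape(escaped[:match.start()])
    marked = html.unescape(match.group(1))
    after = html.unescape(escaped[match.end():])
    return before + marked + after, utf16_len(before), utf16_len(marked)


def to_validator_nu(msg):
    """Convert one parsed message dict; the input dict is not modified."""
    out = dict(msg)  # "message", "messageid", "explanation" pass through
    msg_type, subtype = TYPE_MAP.get(msg["type"], (msg["type"], None))
    out["type"] = msg_type
    if subtype is not None:
        out["subtype"] = subtype
    if "lastLine" in msg:
        out["lastLine"] = int(msg["lastLine"])  # JSON string -> number
    if "firstColumn" in msg:
        # 0-based string -> 1-based number, reported as "lastColumn".
        out["lastColumn"] = int(out.pop("firstColumn")) + 1
    if "extract" in msg:
        text, start, length = split_extract(msg["extract"])
        out["extract"] = text
        if start is not None:
            out["hiliteStart"] = start
            out["hiliteLength"] = length
    return out
```

Note that "messageid" is deliberately left as a string, in line with the
last point above.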

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Monday, 14 July 2008 10:03:53 UTC