Re: JSON output for Markup Validator

Hello Henri, Hi all,

On 14-Jul-08, at 6:03 AM, Henri Sivonen wrote:
> * json_output.tmpl doesn't use the forward-compatible message typing  
> of the Validator.nu format. json_output.tmpl uses
> "type": "warning",
> for warnings whereas the Validator.nu format uses
> "type"   : "info",
> "subtype": "warning",
> in order to keep the semantics of "type" frozen for forward  
> compatibility while allowing extensibility in "subtype". It would be  
> nice if the W3C Validator adopted the forward-compatible type/ 
> subtype scheme.

No strong opinion either way, but I fail to see why using
type="error,warning,info" would hamper extensibility. I honestly have
no idea what a "forward-compatible" format is, which may explain my
puzzlement here. Could you explain the rationale?
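
To make the question concrete, here is my guess at what is meant, as a
rough sketch and nothing more. I am assuming that the frozen "type"
set contains at least "error" and "info" (the only two values that
appear in this thread), and that a client written today never learns
about values added later. The function classify() and the future value
"notice" are hypothetical names made up for illustration.

```python
# Sketch only: classify() and the "notice" value are hypothetical.
# Assumption: an older client knows just these frozen "type" values.
KNOWN_TYPES = {"error", "info"}


def classify(message):
    """Return the bucket an older client would file a message under."""
    msg_type = message.get("type")
    if msg_type not in KNOWN_TYPES:
        # A new top-level type is opaque to an old client.
        return "unknown"
    if msg_type == "info" and message.get("subtype") == "warning":
        return "warning"
    # Unknown subtypes fall back to the frozen type.
    return msg_type


# Flat scheme: a new kind of message needs a new "type" value.
flat = {"type": "notice"}
# Type/subtype scheme: the new kind only adds a "subtype".
nested = {"type": "info", "subtype": "notice"}

print(classify(flat))
print(classify(nested))
```

If that reading is right, the old client can still file the second
message as "info", while the first one is something it cannot place
at all.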

> * It seems that "extract" contains an HTML-escaped snippet intended  
> for inclusion as part of the HTML output and marks the point of  
> interest with <strong title=\"Position where error was detected.\">  
> and </strong>. The Validator.nu format assumes that the extract is  
> not HTML-escaped and the point of interest is communicated using the  
> "hiliteStart" and "hiliteLength" entries.

Seems like the highlighting routine in check should be made
format-independent, or at least take the format as an option. Karl,
are you on this, or should we work on it together?
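
As a starting point for discussion, the conversion from the current
HTML-flavoured extract to a plain extract plus offsets could look
roughly like the sketch below. It is Python rather than the Perl used
in check, the function name split_extract() is made up, and I am
assuming that "hiliteStart" is a 0-based offset into the unescaped
extract, counted in characters; that should be checked against the
Validator.nu format description.

```python
import html
import re

# Matches the marker the HTML output wraps around the point of interest.
MARKER = re.compile(r'<strong title="[^"]*">(.*?)</strong>', re.DOTALL)


def split_extract(escaped):
    """Turn an HTML-escaped extract into plain text plus highlight offsets.

    Assumption: hiliteStart is 0-based and counted in characters.
    """
    match = MARKER.search(escaped)
    if match is None:
        # No marker present: return the unescaped text only.
        return {"extract": html.unescape(escaped)}
    before = html.unescape(escaped[:match.start()])
    marked = html.unescape(match.group(1))
    after = html.unescape(escaped[match.end():])
    return {
        "extract": before + marked + after,
        "hiliteStart": len(before),
        "hiliteLength": len(marked),
    }


sample = ('&lt;p&gt;foo <strong title="Position where error was '
          'detected.">&amp;</strong> bar')
print(split_extract(sample))
```

A cleaner fix would be to have the highlighting routine produce the
offsets directly instead of parsing its own HTML output back.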

> * It seems that 0-based column counts are emitted, although the  
> format uses 1-based column counts. That is, 1 should be added to the  
> column.

I don't think the columns are 0-based in general, such that 1 would
need to be added, but indeed in some cases (missing doctype) the
parser yields a message at line 1, col 0. That could/should be handled
in the error generation routine.
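
One possible way to deal with it, sketched in Python for brevity: leave
columns alone unless the parser reports 0, and clamp that case to 1.
This assumes the other columns are already 1-based, which is not
settled in this thread. The key names "lastLine" and "lastColumn" and
the function name normalize_position() are assumptions made for the
example.

```python
def normalize_position(message):
    """Return a copy of message with a column of 0 clamped to 1.

    Assumption: every other column value is already 1-based.
    """
    fixed = dict(message)
    column = fixed.get("lastColumn")
    if column is not None and column < 1:
        fixed["lastColumn"] = 1
    return fixed


# The missing-doctype case mentioned above: line 1, col 0.
print(normalize_position({"lastLine": 1, "lastColumn": 0}))
```

If the columns turn out to be 0-based throughout, adding 1 everywhere
would be the right fix instead.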

> * The "explanation" and "messageid" keys aren't part of the format  
> spec, but I'd be happy to add them as optional with a note that they  
> are emitted by the W3C Validator.

Any opinion on what format to adopt there? Would escaped HTML (as it  
is now) make sense?

> Currently, json_output.tmpl outputs "messageid" as a string even  
> though the string always contains a formatted number.

It can also be a non-numeric string.

> I think it would make sense to keep it that way, since if  
> Validator.nu adds message ids in the future, the ids will likely be  
> strings--not numbers.

Indeed.

Thanks,
-- 
olivier

Received on Monday, 14 July 2008 11:38:23 UTC