Re: libxml2 errors list for markup validator

From: olivier Thereaux <ot@w3.org>
Date: Wed, 18 Feb 2009 13:40:02 -0500
Cc: "public-qa-dev@w3.org list" <public-qa-dev@w3.org>
Message-Id: <0613FB03-DCA5-4F0D-9E25-4085C1D35BEC@w3.org>
To: Karl Dubost <karl+w3c@la-grange.net>

Hi Karl,

Thanks for your research! It's not all good news, but it really helps.

What I understand from your findings so far is that we don't have a
1-1 mapping/definition of all the libxml2 errors in one place. That's
going to be a bit more painful than we hoped.

I looked at a case not in the list you found:
http://qa-dev.w3.org/wmvs/HEAD/check?uri=http://qa-dev.w3.org/wmvs/HEAD/dev/tests/2689-attribute-no-space.xhtml;ss
yields an error titled "attributes construct error" (a fairly unhelpful
error message, typically something we would want to improve).
The libxml module tells me that it is error 65.

Firing up my text editor, I looked for "attributes construct error" in  
the codebase for libxml2 and found:
[[
	if (!IS_BLANK_CH(RAW)) {
	    xmlFatalErrMsg(ctxt, XML_ERR_SPACE_REQUIRED,
			   "attributes construct error\n");
	}
]] -- parser.c


This is particularly interesting to me:
* it shows me where the "attributes construct error" text came from
* it shows that the actual code for the error is XML_ERR_SPACE_REQUIRED
... and frankly, knowing that the error comes from missing space  
(between attributes) is hugely useful there

I think I'm starting to see a pattern here:
1) some errors have their text defined in a nice, clean way in  
parser.c (static void xmlFatalErr)
2) other errors are fuzzier, maybe because some error codes are
actually shared between cases? I could imagine that
XML_ERR_SPACE_REQUIRED is shared between attribute error cases
and other cases, and indeed I just found:

[[
	if (!IS_BLANK_CH(CUR)) {
	    htmlParseErr(ctxt, XML_ERR_SPACE_REQUIRED,
	                 "Space required after 'SYSTEM'\n", NULL, NULL);
	}
]] -- HTMLparser.c

This is not really great for us, but I think we could find ways to  
work around the problem.
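
One possible workaround, as a rough sketch only: since a single code
such as XML_ERR_SPACE_REQUIRED (reported as 65 above) covers several
situations, we could key our own explanations on the pair (code, raw
message) rather than on the code alone. Everything named below
(friendly_error, friendly_errors, friendly_message, and the friendlier
wording itself) is hypothetical and made up for illustration; none of
it is libxml2 API or existing validator code.

```c
#include <stddef.h>
#include <string.h>

/* Numeric value reported for XML_ERR_SPACE_REQUIRED in this thread. */
#define SPACE_REQUIRED_CODE 65

/* Hypothetical table entry: (code, raw libxml2 message) -> friendlier text. */
struct friendly_error {
    int code;
    const char *raw;      /* message as found in the libxml2 source, no newline */
    const char *friendly; /* illustrative wording only */
};

/* The two cases quoted above share one code but differ in message. */
static const struct friendly_error friendly_errors[] = {
    { SPACE_REQUIRED_CODE, "attributes construct error",
      "Whitespace is required between attributes." },
    { SPACE_REQUIRED_CODE, "Space required after 'SYSTEM'",
      "Whitespace is required after the SYSTEM keyword." },
};

/* Length of msg, ignoring any trailing newline characters. */
static size_t trimmed_length(const char *msg)
{
    size_t n = strlen(msg);
    while (n > 0 && (msg[n - 1] == '\n' || msg[n - 1] == '\r'))
        n--;
    return n;
}

/*
 * Return the friendlier text for a (code, raw message) pair, or NULL
 * when the pair is unknown so the caller can fall back to the raw text.
 */
const char *friendly_message(int code, const char *raw)
{
    size_t i, n;

    if (raw == NULL)
        return NULL;
    n = trimmed_length(raw);
    for (i = 0; i < sizeof friendly_errors / sizeof friendly_errors[0]; i++) {
        const struct friendly_error *e = &friendly_errors[i];
        if (e->code == code && strlen(e->raw) == n &&
            strncmp(e->raw, raw, n) == 0)
            return e->friendly;
    }
    return NULL;
}
```

With this, friendly_message(65, "attributes construct error\n") and
friendly_message(65, "Space required after 'SYSTEM'\n") give two
different explanations even though the code is the same. The obvious
weakness is that the table depends on the exact message strings, which
could change between libxml2 releases.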

More on this later.

-- 
olivier
Received on Wednesday, 18 February 2009 18:40:11 GMT
