Re: Bug in Validator - more vivid example

From: olivier Thereaux <ot@w3.org>
Date: Thu, 26 Jul 2007 11:46:10 +0900
Message-Id: <50323B15-91E6-4482-A77F-32144D308716@w3.org>
Cc: <www-validator@w3.org>
To: Artemy Lomov <artemy@lomov.ru>

Hello Artemy, all,

On Jul 25, 2007, at 15:10 , Artemy Lomov wrote:
> Even this page is 'not valid' XHTML:
>
> http://validator.w3.org/check?uri=http%3A%2F%2Fvalidator.w3.org%2F:
>
> ***
>
> Failed validation, 4 Errors
>
> Line 425, Column > 80: XML Parsing Error: Entity 'nbsp' not defined.
> Line 425, Column > 80: XML Parsing Error: Entity 'nbsp' not defined.
> Line 450, Column > 80: XML Parsing Error: Entity 'copy' not defined.
> Line 451, Column > 80: XML Parsing Error: Entity 'reg' not defined.

This bug was hard to hunt down and hard to reproduce: reloading the
same validation page would sometimes give different results.

We found that the libxml2-based parser was fetching a lot of schema
and entity files, and as a result the validator was sometimes
temporarily banned by the www.w3.org servers. While it was banned,
entities could not be dereferenced, and the parser would throw errors.

We fixed the issue by not letting the XML parser fetch remote
DTD/entity files, and by filtering out errors about undefined
entities. The fix is now in production.
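
For the curious, the filtering half of that fix can be sketched roughly
as below. This is only an illustration in Python, not the validator's
actual code: the function name, the regular expression and the third
sample message are all made up for the example, and the pattern simply
mirrors the wording of the errors quoted above.

```python
import re

# Matches messages shaped like the ones quoted earlier in this thread,
# e.g. "XML Parsing Error: Entity 'nbsp' not defined."
UNDEFINED_ENTITY = re.compile(r"Entity '[^']+' not defined")


def filter_undefined_entity_errors(messages):
    """Return only the messages that are not undefined-entity complaints."""
    return [m for m in messages if not UNDEFINED_ENTITY.search(m)]


reported = [
    "XML Parsing Error: Entity 'nbsp' not defined.",
    "XML Parsing Error: Entity 'copy' not defined.",
    # Hypothetical unrelated error, included to show it is kept.
    "XML Parsing Error: Opening and ending tag mismatch.",
]
remaining = filter_undefined_entity_errors(reported)
print(remaining)
```

Once the parser no longer loads the DTD, undefined-entity errors say
nothing about the document itself, which is why dropping them is
reasonable here while every other error is still reported.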

-- 
olivier
Received on Thursday, 26 July 2007 02:45:43 GMT
