
Re: Error message not giving associated filename

From: Ville Skyttä <ville.skytta@iki.fi>
Date: Tue, 23 Jun 2009 21:49:47 +0300
To: www-validator@w3.org
Message-Id: <200906232149.47810.ville.skytta@iki.fi>
> Etienne Miret wrote:
> > the gzip file is corrupt. Hence, gunzip produces no output when given
> > the file, and the validator itself parses an empty document.
>
> Actually, this could be a bug with gunzip, just as it could be a bug
> with your own compression software. After all, Safari and Firefox are
> both able to uncompress the file. I’m afraid I can’t be of much help here.

Don't be so modest, that's quite helpful, at least to me ;)

My guess is that it is a bug in the target site's compression software 
(response headers contain "X-Compressed-By: DotNetNuke-Compression").  gzip 
has been around so long that I'd find it very surprising if it were at fault.

$ curl -s --header "Accept-Encoding: gzip" http://www.jallanstudios.com/ | 
gzip -dc > /dev/null
gzip: stdin: unexpected end of file

Also, from digging into the libraries the validator uses, I found that 
decoding the response fails in the Perl Compress::Zlib module when it does 
its final data length and CRC32 sanity checks.
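
For illustration, here is a rough Python sketch of that kind of final check.  
It is not the Compress::Zlib code; it only assumes the gzip trailer layout 
from RFC 1952 (the last 8 bytes hold the CRC32 and the length modulo 2^32 of 
the uncompressed data, both little-endian) and a single-member stream.  The 
function name trailer_matches is made up for this example.

```python
import gzip
import struct
import zlib


def trailer_matches(raw: bytes, decoded: bytes) -> bool:
    """Check decoded data against the 8-byte gzip trailer (RFC 1952).

    Hypothetical helper for illustration; assumes a single-member stream.
    """
    if len(raw) < 8:
        return False
    # Trailer: CRC32 of the uncompressed data, then its length mod 2**32.
    crc32, isize = struct.unpack("<II", raw[-8:])
    same_crc = (zlib.crc32(decoded) & 0xFFFFFFFF) == crc32
    same_len = (len(decoded) & 0xFFFFFFFF) == isize
    return same_crc and same_len


if __name__ == "__main__":
    body = b"<html><body>hello</body></html>"
    raw = gzip.compress(body)
    print(trailer_matches(raw, body))       # intact stream, full data
    print(trailer_matches(raw, body[:-1]))  # decoded data one byte short
```

If the stream is cut short, the real trailer is missing, so a check like 
this has nothing valid to compare against and the decode is rejected.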

Based on those two findings, I suppose the compressed data from the above URL 
ends prematurely.  gzip decodes data as long as it can, outputs what it got, 
and only then reports the above error; I suppose Safari and Firefox do the 
same.  The libwww-perl and Compress-Zlib libraries used by the validator 
currently behave differently: the validator gets either all of the 
decompressed data or nothing.
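
The two behaviours can be reproduced with a small sketch.  This is Python, 
not the Perl libraries the validator uses, and the function names 
decode_all_or_nothing and decode_best_effort are made up for the example; it 
only shows how a strict decoder and a streaming decoder treat the same 
truncated gzip data.

```python
import gzip
import zlib


def decode_all_or_nothing(data: bytes) -> bytes:
    """Strict decode: return the full payload or raise (EOFError if truncated)."""
    return gzip.decompress(data)


def decode_best_effort(data: bytes) -> tuple[bytes, bool]:
    """Streaming decode: return whatever decoded, plus a 'stream complete' flag."""
    # Adding 16 to wbits tells zlib to expect gzip framing.
    d = zlib.decompressobj(zlib.MAX_WBITS | 16)
    out = d.decompress(data)
    return out, d.eof


if __name__ == "__main__":
    payload = "".join(f"<p>line {i}</p>\n" for i in range(2000)).encode()
    full = gzip.compress(payload)
    truncated = full[: len(full) // 2]  # simulate a response that ends early

    out, complete = decode_best_effort(truncated)
    print(f"best effort: {len(out)} of {len(payload)} bytes, complete={complete}")

    try:
        decode_all_or_nothing(truncated)
    except EOFError as exc:
        print(f"strict: nothing decoded ({exc})")
```

The best-effort variant is roughly what gzip on the command line does; the 
strict variant is closer to what the validator sees at the moment.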

Anyway, I added better error handling for this kind of decode failure in CVS; 
try validating the above URL with http://qa-dev.w3.org/wmvs/HEAD/ if 
interested.  I also added an RFE for Compress::Zlib so it would report more 
errors in a way that libwww-perl could use and pass on to us: 
http://rt.cpan.org/Public/Bug/Display.html?id=47283
Received on Tuesday, 23 June 2009 18:50:25 GMT
