Re: Link Checker Not Following Redirect (or some other problem?)

From: Joseph Reagle <reagle@w3.org>
Date: Fri, 6 Dec 2002 11:45:20 -0500
To: Ville Skyttä <ville.skytta@iki.fi>
Cc: www-validator@w3.org
Message-Id: <200212061145.20789.reagle@w3.org>

On Friday 06 December 2002 07:35 am, Ville Skyttä wrote:
> On Fri, 2002-12-06 at 00:40, Joseph Reagle wrote:
> > Looking at:
> > http://validator.w3.org/checklink?uri=http://www.w3.org/Encryption/2001/Drafts/xmlenc-core/Overview.html&recursive=on
> >
> > 1. It looks like it's checking links for '/' and '/Overview.html' (with
> > recursive on)?
>
> Yes, when invoked from the web, checklink limits the recursion scope so
> that when recursively checking <http://foo.bar.org/quux/something>, only
> documents below <http://foo.bar.org/quux/> are checked.  When invoked
> from command line, the -l option can be used for specifying the scope.

Is there an easy way to have it not recurse when it's the same resource,
as when an index.html/Overview.html is served by the Web server for the
directory?
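
The scoping rule Ville describes, and the "same resource" question above, can be sketched roughly as follows. This is an illustrative Python sketch, not checklink's actual (Perl) code: the function names and the INDEX_NAMES list are assumptions, and a checker cannot really know which file a server uses as its directory index without being told.

```python
from urllib.parse import urlsplit

# Assumption: file names the server might serve for a bare directory URL.
INDEX_NAMES = ("index.html", "Overview.html")

def scope_base(start_url):
    """Return the 'directory' of start_url: everything up to the last '/'."""
    parts = urlsplit(start_url)
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"

def in_scope(start_url, candidate):
    """True if candidate lies at or below the directory of start_url."""
    return candidate.startswith(scope_base(start_url))

def canonical(url):
    """Map '/dir/Overview.html' and '/dir/' to the same key.

    Query strings and fragments are dropped in this sketch.
    """
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    path = head + "/" if last in INDEX_NAMES else parts.path
    return f"{parts.scheme}://{parts.netloc}{path or '/'}"
```

A recursive checker could keep a set of canonical(url) values it has already visited and skip any in-scope URL whose key is already in that set.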

> > 2. It's complaining about two fragment identifiers as discussed below.
>
> Hmm.  Just guessing; I think it does follow the redirect, but doesn't
> grok the response.  When retrieving <http://www.w3.org/2000/09/xmldsig>,
> I always get redirected (303) to
> <http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.xsd>,
> which is served as application/xml.  Checklink only checks text/html
> and application/xhtml+xml docs, so it just reports the anchors as broken.

Does checklink ask for text/html and application/xhtml+xml docs? I
content-negotiate on the namespace redirect [1], so if you ask for HTML,
that's what you get; if you ask for application/xml, you'll get the schema
and such. The default is the HTML document, though, so if you're not
sending any Accept header, maybe I have a bug?

> Assuming my guess is correct, the "coming soon fix" would be not to
> report anchors broken when they weren't even checked, but to confess what
> actually did (not) happen.

That would be useful.

> Regarding the Accept header, perhaps checklink should send something like
>
>   Accept: application/xhtml+xml, text/html, */*;q=0.5

Yes please! <smile/>
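
For illustration, here is a minimal Python sketch of a client carrying the proposed header, plus a small parser showing how the q-values rank the media types. checklink itself is written in Perl; build_request and parse_accept are hypothetical names used only for this example, and the request is built but never sent.

```python
from urllib.request import Request

# The header value proposed in the message above.
ACCEPT = "application/xhtml+xml, text/html, */*;q=0.5"

def build_request(url):
    """Build (but do not send) a GET request carrying the Accept header."""
    return Request(url, headers={"Accept": ACCEPT})

def parse_accept(value):
    """Split an Accept header into (media type, q) pairs; q defaults to 1.0."""
    prefs = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip() == "q":
                q = float(val)
        prefs.append((parts[0], q))
    return prefs
```

With this header, both HTML types carry the default weight of 1.0, while anything else is accepted at the lower weight of 0.5.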


[1]
RewriteEngine On
RewriteBase /2000/09

RewriteCond  %{HTTP_ACCEPT}   text/xml
RewriteRule  ^xmldsig$        http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.xsd [R=303,L]

RewriteCond  %{HTTP_ACCEPT}   application/xml
RewriteRule  ^xmldsig$        http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.xsd [R=303,L]

RewriteCond  %{HTTP_ACCEPT}   application/xml-dtd
RewriteRule  ^xmldsig$        http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.dtd [R=303,L]

RewriteCond  %{HTTP_ACCEPT}   text/html
RewriteRule  ^xmldsig$        http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/Overview.html [R=303,L]

RewriteCond  %{HTTP_ACCEPT}   application/xhtml+xml
RewriteRule  ^xmldsig$        http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/Overview.html [R=303,L]

RewriteRule  ^xmldsig$        http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/Overview.html [R=303,L]
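
To see which target a given Accept header would select under rules like [1], the first-match logic can be approximated in Python. This is only a sketch under stated assumptions: redirect_target is a hypothetical name, Python's re module stands in for the server's regex engine, and each RewriteCond pattern is treated as an unanchored regular expression tested in order, with the last rule as the unconditional default.

```python
import re

BASE = "http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/"

# (Accept pattern, target) pairs in the same order as the rules in [1];
# None marks the final, unconditional rule.
RULES = [
    (r"text/xml", "xmldsig-core-schema.xsd"),
    (r"application/xml", "xmldsig-core-schema.xsd"),
    (r"application/xml-dtd", "xmldsig-core-schema.dtd"),
    (r"text/html", "Overview.html"),
    (r"application/xhtml+xml", "Overview.html"),
    (None, "Overview.html"),
]

def redirect_target(accept):
    """Return the URL of the first rule whose pattern matches the header."""
    for pattern, target in RULES:
        if pattern is None or re.search(pattern, accept):
            return BASE + target
```

Under these assumptions, an empty Accept header and the header proposed above both land on Overview.html. Two details may be worth checking against the real server: an Accept value of application/xml-dtd is matched by the earlier application/xml pattern first, and the unescaped "+" in application/xhtml+xml acts as a quantifier rather than a literal plus sign, so that condition would not match by itself (the default rule gives the same target anyway).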
Received on Friday, 6 December 2002 11:45:23 GMT
