Re: Link Checker Not Following Redirect (or some other problem?)

On Fri, 2002-12-06 at 18:45, Joseph Reagle wrote:

> > Yes, when invoked from the web, checklink limits the recursion scope so
> > that when recursively checking <>, only
> > documents below <> are checked.  When invoked
> > from command line, the -l option can be used for specifying the scope.
> Is there an easy way to have it now recurse when it's the same resource when 
> an index.html/Overview.html is used by the Web server?

That's not relevant.  What matters is the "base" of the URI: checking
<> limits the scope such that *only*
URIs that begin with <> will be checked.  There
have been reports about checklink wandering even to other sites, though,
so bugs may be lurking here.
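
As a rough illustration of the scoping rule, here is a minimal Python
sketch.  It is not checklink's actual (Perl) code.  The names scope_base
and in_scope and the example.org URIs are made up, and the sketch assumes
the base is the start URI cut back to the last "/" of its path.

```python
from urllib.parse import urlsplit, urlunsplit


def scope_base(start_uri: str) -> str:
    """Cut the start URI back to the last '/' of its path.

    Query and fragment are dropped. This is an assumed definition of
    the "base", used only for illustration.
    """
    parts = urlsplit(start_uri)
    path = parts.path or "/"
    base_path = path[: path.rfind("/") + 1]
    return urlunsplit((parts.scheme, parts.netloc, base_path, "", ""))


def in_scope(candidate: str, base: str) -> bool:
    """A candidate URI is in scope only if it begins with the base."""
    return candidate.startswith(base)


base = scope_base("http://example.org/docs/Overview.html")
print(base)
print(in_scope("http://example.org/docs/part2.html", base))
print(in_scope("http://example.org/other/page.html", base))
```

With a pure prefix test like this, a document whose links all point outside
the base yields almost no recursion, which matches the behaviour described
for the test doc.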

And regarding your test doc at
the only URI in that document which is in the recursion scope is
<> (which happens
to be the same resource), hence there's little recursion :)

> Does checklink ask for text/html and application/xhtml+xml docs?

Currently, no Accept headers are sent at all.  This will be changed in
CVS soon.
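
For illustration only, a HEAD request carrying an Accept header could be
built like this in Python.  The media types and q-values are assumptions
picked for the example, not what checklink sends or will send, and
build_head_request and the example.org URI are hypothetical.

```python
from urllib.request import Request


def build_head_request(uri: str) -> Request:
    """Build (but do not send) a HEAD request that prefers HTML/XHTML."""
    # Assumed preference list; the real values are a design decision.
    accept = "text/html, application/xhtml+xml;q=0.9, */*;q=0.5"
    return Request(uri, headers={"Accept": accept}, method="HEAD")


req = build_head_request("http://example.org/2000/09/xmldsig")
print(req.get_method(), req.get_header("Accept"))
```

Nothing goes over the network here; passing req to
urllib.request.urlopen would actually send it.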

> I content 
> negotiate on the namespace redirect [1] so if you ask for html, that's what 
> you get, if you ask for application/xml you'll get the schema and such. The 
> default is the html document though, so if you're not sending any accept 
> header, maybe I have a bug?

This logic seems to work as such.  But note that the rewrite rules snip
the fragment from the redirect URIs [1], [2].  I don't know if this is
ok.  And if the fragments were in the redirect URIs, would
make any sense?
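
One possible client-side way to cope with redirects that drop the fragment
is to carry the original fragment over to the redirect target.  The Python
sketch below shows that idea under stated assumptions: redirect_target is a
hypothetical helper, the example.org URIs are invented, and whether
reattaching the fragment is the right behaviour is exactly the open
question above.

```python
from urllib.parse import urldefrag, urljoin


def redirect_target(request_uri: str, location: str) -> str:
    """Resolve a Location value, keeping the request's fragment if the
    redirect target does not carry one of its own."""
    # Resolve a possibly relative Location against the requested URI.
    target = urljoin(request_uri, location)
    _, original_fragment = urldefrag(request_uri)
    _, target_fragment = urldefrag(target)
    if original_fragment and not target_fragment:
        target = f"{target}#{original_fragment}"
    return target


print(redirect_target("http://example.org/2000/09/xmldsig#sha1", "/TR/spec/"))
print(redirect_target("http://example.org/2000/09/xmldsig#sha1", "/schema.xsd#top"))
```

A fragment supplied by the redirect itself wins over the original one in
this sketch; that choice is an assumption too.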

To be safe, I would probably go for proxying instead of redirecting, if

[1]
HEAD /2000/09/xmldsig#sha1 HTTP/1.0
Accept: text/html
--> Location:

[2]
HEAD /2000/09/xmldsig#sha1 HTTP/1.0
Accept: text/xml
--> Location:

\/ille Skyttä
ville.skytta at

Received on Friday, 6 December 2002 17:51:39 UTC