checklink: checklink losing on member-only documents (and/or unhelpful diagnostic message)

Attempting to run the link checker on

     http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.html

I get as result a page with (among other things, none of them the
results I was expecting) the message

     Status: 401 Authorization Required WWW-Authenticate: Basic realm="W3CACL"
     Connection: close Content-Language: en Content-Type: text/html; charset=utf-8"

Further down, there is what appears to be a diagnostic message:

    You need "W3CACL" access to http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.html
    to perform link checking.

    This service has been configured to send authentication only to hostnames matching
    the regular expression(?i-xsm:^www\.w3\.org$)

I am confused.  The diagnostic message is quite right:  you DO need
W3CACL access to check the links in the document.  And I have it.
If the link checker doesn't have a password and userid, it is welcome
to ask me for one (as for example the validator currently does, with
success); it isn't doing that.
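
For reference, the exchange expected here is ordinary HTTP Basic authentication: the server answers 401 with a WWW-Authenticate challenge naming the realm, and the client retries with an Authorization header built from a userid and password. The following is only a minimal sketch of those two steps in Python; it is not the link checker's code (which is written in Perl), and the function names parse_challenge and basic_authorization are hypothetical, chosen for illustration.

```python
import base64


def parse_challenge(header_value):
    """Split a WWW-Authenticate value into (scheme, realm).

    Handles only the simple form seen in this report,
    e.g. 'Basic realm="W3CACL"'; it is not a full header parser.
    """
    scheme, _, params = header_value.partition(" ")
    realm = None
    for part in params.split(","):
        name, _, value = part.strip().partition("=")
        if name.lower() == "realm":
            realm = value.strip('"')
    return scheme, realm


def basic_authorization(user, password):
    """Build the Authorization header value for HTTP Basic authentication.

    The credentials are 'user:password', base64-encoded.
    """
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

A client that receives the challenge quoted above would call parse_challenge('Basic realm="W3CACL"'), prompt the user for credentials for that realm, and resend the request with the value returned by basic_authorization.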

I think the link checker used to be able to check W3C member-only
documents without problem (I have hard-coded links for checking WG
documents I need to check regularly); if it is still intended to be
able to check member-only documents, it appears to me that something
is broken.

If the service is not currently broken (or even if it is), I think it
would be more helpful if the diagnostic message made clearer what the
user needs to do in order to enable the link checker to check the
document in question.  After reading the diagnostic message, I do not
understand what the message is trying to tell me.

Does it think I do not have W3CACL access to the URI indicated?
If so, why does it think so?  Does it think that *it* does not have
access to the URI indicated?  If so, why?

Does it think that the hostname in the URI indicated matches the
regular expression given?  (In which case, I assume it's sending
authentication messages but encountering some failure.)  Or does
it think that the hostname "www.w3.org" does not match the regular
expression given?  (In which case, why on earth not?) (And for that
matter, the expression given is not a regular expression as that term
is defined by any book on formal languages I have ever read --
I assume the line noise at the beginning is some Perl-inspired
excrescence.  Perhaps you should say "only to hostnames
matching the regular expression (in Perl notation) ..." ?)
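
To make the question concrete: the prefix (?i-xsm: is how Perl prints a compiled pattern, with case-insensitive matching switched on and the x, s and m modifiers switched off, so the pattern itself is just ^www\.w3\.org$ matched without regard to case. The sketch below restates that check in Python under the assumption that the service compares the pattern against the hostname part of the URI; the names AUTH_HOST_PATTERN and may_send_credentials are hypothetical and not taken from the link checker.

```python
import re
from urllib.parse import urlsplit

# Python rendering of the Perl pattern (?i-xsm:^www\.w3\.org$):
# case-insensitive, no extended, dot-all or multi-line behaviour.
AUTH_HOST_PATTERN = re.compile(r"^www\.w3\.org$", re.IGNORECASE)


def may_send_credentials(url):
    """Return True if the URL's hostname matches the configured pattern.

    Assumption: the check is applied to the hostname only,
    not to the whole URI.
    """
    host = urlsplit(url).hostname or ""
    return AUTH_HOST_PATTERN.search(host) is not None
```

Under that assumption the hostname of the URI in this report does match, so a mismatch on the hostname would not by itself explain the failure.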

I'm sorry not to be able to provide more useful information about
the problem.  If there are any further details I can usefully provide,
please let me know.

Michael Sperberg-McQueen


-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************

Received on Sunday, 19 April 2009 03:55:50 UTC