- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Thu, 23 Apr 2009 17:10:58 -0600
- To: www-validator@w3.org
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
It appears that either the --masquerade option is not working, or the documentation could usefully be revised to make clearer how to use it.

The summary of options at http://search.cpan.org/dist/W3C-LinkChecker/bin/checklink.pod says:

    --masquerade "local remote"
        Masquerade local dir as a remote URI. For example, the following results in /my/local/dir/ being "mapped" to http://some/remote/uri/

        --masquerade "/my/local/dir http://some/remote/uri/"

I understand this to mean that if the document being checked contained a link to (for example)

    http://some/remote/uri/foo.html

then checklink would not attempt to communicate with the remote server, but would check the local filesystem for a file called

    /my/local/dir/foo.html

This would make it convenient to prepare a set of interlinked documents locally, link check them, and correct the errors before uploading them to a public server.

So far so good. That is what I am trying to do. (If this is not what masquerade is intended to do, it suggests an opportunity for improving the man page -- I'll happily suggest wording, if I can ever understand what masquerade does and how it works.)

But using (the equivalent of)

    checklink --masquerade ". http://example.org/x/y/" --masquerade "../z http://example.org/x/z/" doc.html

did not produce the expected results: checklink complained about things being missing from example.org/x/y even though they were present in the current directory. It complained, for example, about a link to http://example.org/x/y/doc.html being a bad link, though doc.html is definitely present in the local directory masquerading as http://example.org/x/y/ -- it's the document being checked.

I concluded that I had misread the documentation, or that there were unexpected constraints on the syntax of the paired arguments. I tried the arguments in various forms; I tried them local-first and remote-first.
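
The reading of --masquerade described above can be sketched as follows. This is a hypothetical illustration in Python, not checklink's implementation: the function name masquerade_lookup, the prefix-matching rule, and the handling of trailing slashes are all assumptions made only to pin down the expected behavior.

```python
import posixpath


def masquerade_lookup(url, mappings):
    """Return the local path a URL is expected to map to, or None.

    mappings is a list of (local_dir, remote_prefix) pairs, in the
    "local remote" order shown in the option summary. This models the
    expectation stated in the report, not what checklink actually does.
    """
    for local_dir, remote_prefix in mappings:
        # Assumption: "http://host/x" and "http://host/x/" are equivalent.
        prefix = remote_prefix.rstrip("/") + "/"
        if url.startswith(prefix):
            relative = url[len(prefix):]
            # Join the remainder of the URL onto the local directory.
            return posixpath.normpath(posixpath.join(local_dir, relative))
    # No mapping applies: the link would be checked over the network.
    return None


if __name__ == "__main__":
    mappings = [
        (".", "http://example.org/x/y/"),
        ("../z", "http://example.org/x/z/"),
    ]
    for link in (
        "http://example.org/x/y/doc.html",
        "http://example.org/x/z/other.html",
        "http://example.org/elsewhere.html",
    ):
        print(link, "->", masquerade_lookup(link, mappings))
```

Under this reading, a link to http://example.org/x/y/doc.html would be resolved against the current directory and never fetched from example.org. The sketch ignores query strings and fragment identifiers.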

I made a test file (attached) named testdoc.html, which has links to http://www.w3.org/XML/Activity.html and to http://www.w3.org/XML/testdoc.html, which does not exist. In the directory containing testdoc.html, there is no Activity.html.

When I run

    checklink --quiet testdoc.html

I am told, as expected, that http://www.w3.org/XML/testdoc.html produces a 404.

When I run

    checklink --quiet --masquerade ". http://www.w3.org/XML/" testdoc.html

I get the same result.

I have run this test case with the local argument in the forms

    "."
    "./"
    "/Users/cmsmcq/2009/misc"
    "/Users/cmsmcq/2009/misc/"
    "file:///Users/cmsmcq/2009/misc"
    "file:///Users/cmsmcq/2009/misc/"

and the remote argument in the forms

    "http://www.w3.org/XML"
    "http://www.w3.org/XML/"

with the arguments in the order remote - local and local - remote. All 24 permutations produce the same result, which suggests that in no case am I succeeding in making masquerading do anything at all.

Are my expectations inconsistent with the intent? Or is the code broken?

One further note: when my bash command was insufficiently escaped, some variants did elicit a complaint about

    Use of uninitialized value in pattern match (m//) at /usr/local/bin/checklink line 201.
    Use of uninitialized value in string eq at /System/Library/Perl/Extras/5.8.8/WWW/RobotRules.pm line 152.
    Use of uninitialized value in string eq at /System/Library/Perl/Extras/5.8.8/WWW/RobotRules.pm line 152.
    Use of uninitialized value in pattern match (m//) at /usr/local/bin/checklink line 201.
    Use of uninitialized value in string eq at /System/Library/Perl/Extras/5.8.8/WWW/RobotRules.pm line 152.
    Use of uninitialized value in string eq at /System/Library/Perl/Extras/5.8.8/WWW/RobotRules.pm line 152.

which suggests a problem on some other path through the code.

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************
Received on Thursday, 23 April 2009 23:11:43 UTC