Terje, thanks for looking into my patches again; here's some comments and a new set of them. > * base-css.patch > - Applied (apart from the style for "label" that I forgot, > but will be in the next checkin). Not necessary, I think, see the updated HTML patch below. > * check-cfg.patch: > - Applied and expanded on. Ok. There may be something weird in the CVS, check.cfg and tips.cfg are shown in the Attic? However, checkout of validator-0_6_0 works ok and fetches these files too. http://dev.w3.org/cvsweb/validator/htdocs/config/?hideattic=0&only_with_tag=validator-0_6_0 > * check-protocols.patch: > - Applied. > - Also led to some infrastructure work that was much needed. > I reworked your patch slightly to fit into this new scheme. I realized that this patch was pretty ugly and insufficient, and accidentally found out that the same functionality is actually in libwww-perl 5.53_94 and newer. The new version below still takes advantage of the new "exception framework", but also adds a configuration option for allowed protocols, defaulting to HTTP and HTTPS. Also, a similar implementation for checklink.pl, using $ua->protocols_allowed() would clean it up somewhat. Quick comments about the new patches: - check-html.patch: HTML related fixes. - check-paths.patch: Portability patch. - check-protocols.patch: Cleaner and better supported/allowed protocol handling. - checklink-options.patch: Improved command line option handling. - checklink-cleanup.patch: General cleanups and HTML fixes. - also available from http://cachalot.ods.org/patches/w3c-validator/ and http://koti.welho.com/vskytta/patches/w3c-validator/ Then, a few comments to the TODO list at http://validator.w3.org:8001/todo.html [0.6.0, 1.] - I tried the absolute URI fixup a few months ago to make validator easier to install locally, and found it generally possible, but the currently used SSI's are a problem. Also the http://foo/check/referer needs to be taken care of (at least if there is a "Revalidate" link on the page). [Misc, 25,33.] - SSL/TLS and the error count seem to be already covered by recent updates. [Misc, new entry] - Check if this needs a fix: http://lists.w3.org/Archives/Public/www-validator/2002Mar/0055.html Cheers, -- Ville Skyttä ville.skytta@iki.fi Index: httpd/cgi-bin/check =================================================================== RCS file: /sources/public/validator/httpd/cgi-bin/check,v retrieving revision 1.200.2.17 diff -a -u -r1.200.2.17 check --- httpd/cgi-bin/check 2002/03/31 07:04:07 1.200.2.17 +++ httpd/cgi-bin/check 2002/03/31 16:49:39 @@ -100,7 +100,8 @@ $File->{'Footer'} = &prepSSI({ File => $CFG->{'Footer'}, Date => q$Date: 2002/03/31 07:04:07 $, - }); + }) + . " </body>\n</html>\n"; # # Prepare standard HTML preamble for output. @@ -277,16 +278,18 @@ # Print different things if we got redirected or had a file upload. #if (URI::eq("$File->{Opt}->{URI}", $q->param('uri'))) { # @@@FIXME@@@: Temorarily Broken. :-) if (TRUE) { # @@ Need to stringify here? + my $size = (length($File->{Opt}->{URI}) || 38) + 2; + $size = 70 if $size > 70; &add_table($File, qq(<label title="Address of Page to Validate" for="uri">Address</label>), - '<input type="text" id="uri" name="uri" size="' - . (length($File->{Opt}->{URI}) + 2) + '<input type="text" id="uri" name="uri" size="' . $size . '" value="' . $File->{Opt}->{URI} . '" />'); } elsif ($q->param('uploaded_file')) { &add_table($File, "File", $File->{Opt}->{URI}); } else { + my $size = (length($File->{Opt}->{URI}) || 38) + 2; + $size = 70 if $size > 70; &add_table($File, qq(<label title="Address of Page to Validate" for="uri"><a href="http://www.w3.org/Addressing/#terms">URI</a></label>), - '<input type="text" id="uri" name="uri" size="' - . (length($File->{Opt}->{URI}) + 2) + '<input type="text" id="uri" name="uri" size="' . $size . '" value="' . $File->{Opt}->{URI} . '" />' . ' [<a href="' . $File->{Opt}->{URI} . '">Go to URI</a>]'); &add_warning($File, '<em>Note:</em> The URI you gave me, «<code>' . @@ -300,7 +303,7 @@ &add_table($File, "Size", $File->{Size}) if $File->{Size}; unless ($File->{Opt}->{'Is Upload'}) { &add_table($File, - qq(<label title="Character Encoding" for="charset">Character Encoding</label>), + qq(<label title="Character Encoding" for="charset">Character Encoding</label>), $File->{Use_Charset} . ' ' . &CGI::popup_menu(-name => 'charset', -id => 'charset', -values => [ @@ -378,7 +381,7 @@ To assure correct validation, processing, and display, it is important that the character encoding is properly labeled. - <a href='http://www.w3.org/International/O-charset.html'>Further + <a href="http://www.w3.org/International/O-charset.html">Further explanations</a>. EOHD $File->{Tentative} |= T_CHARSET_KLUDGE; # WOuld be T_WARN, but the complaints... @@ -489,7 +492,7 @@ } my @cmd = ($CFG->{'SGML Parser'}, '-c', $catalog, '-E0', @xmlflags); -&add_table($File, "commandline", "<code>@cmd</code>") if $DEBUG; +&add_table($File, "Command Line", "<code>@cmd</code>") if $DEBUG; # # Temporary filehandles. @@ -553,7 +556,6 @@ my $fpi; -$File->{Version} = 'unknown'; if ($File->{Type} eq 'xhtml' or $File->{Type} eq 'mathml' or $File->{Type} eq 'svg' or $File->{Type} eq 'smil') { $fpi = $File->{DOCTYPE}; } elsif ($File->{Type} eq 'xml') { @@ -569,7 +571,7 @@ } $File->{Version} = $CFG->{'FPI to Text'}->{$fpi} || 'unknown'; -&add_table($File, "Document Type", +&add_table($File, "<label for=\"doctype\">Document Type</label>", $File->{Version} . ' ' . &CGI::popup_menu( -name => 'doctype', @@ -594,25 +596,25 @@ &add_warning($File, "Unknown namespace («<code>$File->{Namespace}</code>») for text/html document!"); if ($File->{Namespace} ne '') { &add_table($File, "Root Namespace", - "<a href='$File->{Namespace}'>$File->{Namespace}</a>"); + "<a href=\"$File->{Namespace}\">$File->{Namespace}</a>"); } } elsif ($File->{Type} eq 'svg' and $File->{Namespace} ne 'http://www.w3.org/2000/svg') { &add_warning($File, "Unknown namespace («<code>$File->{Namespace}</code>») for SVG document!"); if ($File->{Namespace} ne '') { &add_table($File, "Root Namespace", - "<a href='$File->{Namespace}'>$File->{Namespace}</a>"); + "<a href=\"$File->{Namespace}\">$File->{Namespace}</a>"); } } else { if ($File->{Namespace} ne '') { &add_table($File, "Root Namespace", - "<a href='$File->{Namespace}'>$File->{Namespace}</a>"); + "<a href=\"$File->{Namespace}\">$File->{Namespace}</a>"); } } if (scalar keys %{$File->{Namespaces}} > 1) { my $namespaces = "<ul>"; for (keys %{$File->{Namespaces}}) { - $namespaces .= "\t<li><a href='$_'>$_</a></li>\n" + $namespaces .= "\t<li><a href=\"$_\">$_</a></li>\n" unless $_ eq $File->{Namespace}; # Don't repeat Root Namespace. } &add_table($File, "Other Namespaces", $namespaces . "</ul>"); @@ -704,13 +706,13 @@ my $tableEntry; unless ($File->{Opt}->{URI} =~ m(^upload://)) { add_table($File, '<input type="submit" value="Revalidate" />', - ' <label title="Show Page Source"><input type="checkbox" value="" name="ss"' . + ' <label class="nowrap" title="Show Page Source"><input type="checkbox" value="" name="ss"' . ($File->{Opt}->{'Show Source'} ? 'checked="checked"' : '') . " />Show Source</label>\n" . - ' <label title="Show an Outline of the document"><input type="checkbox" value="" name="outline"' . + ' <label class="nowrap" title="Show an Outline of the document"><input type="checkbox" value="" name="outline"' . ($File->{Opt}->{'Outline'} ? 'checked="checked"' : '') . " />Outline</label>\n" . - ' <label title="Show Parse Tree"><input type="checkbox" value="" name="sp"' . + ' <label class="nowrap" title="Show Parse Tree"><input type="checkbox" value="" name="sp"' . ($File->{Opt}->{'Show Parsetree'} ? 'checked="checked"' : '') . " />Parse Tree</label>\n" . - ' <label title="Exclude Attributes from Parse Tree"><input type="checkbox" value="" name="noatt"' . + ' <label class="nowrap" title="Exclude Attributes from Parse Tree"><input type="checkbox" value="" name="noatt"' . ($File->{Opt}->{'No Attributes'} ? 'checked="checked"' : '') . " />...no attributes</label>\n" ); } @@ -718,11 +720,11 @@ unless $File->{Opt}->{URI} =~ m(^upload://); print qq(<table class="header">\n); unless ($File->{'Is Valid'}) { - &add_table($File, 'Errors', scalar(@{$File->{Errors}}) . ' (aproximate)'); + &add_table($File, 'Errors', scalar(@{$File->{Errors}}) . ' (approximate)'); } for $tableEntry (@{$File->{Table}}) { print " <tr>\n"; - print ' ' x 6, "<th>", $$tableEntry{Head}, ": </th>\n"; + print ' ' x 6, "<th class=\"nowrap\">", $$tableEntry{Head}, ": </th>\n"; print ' ' x 6, "<td>", $$tableEntry{Tail}, "</td>\n"; print " </tr>\n"; } @@ -759,7 +761,7 @@ </p> <pre> - <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN"> + <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"> <html> <head> <title>Title</title> @@ -780,7 +782,7 @@ my $File = shift; my ($dieMessage) = shift; print <<"EOF"; - <hr> + <hr /> <strong class="error">Internal server error ($dieMessage).</strong> Please contact <a href="mailto:$CFG->{Maintainer}">maintainer</a>. EOF @@ -836,7 +838,7 @@ <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd"> -<html> +<html lang="en"> <head><title>401 Authorization Required</title></head> <body> <h1>Authorization Required</h1> @@ -1315,7 +1317,7 @@ print qq(<span class="markup">^</span></pre>\n); } print "</ul>\n"; - print "<hr>\n"; + print "<hr />\n"; if ($File->{Version} eq 'unknown') { print "<p>Sorry, I can't validate this document.</p>"; } elsif ($File->{Type} eq 'xml') { @@ -1396,7 +1398,7 @@ if (defined $image_uri) { print <<"EOHD"; <p> - <img src="$image_uri" alt="$alttext"$gifhw> Congratulations, this + <img src="$image_uri" alt="$alttext"$gifhw /> Congratulations, this document validates as $File->{Version}! </p> @@ -1470,7 +1472,7 @@ print <<'EOF'; <div id="outline" class="mtb"> - <hr> + <hr /> <h2><a name="outline">Outline</a></h2> <p> Below is an outline for this document, automatically generated from the @@ -1538,7 +1540,7 @@ print <<'EOF'; <div id="source" class="mtb"> - <hr> + <hr /> <h2><a name="source">Source Listing</a></h2> <p>Below is the source input I used for this validation:</p> @@ -1560,7 +1562,7 @@ print <<'EOF'; <div id="parse" class="mtb"> - <hr> + <hr /> <h2><a name="parse">Parse Tree</a></h2> EOF if ($File->{Opt}->{'No Attributes'}) { @@ -1687,7 +1689,7 @@ sub show_esis ($) { print <<'EOF'; <div id="raw_esis" class="mtb"> - <hr> + <hr /> <h2><a name="raw_esis">Raw ESIS Output</a></h2> <pre> EOF @@ -1704,7 +1706,7 @@ sub show_errors ($) { print <<'EOF'; <div id="raw_errors" class="mtb"> - <hr> + <hr /> <h2><a name="raw_errors">Raw Error Output</a></h2> <pre> EOF Index: htdocs/base.css =================================================================== RCS file: /sources/public/validator/htdocs/base.css,v retrieving revision 1.20.2.5 diff -a -u -r1.20.2.5 base.css --- htdocs/base.css 2002/03/31 07:04:07 1.20.2.5 +++ htdocs/base.css 2002/03/31 16:49:39 @@ -45,6 +45,10 @@ .hideme {display: none} +.nowrap { + white-space: nowrap; +} + #Notice { width: 75ex; border: solid; @@ -139,7 +143,7 @@ } /* Various header(ish) things. Definitions cribbed from the CORE Styles. */ -h1 { +h1 { font-family: Tahoma, Verdana, Myriad Web, Syntax, sans-serif; font-size-adjust: .53; font-size: 2em; @@ -151,8 +155,8 @@ text-transform: none; color: #005A9C; } - -h2 { + +h2 { font-family: Tahoma, Verdana, Myriad Web, Syntax, sans-serif; font-size-adjust: .53; font-size: 1.75em; @@ -162,9 +166,9 @@ word-spacing: normal; letter-spacing: normal; text-transform: none; - } - -h3 { + } + +h3 { font-family: Tahoma, Verdana, Myriad Web, Syntax, sans-serif; font-size-adjust: .53; font-size: 1.58em; @@ -174,9 +178,9 @@ word-spacing: normal; letter-spacing: normal; text-transform: none; - } - -h4 { + } + +h4 { font-family: Tahoma, Verdana, Myriad Web, Syntax, sans-serif; font-size-adjust: .53; font-size: 1.33em; @@ -186,9 +190,9 @@ word-spacing: normal; letter-spacing: normal; text-transform: none; - } - -h5, dt { + } + +h5, dt { font-family: Tahoma, Verdana, Myriad Web, Syntax, sans-serif; font-size-adjust: .53; font-size: 1.17em; @@ -199,9 +203,9 @@ letter-spacing: normal; text-transform: none; margin-top: 1em; - } - -h6 { + } + +h6 { font-family: Tahoma, Verdana, Myriad Web, Syntax, sans-serif; font-size-adjust: .53; font-size: 1em; @@ -211,6 +215,6 @@ word-spacing: normal; letter-spacing: normal; text-transform: none; - } + } Index: httpd/cgi-bin/check =================================================================== RCS file: /sources/public/validator/httpd/cgi-bin/check,v retrieving revision 1.200.2.17 diff -a -u -r1.200.2.17 check --- httpd/cgi-bin/check 2002/03/31 07:04:07 1.200.2.17 +++ httpd/cgi-bin/check 2002/03/31 13:52:37 @@ -34,8 +34,8 @@ use IO::File; use Text::Iconv; # on debian: apt-get install libtext-iconv-perl use HTML::Parser 3.25; # Need 3.25 for $p->ignore_elements. +use File::Spec (); - ############################################################################### #### Constant definitions. #################################################### ############################################################################### @@ -454,32 +454,33 @@ } my @xmlflags = '-wnon-sgml-char-ref'; -my $catalog = $CFG->{'SGML Library'} . '/catalog'; +my $catalog = File::Spec->catfile($CFG->{'SGML Library'}, 'catalog'); if ($File->{Type} eq 'xhtml') { - $catalog = $CFG->{'SGML Library'} . '/xhtml.soc'; + $catalog = File::Spec->catfile($CFG->{'SGML Library'}, 'xhtml.soc'); $ENV{SP_CHARSET_FIXED} = 'YES'; $ENV{SP_ENCODING} = 'UTF-8'; @xmlflags = '-wxml'; } elsif ($File->{Type} eq 'svg') { - $catalog = $CFG->{'SGML Library'} . '/svg.soc'; + $catalog = File::Spec->catfile($CFG->{'SGML Library'}, 'svg.soc'); $ENV{SP_CHARSET_FIXED} = 'YES'; $ENV{SP_ENCODING} = 'UTF-8'; @xmlflags = '-wxml'; } elsif ($File->{Type} eq 'smil') { - $catalog = $CFG->{'SGML Library'} . '/smil.soc'; + $catalog = File::Spec->catfile($CFG->{'SGML Library'}, 'smil.soc'); $ENV{SP_CHARSET_FIXED} = 'YES'; $ENV{SP_ENCODING} = 'UTF-8'; @xmlflags = '-wxml'; } elsif ($File->{Type} eq 'mathml') { - $catalog = $CFG->{'SGML Library'} . '/mathml.soc'; + $catalog = File::Spec->catfile($CFG->{'SGML Library'}, 'mathml.soc'); $ENV{SP_CHARSET_FIXED} = 'NO'; $ENV{SP_ENCODING} = 'XML'; @xmlflags = '-wxml'; } elsif ($File->{Type} eq 'xml' or $File->{Namespace}) { # no doctype, with xmlns attr on 1st element $File->{Type} = 'xml'; # @@ probably a better way to do this - $catalog = $CFG->{'SGML Library'} . '/sp-1.3/pubtext/xml.soc'; + $catalog = File::Spec->catfile($CFG->{'SGML Library'}, + 'sp-1.3', 'pubtext', 'xml.soc'); $ENV{SP_CHARSET_FIXED} = 'YES'; $ENV{SP_ENCODING} = 'XML'; @xmlflags = '-wxml'; Index: httpd/cgi-bin/check =================================================================== RCS file: /sources/public/validator/httpd/cgi-bin/check,v retrieving revision 1.200.2.17 diff -a -u -r1.200.2.17 check --- httpd/cgi-bin/check 2002/03/31 07:04:07 1.200.2.17 +++ httpd/cgi-bin/check 2002/03/31 15:45:33 @@ -24,7 +24,8 @@ ############################################################################### use strict; -use LWP::UserAgent; +use LWP::UserAgent 1.90 (); # Need 1.90 for protocols_(allowed|forbidden) +use HTTP::Request (); use URI; use URI::Escape; use CGI::Carp; @@ -147,13 +148,7 @@ } elsif ($q->param('fragment')) { $File = &handle_frag($q, $File); } elsif ($q->param('uri')) { - my $rejected = &uri_rejected($File); - if ($rejected) { - $File->{'Error Flagged'} = TRUE; - $File->{'Error Message'} = $rejected; - } else { - $File = &handle_uri($q, $File); - } + $File = &handle_uri($q, $File); } # @@ -186,7 +181,7 @@ } # -# Abort if an error is flagged. (only from &uri_rejected() so far). +# Abort if an error is flagged. (only from &handle_uri() so far). if ($File->{'Error Flagged'}) { print $File->{'Results'}; print $File->{'Error Message'}; @@ -984,9 +979,14 @@ next if /^\s*$/; next if /^\s*\#/; chomp; - my($k, $v) = split /\t+/, $_; + my($k, $v) = split(/\t+/, $_, 2); $v = '' unless defined $v; - $v = &read_cfg($v) if $v =~ s(^file://){}; + if ($v =~ s(^file://){}) { + $v = &read_cfg($v); + } elsif ($v =~ /,/) { + my @vals = split(/,/, $v); + $v = \@vals; + } $cfg{$k} = $v; } undef $fh; @@ -1002,11 +1002,25 @@ my $q = shift; # The CGI object. my $File = shift; # The master datastructure. - my $uri = $q->param('uri'); # The URI to fetch. + my $uri = URI->new($q->param('uri')); # The URI to fetch. my $ua = new LWP::UserAgent; $ua->agent("W3C_Validator/$VERSION " . $ua->agent); $ua->parse_head(0); # Parse the http-equiv stuff ourselves. @@ Why? + + # @@@FIXME@@@: + # Disable checking if the URI is local (or private) for security reasons, + # or at least make it configurable to do so. + # eg. /^(localhost(\.localdomain)?|127\..+)$/ (+ private networks) + # + $ua->protocols_allowed($CFG->{'Allowed Protocols'} || ['http', 'https']); + + if (! $ua->is_protocol_supported($uri)) { + $File->{'Error Flagged'} = TRUE; + $File->{'Error Message'} = &uri_rejected($uri); + return $File; + } + my $req = new HTTP::Request(GET => $uri); # If we got a Authorization header, the client is back at it after being @@ -1022,7 +1036,8 @@ &authenticate($File, $res->request->url, $res->www_authenticate); } else { print $File->{Results}; - &print_unknown_http_error_message($uri, $res->code, $res->message); + &print_unknown_http_error_message($uri->as_string, $res->code, + $res->message); } print $File->{'Footer'}; exit; @@ -1774,38 +1789,13 @@ } -# -# Check if the URI looks like one we want to retrieve. # -# Returns FALSE if URI is ok, or an error message suitable for output. -# -# @@@FIXME@@@: -# Disable checking if the URI is local (or private) for security reasons, -# or at least make it configurable to do so. -# eg. /^(localhost(\.localdomain)?|127\.)$/ (+ private networks) -# +# Output errors for a rejected URI. sub uri_rejected { - my $File = shift; - my $uri = URI->new($File->{Opt}->{URI}); - my $scheme = $uri->scheme(); - - my($handler, $ret); - - # - # If we don't have an SSL implementation for use with LWP, CGI::Carp bombs - # here and the script dies. So, we need to "local"ize $SIG{__DIE__} here. - eval {local $SIG{__DIE__}; $handler = LWP::Protocol::implementor($scheme)}; - - # No implementor for the scheme, or a (disabled) "file://" URI. - if ($@ or $scheme eq 'file' or $handler eq 'LWP::Protocol::file') { - undef $handler; - } + my $uri = shift; + my $scheme = $uri ? $uri->scheme() || 'undefined' : 'undefined'; - if (defined $handler) { - return FALSE; - } else { - $scheme = $scheme || 'undefined'; - $ret = <<"EOF"; + return <<"EOF"; <div class="error"> <p> Sorry, this type of <a @@ -1824,6 +1814,4 @@ </p> </div> EOF - return $ret; - } } Index: htdocs/config/check.cfg =================================================================== RCS file: /sources/public/validator/htdocs/config/Attic/check.cfg,v retrieving revision 1.1.2.4 diff -a -u -r1.1.2.4 check.cfg --- htdocs/config/check.cfg 2002/03/30 22:45:45 1.1.2.4 +++ htdocs/config/check.cfg 2002/03/31 15:45:33 @@ -58,3 +58,7 @@ Header /usr/local/validator/htdocs/header.html Footer /usr/local/validator/htdocs/footer.html + +# +# Allowed protocols, comma-separated. +Allowed Protocols http,https Index: httpd/cgi-bin/checklink.pl =================================================================== RCS file: /sources/public/validator/httpd/cgi-bin/checklink.pl,v retrieving revision 2.89 diff -a -u -r2.89 checklink.pl --- httpd/cgi-bin/checklink.pl 2002/02/01 21:29:09 2.89 +++ httpd/cgi-bin/checklink.pl 2002/03/31 20:14:22 @@ -1,4 +1,4 @@ -#! /usr/bin/perl -w +#!/usr/bin/perl -w # # W3C Link Checker # by Hugo Haas <hugo@w3.org> @@ -104,7 +104,7 @@ printf("\n%s\n", &global_stats()); } } else { - use CGI; + use CGI (); use CGI::Carp qw(fatalsToBrowser); $query = new CGI; # Set a few parameters in CGI mode @@ -272,7 +272,7 @@ for example, it would be: http://www.w3.org/TR/html4/ -n/--noacclanguage Do not send an Accept-Language header. - -L/--languages Languages accepted (default: '$_languages'). + -L/--languages Languages accepted (default: '$_languages'). -q/--quiet No output if no errors are found. -v/--verbose Verbose mode. -i/--indicator Show progress while parsing. @@ -301,7 +301,7 @@ sub ask_password() { print(STDERR 'Enter the password for user '.$_user.': '); - # Will only work on Unix... + # Will only work on Unix... Term::ReadKey from CPAN would be better. system('stty -echo'); chomp($_password = <STDIN>); system('stty echo'); @@ -315,7 +315,7 @@ ########################################### sub urize() { - use URI; + use URI (); $_ = URI::Escape::uri_unescape($_[0]); my $base; my $res = $_; @@ -375,7 +375,7 @@ if ($_html) { print("</h2>\n"); if (! $_summary) { - use URI; + use URI::Escape (); printf("<p>Go to <a href='#%s'>the results</a>.</p>\n", $result_anchor); printf("<p>Check also: @@ -393,7 +393,7 @@ $processed{$absolute_uri} = 1; # Parse the document my $p = &parse_document($uri, $absolute_uri, - $response->content(), 1, + $response->content(), 1, $depth != 0); my $base = URI->new($p->{base}); @@ -689,7 +689,7 @@ sub W3C::UserAgent::redirect_ok { my ($self, $request) = @_; - + if (! ($_summary || (!$doc_count && $_html))) { &hprintf("\n%s %s ", $request->method(), $request->uri()); } @@ -775,7 +775,7 @@ print "\n"; } return &get_uri($method, $response->request->url, - $start, $ua->{Redirects}, + $start, $ua->{Redirects}, $code, $realm, $message, 1); } # Record the redirects @@ -1077,7 +1077,6 @@ ################################ sub check_validity() { - use HTTP::Status; my ($testing, $uri, $links, $redirects) = @_; # $testing is the URI of the document checked # $uri is the URI of the target that we are verifying @@ -1163,10 +1162,10 @@ } sub escape_match($, \%) { - use URI::Escape; - my ($a, $hash) = (uri_unescape($_[0]), $_[1]); + use URI::Escape (); + my ($a, $hash) = (URI::Escape::uri_unescape($_[0]), $_[1]); foreach $b (keys %$hash) { - if ($a eq uri_unescape($b)) { + if ($a eq URI::Escape::uri_unescape($b)) { return(1); } } @@ -1181,15 +1180,15 @@ my $r = $_[0]; $r->headers->www_authenticate =~ /Basic realm=\"([^\"]+)\"/; my $realm = $1; - my $authHeader = $r->headers->www_authenticate; + my $authHeader = $r->headers->www_authenticate; if ($_cl) { printf(STDERR "\nAuthentication is required for %s.\n", $r->request->url); printf(STDERR "The realm is %s.\n", $realm); print(STDERR "Use the -u and -p options to specify a username and password.\n"); } else { - printf("Status: 401 Authorization Required\nWWW-Authenticate: %s\nConnection: close\nContent-Type: text/html\n\n", $r->headers->www_authenticate); + printf("Status: 401 Authorization Required\nWWW-Authenticate: %s\nConnection: close\nContent-Language: en\nContent-Type: text/html\n\n", $r->headers->www_authenticate); printf("<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"> -<html> +<html lang=\"en\"> <head> <title>401 Authorization Required</title> </head> @@ -1207,7 +1206,7 @@ ################## sub get_timestamp() { - use Time::HiRes; + use Time::HiRes (); return pack('LL', Time::HiRes::gettimeofday()); } @@ -1359,7 +1358,7 @@ $whattodo = 'You must change this link: people using a browser without Javascript support will <em>not</em> be able to follow this link. See the -<a href="http://www.w3.org/TR/1999/WAI-WEBCONTENT-19990505/#tech-scripts">Web +<a href="http://www.w3.org/TR/1999/WAI-WEBCONTENT-19990505/#tech-scripts">Web Content Accessibility Guidelines on the use of scripting on the Web</a> and the <a href="http://www.w3.org/TR/WCAG10-HTML-TECHS/#directly-accessible-scripts">techniques @@ -1409,7 +1408,7 @@ } printf(" <dt%s>%s</dt> -<dd>What to do: <strong%s>%s</strong>%s<br> +<dd>What to do: <strong%s>%s</strong>%s<br></dd> <dd>HTTP Code returned: %d%s<br> HTTP Message: %s%s%s</dd> <dd>Lines: %s</dd>\n", @@ -1439,7 +1438,7 @@ # HTTP original message defined($results->{$u}{location}{orig_message}) ? &encode($results->{$u}{location}{orig_message}). - ' <span title="redirected to">-></span> ' + ' <span title="redirected to">-></span> ' : '', # HTTP final message $http_message, @@ -1504,7 +1503,7 @@ } } } - # End of the table + # End of the table if ($_html) { print("</dl>\n"); } @@ -1663,7 +1662,7 @@ my $stop = &get_timestamp(); return sprintf("Checked %d document(s) in %ss.", ($doc_count<=$_max_documents? $doc_count : $_max_documents), - &time_diff($timestamp, $stop)); + &time_diff($timestamp, $stop)); } ################## @@ -1683,7 +1682,7 @@ print " <!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"> -<html> +<html lang=\"en\"> <head> <title>W3C".$title."</title> <style type=\"text/css\"> @@ -1694,8 +1693,8 @@ background: white; } -pre { - font-family: monospace +pre, code, tt { + font-family: monospace; } img { @@ -1730,7 +1729,7 @@ </style> </head> <body> -<a href=\"http://www.w3.org/\"><img alt=\"W3C\" src=\"http://www.w3.org/Icons/w3c_home\" height=\"48\" width=\"72\"></a> +<p><a href=\"http://www.w3.org/\" title=\"W3C\"><img alt=\"W3C\" src=\"http://www.w3.org/Icons/w3c_home\" height=\"48\" width=\"72\"></a></p> <h1>W3C<sup>®</sup>".$title."</h1> \n"; } @@ -1783,9 +1782,8 @@ <a href=\"http://www.w3.org/2000/07/checklink\">documentation</a>. Download the <a href=\"http://dev.w3.org/cvsweb/~checkout~/validator/httpd/cgi-bin/checklink.pl?rev=".$REVISION."&content-type=text/plain\">source -code</a> from the -<a href=\"http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/checklink.pl\">CVS -log</a>. +code</a> from +<a href=\"http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/checklink.pl\">CVS</a>. </address> </body> </html> @@ -1806,22 +1804,22 @@ my ($q) = @_; &html_header('', 1); print "<form action=\"".$q->self_url()."\" method=\"get\"> -<p>Enter the address (<a href='http://www.w3.org/Addressing/#terms'>URL</a>) -of a document that you would like to check:</p> -<p><input type=\"text\" size=\"50\" name=\"uri\"></p> +<p><label for=\"uri\">Enter the address (<a href='http://www.w3.org/Addressing/#terms'>URL</a>) +of a document that you would like to check:</label></p> +<p><input type=\"text\" size=\"50\" id=\"uri\" name=\"uri\"></p> <p>Options:</p> <p> - <input type=\"checkbox\" name=\"summary\"> Summary only + <label><input type=\"checkbox\" name=\"summary\"> Summary only</label> <br> - <input type=\"checkbox\" name=\"hide_redirects\"> Hide redirects + <label><input type=\"checkbox\" name=\"hide_redirects\"> Hide redirects</label> <br> - <input type=\"checkbox\" name=\"no_accept_language\"> Don't send <tt>Accept-Language</tt> headers. + <label><input type=\"checkbox\" name=\"no_accept_language\"> Don't send <tt>Accept-Language</tt> headers</label> <br> - <input type=\"checkbox\" name=\"hide_dir_redirects\"> Hide directory redirects + <label><input type=\"checkbox\" name=\"hide_dir_redirects\"> Hide directory redirects</label> <br> - <input type=\"checkbox\" name=\"recursive\"> Check linked documents recursively <small>(maximum: $_max_documents documents; sleeping $_sleep_time seconds between each document)</small> + <label><input type=\"checkbox\" name=\"recursive\"> Check linked documents recursively <small>(maximum: $_max_documents documents; sleeping $_sleep_time seconds between each document)</small></label> <br> - Depth of the recursion: <input type=\"text\" size=\"3\" name=\"depth\"><small>(-1 is the default and means unlimited)</small> + <label>Depth of the recursion: <input type=\"text\" size=\"3\" name=\"depth\"><small>(-1 is the default and means unlimited)</small></label> </p> <p><input type=\"submit\" name=\"submit\" value=\"Check\"></p> </form> Index: httpd/cgi-bin/checklink.pl =================================================================== RCS file: /sources/public/validator/httpd/cgi-bin/checklink.pl,v retrieving revision 2.89 diff -a -u -r2.89 checklink.pl --- httpd/cgi-bin/checklink.pl 2002/02/01 21:29:09 2.89 +++ httpd/cgi-bin/checklink.pl 2002/03/29 17:32:50 @@ -84,12 +84,12 @@ if ($#ARGV >= 0) { $_cl = 1; # Parse command line - my @uris = &parse_arguments(); + &parse_arguments(); if ($_user && (! $_password)) { &ask_password(); } my $uri; - foreach $uri (@uris) { + foreach $uri (@ARGV) { if (!$_summary) { printf("%s %s\n", $PROGRAM ,$VERSION) if (! $_html); } else { @@ -163,95 +163,44 @@ ################################ sub parse_arguments() { - my @uris; - my $uris = 0; - while (@ARGV) { - $_ = shift(@ARGV); - if ($uris) { - push(@uris, $_); - } elsif (m/^--$/) { - $uris = 1; - } elsif (m/^-[^-DupytdlL]/) { - if (m/q/) { - $_quiet = 1; - $_summary = 1; - } - if (m/s/) { - $_summary = 1; - } - if (m/b/) { - $_redirects = 0; - } - if (m/e/) { - $_dir_redirects = 0; - } - if (m/v/) { - $_verbose = 1; - } - if (m/i/) { - $_progress = 1; - } - if (m/h/) { - $_html = 1; - } - if (m/n/) { - $_accept_language = 0; - } - if (m/r/) { - if ($_depth == 0) { - $_depth = -1; - } - } - } elsif (m/^--help$/) { - &usage(); - } elsif (m/^--quiet$/) { - $_quiet = 1; - } elsif (m/^--summary$/) { - $_summary = 1; - } elsif (m/^--broken$/) { - $_redirects = 0; - } elsif (m/^--dir-redirects$/) { - $_dir_redirects = 0; - } elsif (m/^--verbose$/) { - $_verbose = 1; - } elsif (m/^--indicator$/) { - $_progress = 1; - } elsif (m/^--html$/) { - $_html = 1; - } elsif (m/^--noacclanguage$/) { - $_accept_language = 0; - } elsif (m/^--recursive$/) { - if ($_depth == 0) { - $_depth = -1; - } - } elsif (m/^-l|--location$/) { - $_base_location = shift(@ARGV); - } elsif (m/^-u|--user$/) { - $_user = shift(@ARGV); - } elsif (m/^-p|--password$/) { - $_password = shift(@ARGV); - } elsif (m/^-t|--timeout$/) { - $_timeout = shift(@ARGV); - } elsif (m/^-L|--languages$/) { - $_languages = shift(@ARGV); - } elsif (m/^-D|--depth$/) { - my $value = shift(@ARGV); - $_depth = $value unless($value == 0); - } elsif (m/^-d|--domain$/) { - $_trusted = shift(@ARGV); - } elsif (m/^-y|--proxy$/) { - $_http_proxy = shift(@ARGV); - } elsif (m/^--masquerade$/) { - $_masquerade = 1; - $_local_dir = shift(@ARGV); - $_remote_masqueraded_uri = shift(@ARGV); - } elsif (m/^--hide-same-realm$/) { - $_hide_same_realm = 1; - } else { - push(@uris, $_); - } + + use Getopt::Long 2.17 qw(GetOptions); + Getopt::Long::Configure('no_ignore_case'); + my @masq = (); + + GetOptions('help' => \&usage, + 'q|quiet' => sub { $_quiet = 1; $_summary = 1; }, + 's|summary' => \$_summary, + 'b|broken' => sub { $_redirects = 0; }, + 'e|dir-redirects' => sub { $_dir_redirects = 0; }, + 'v|verbose' => \$_verbose, + 'i|indicator' => \$_progress, + 'h|html' => \$_html, + 'n|noacclanguage' => sub { $_accept_language = 0; }, + 'r|recursive' => sub { $_depth = -1 if $_depth == 0; }, + 'l|location=s' => \$_base_location, + 'u|user=s' => \$_user, + 'p|password=s' => \$_password, + 't|timeout=i' => \$_timeout, + 'L|languages=s' => \$_languages, + 'D|depth=i' => sub { $_depth = $_[1] unless $_[1] == 0; }, + 'd|domain=s' => \$_trusted, + 'y|proxy=s' => \$_http_proxy, + 'masquerade' => \@masq, + 'hide-same-realm' => \$_hide_same_realm, + 'V|version' => \&version, + ); + + if (@masq) { + $_masquerade = 1; + $_local_dir = shift(@masq); + $_remote_masqueraded_uri = shift(@masq); } - return(@uris); +} + +sub version() { + print STDERR "$PROGRAM $VERSION\n"; + exit(0); } sub usage() { @@ -278,7 +227,7 @@ -i/--indicator Show progress while parsing. -u/--user username Specify a username for authentication. -p/--password password Specify a password. - --hide-same-real Hide 401's that are in the same realm as the + --hide-same-realm Hide 401's that are in the same realm as the document checked. -t/--timeout value Timeout for the HTTP requests. -d/--domain domain Regular expression describing the domain to @@ -290,6 +239,7 @@ -y/--proxy proxy Specify an HTTP proxy server. -h/--html HTML output. --help Show this message. + -V/--version Output version information. Documentation at: http://www.w3.org/2000/07/checklink Please send bug reports and comments to the www-validator mailing list:Received on Sunday, 31 March 2002 16:41:03 UTC
This archive was generated by hypermail 2.3.1 : Tuesday, 1 March 2016 14:17:32 UTC