Re: checklink: base href not taken into account

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Thu, 15 Sep 2011 12:06:31 +0300
Message-ID: <4E71C017.5070303@cs.tut.fi>
To: www-validator@w3.org, charles.greathouse@case.edu

14.9.2011 23:55, Charles Greathouse wrote:

> Checklink appears not to take a document's base URL into account.

I can confirm that there indeed is a bug here. Checklink resolves 
relative URLs using the page URL as the base, irrespective of the use of 
a <base href=...> element. Simple demo:
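
(Not the demo page itself, just a rough illustration of the difference. This is a minimal Python sketch, not Checklink's actual code; the URLs, the LinkCollector class and the resolve_links function are all made up for the example. With honor_base=False it mimics the behavior described above; with honor_base=True it resolves against the <base href> value.)

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect the first <base href> and all <a href> values."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if href is None:
            return
        if tag == "base" and self.base_href is None:
            # Only the first <base href> in a document counts.
            self.base_href = href
        elif tag == "a":
            self.links.append(href)


def resolve_links(page_url, html, honor_base=True):
    """Return absolute URLs for the links in html (illustrative only)."""
    collector = LinkCollector()
    collector.feed(html)
    base = page_url
    if honor_base and collector.base_href:
        # A <base href> value may itself be relative to the page URL.
        base = urljoin(page_url, collector.base_href)
    return [urljoin(base, link) for link in collector.links]


# Made-up page URL and markup, for illustration.
PAGE_URL = "http://example.org/tests/page.html"
HTML = '<base href="http://example.com/docs/"><a href="intro.html">Intro</a>'

print(resolve_links(PAGE_URL, HTML, honor_base=False))
# ['http://example.org/tests/intro.html']
print(resolve_links(PAGE_URL, HTML, honor_base=True))
# ['http://example.com/docs/intro.html']
```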

> I looked through the source and I couldn't find an attempt to handle
> this, so I guess this is a feature recommendation rather than a
> bugfix.  The code has a comment
> # base/@href intentionally not checked
> though this seems to refer to checking the link in <base>  rather than
> using base to get the document's base location.

The comment may relate to the discussion
As it says "The link checker does compute the base URI properly, and 
reports all other tests (including tests related to base URI) properly", 
I suspect that the bug has crept in recently.

Regarding the checking of the URL in <base href=...>, I think it should
be done unless there is a compelling technical reason against it.
Formally, the URL there is never to be used as such (only as a base
when resolving relative URLs). But formally, it is not an error either to
have, say, a link that does not refer to any resource but causes a 4xx or
5xx response (at least part of the time). Link checking is mostly about
practical issues, not formal ones. And practically, it is useful to use a
base URL that works by itself too - one reason for that is that the URL
has documentary value. (When I locally save a web page to study it in
different modifications, I usually slap in a <base href> that refers to
the original address, even though I know that anything past the last "/"
is ignored.)
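
To make the last point concrete, here is a small Python sketch using the standard urljoin function; the example.org URLs are made up. It shows that a base URL naming a file and a base URL naming just its directory resolve a relative reference identically, whereas dropping the trailing "/" changes the result.

```python
from urllib.parse import urljoin

# Hypothetical base URLs, for illustration only.
base_with_file = "http://example.org/articles/page.html"
base_dir_only = "http://example.org/articles/"
base_no_slash = "http://example.org/articles"

relative = "img/logo.png"

# Everything after the last "/" of the base is dropped before merging.
from_file = urljoin(base_with_file, relative)
from_dir = urljoin(base_dir_only, relative)
# Without the trailing slash, "articles" itself is the part that is dropped.
from_no_slash = urljoin(base_no_slash, relative)

print(from_file)      # http://example.org/articles/img/logo.png
print(from_dir)       # http://example.org/articles/img/logo.png
print(from_no_slash)  # http://example.org/img/logo.png
```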

Yucca, http://www.cs.tut.fi/~jkorpela/
Received on Thursday, 15 September 2011 09:07:01 UTC
