Re: Base tag, relative vs. absolute URLs (was: Web archive processing)

In <200201152049.JAA153569@atlas.otago.ac.nz>, "Richard A. O'Keefe" <ok@atlas.otago.ac.nz> writes:
> I wrote:
>         > Also, what should I do to make Tidy complain if a page
>         > + does contain relative URIs,
>         > - does NOT have a <base> element in the <head>?
> 
> HTML Tidy does not confine itself to tidying up HTML.
> It *also* provides accessibility tips.

Relative vs. absolute URLs, or the presence of a <base> element, have 
nothing to do with accessibility or with preventing links from breaking.

On the contrary, the <base> element is usually a bad idea because it 
limits the portability of a page: 

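* Moving the page to another site (even from a development server to a
  production server) or just to another directory breaks every relative
  link, because they all still resolve against the old base.

* A page with an http:// base does not display properly when served over
  https://, since its relative references still resolve to http://.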
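
To illustrate the first point, here is a minimal sketch in Python. The host 
names and paths are made up, and resolve() is a hypothetical helper that only 
approximates what a browser does, using the standard urljoin():

```python
from urllib.parse import urljoin

# Hypothetical production location of a page that was written and tested
# on a development server.
PROD_PAGE = "https://www.example.com/shop/products/index.html"

# A <base href> that was hard-coded while the page lived on the dev server.
HARD_CODED_BASE = "http://dev.example.com/site/products/"


def resolve(page_url, href, base_href=None):
    """Resolve href against <base href> if given, else against the page URL."""
    return urljoin(base_href or page_url, href)


link = "../greatoffer.html"

# Without <base>, the relative link follows the page to its new home.
without_base = resolve(PROD_PAGE, link)

# With the stale <base>, it still points at the development server (and at
# http:// even though the page itself is now served over https://).
with_base = resolve(PROD_PAGE, link, HARD_CODED_BASE)

print(without_base)  # https://www.example.com/shop/greatoffer.html
print(with_base)     # http://dev.example.com/site/greatoffer.html
```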

> It is also a collection of links that haven't broken YET (so a link
> checker probably wouldn't notice any problem), but are just waiting
> for a chance to happen, and that's a usability problem.

Absolute links break as often as relative links -- if the content goes away it 
does not make a difference if you referenced it as ../greatoffer.html or 
http://www.domain.com/greatoffer.html.

> It's the kind of thing Tidy *could* detect quite easily,
> and it's very much in the spirit of things that Tidy *does* detect
> and warn about.

No -- Tidy detects problems in and corrects the structure of markup, not the 
content. Besides, there is no way to tell from a file where it will be hosted.

> Note that I did not ask for Tidy to report this as a fatal error,
> or to change the resulting HTML in any way, simply that I'd like it to
> "complain", meaning to write a warning message (if the user requested it).

If you want to enforce a certain style you could validate against a custom DTD 
requiring a <base> element, or validate attribute values against an XML schema
that only allows absolute URIs.
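
A site-specific check like this can also live in a small script outside Tidy. 
What follows is only a rough sketch, assuming Python's standard html.parser; 
the class name, the check() function and the list of URI-valued attributes 
are my own illustrative choices, not anything Tidy provides. It also does not 
verify that <base> sits inside <head>, and it treats fragment-only references 
such as "#top" as relative:

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Attributes treated as URI-valued; an illustrative subset, not a full list.
URI_ATTRS = {"href", "src", "action", "background", "cite", "longdesc"}


class BaseChecker(HTMLParser):
    """Record whether a <base href> exists and collect relative URIs."""

    def __init__(self):
        super().__init__()
        self.has_base = False
        self.relative = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "base":
            if any(name == "href" and value for name, value in attrs):
                self.has_base = True
            return
        for name, value in attrs:
            # A reference without a scheme is a relative reference.
            if name in URI_ATTRS and value and not urlsplit(value).scheme:
                self.relative.append((tag, name, value))


def check(html_text):
    """Return warnings for relative URIs in a page that lacks <base>."""
    checker = BaseChecker()
    checker.feed(html_text)
    checker.close()
    if checker.has_base:
        return []
    return [
        f"warning: relative URI {value!r} in <{tag} {name}=...> "
        "but no <base> element"
        for tag, name, value in checker.relative
    ]


page = (
    "<html><head><title>Offers</title></head><body>"
    '<a href="../greatoffer.html">relative</a> '
    '<a href="http://www.domain.com/greatoffer.html">absolute</a>'
    "</body></html>"
)
for message in check(page):
    print(message)
```

Only the relative link is reported; the absolute one is accepted as it is.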

There is nothing to complain about in relative URLs or in the lack of a <base>
element.

-- 
Klaus Johannes Rusch
KlausRusch@atmedia.net
http://www.atmedia.net/KlausRusch/

Received on Tuesday, 15 January 2002 17:28:11 UTC