Re: checklink:

From: Roy T. Fielding <fielding@ebuilt.com>
Date: Sat, 13 Oct 2001 20:39:56 -0700
To: Tim Bagot <tsb-w3-validator-0006@earth.li>
Cc: "www-validator@w3.org" <www-validator@w3.org>, uri@w3.org, Bjoern Hoehrmann <derhoermi@gmx.net>
Message-ID: <20011013203956.B1056@waka.ebuilt.net>
On Sat, Oct 13, 2001 at 06:43:09PM +0000, Tim Bagot wrote:
> At 2001-10-13T02:08+0200, Bjoern Hoehrmann wrote:-
> 
> > Hm, checklink relies on URI.pm, which actually implements RFC 1808:
> >
> > [...]
> >    Similarly, parsers must avoid treating "." and ".." as special when
> >    they are not complete components of a relative path.
> >
> >       /./g          = <URL:http://a/./g>
> >       /../g         = <URL:http://a/../g>
> > [...]
> >
> > Note that I may create '..' paths, so http://www.example.org/../ may
> > actually point to a different resource than http://www.example.org/. I
> > can't see anything in RFC 2396 that states such URIs are invalid, and I'm
> > not sure whether that is what I should read into 'considered to be in
> > error'. How would I then create URIs to such resources? Using %2E%2E
> > wouldn't work either, would it?

It would work fine for a browser, but a WWW server is likely to strip it
out and respond with a redirect to protect its own filesystem paths.
Interpreting the http path segments is left entirely to the server software.
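
As a rough illustration of why a server might treat %2E%2E like a literal
"..", the sketch below percent-decodes each path segment the way a server
could before mapping the path onto its filesystem. The helper name
decoded_segments is hypothetical, and real servers differ in whether and
when they decode; this only shows that the two spellings collapse to the
same segment once decoded.

```python
from urllib.parse import unquote, urlsplit


def decoded_segments(uri):
    """Return the path segments of `uri` after percent-decoding each one.

    Hypothetical helper for illustration; not taken from any server.
    """
    path = urlsplit(uri).path
    return [unquote(segment) for segment in path.split("/")]


# The escaped and literal forms decode to the same segments.
print(decoded_segments("http://www.example.org/%2E%2E/"))  # ['', '..', '']
print(decoded_segments("http://www.example.org/../"))      # ['', '..', '']
```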

> Such URIs are perfectly valid; some relative URI references one might
> derive from them are in error. Path segments of "." and ".." are special
> only in relative path references, so absolute URIs and absolute path
> references containing them would be fine. No, %2E%2E would not work, since
> escaping unreserved characters does not alter the semantics.
> 
> Appendix C (Examples of Resolving Relative URI References) of RFC 2396
> actually disagrees with Chapter 5 (Relative URI References), suggesting
> that the correct behaviour is that of RFC 1808, but pointing out that some
> implementations will instead drop those segments. Therefore the behaviour
> specified by RFC 1808 is the only consistent interpretation of RFC 2396
> after all, and so checklink / URI.pm should not be modified. What fun.

It doesn't disagree with section 5 -- it merely points out that not all
implementations are compliant with all aspects of the standard.  A link
checker should respond with a friendly message like "only a complete idiot
would rely on a link like this one".  Whether you make that an error or a
warning message is up to you.  I'd make it configurable.  Link checkers
are expected to be more rigorous than typical client software.
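
A minimal sketch of such a configurable check, in Python rather than
checklink's Perl, might look like the following. The function name
check_dot_segments, the severity values, and the message wording are all
assumptions made for illustration; nothing here is checklink's or URI.pm's
actual behaviour. It flags "." and ".." only when they are complete path
segments, and also catches the percent-encoded spellings.

```python
from urllib.parse import unquote, urlsplit


def check_dot_segments(uri, severity="warning"):
    """Return (severity, message) if the path has '.' or '..' segments, else None.

    `severity` is the configurable part: "warning" or "error".
    """
    if severity not in ("warning", "error"):
        raise ValueError("severity must be 'warning' or 'error'")
    segments = urlsplit(uri).path.split("/")
    # Only complete segments count; "..g" or "g." are ordinary names.
    dots = [s for s in segments if unquote(s) in (".", "..")]
    if not dots:
        return None
    message = (
        f"{uri}: path contains dot segment(s) {dots}; "
        "implementations disagree on how these resolve"
    )
    return (severity, message)


print(check_dot_segments("http://a/../g"))
print(check_dot_segments("http://a/%2E%2E/g", severity="error"))
print(check_dot_segments("http://a/..g"))  # None: not a complete segment
```

Defaulting to a warning keeps the checker usable on existing sites, while
the "error" setting suits stricter runs.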

RFC 1808 should be chucked in the bin now that RFC 2396 replaces it.

....Roy
Received on Monday, 15 October 2001 13:36:10 GMT
