Re: about relative URLs

  Hi Dave,

> Is the following a valid relative HTTP URL?
> 	http:/mypath/index.html

No.

> The intended meaning is to GET the resource "/mypath/index.html" from the
> current server (the one from which the page with this link was retrieved).
> 
> There has been some discussion regarding Mozilla
> (<http://bugzilla.mozilla.org/show_bug.cgi?id=34648>), which gets confused by
> this URL.  (That is, it doesn't do what I intended.)  RFC 1808 (Relative URLs)
> accepts them syntactically, then gives "an example algorithm" that would reject
> them (sect. 4):
[snip]
> RFC 2396, Sect. 5, implies that the example is not relative, via the negation of
> "A relative reference that does not begin with a scheme name or a slash
> character is termed a relative-path reference."  Section 5.2 more explicitly
> says (like RFC 1808):
> =======
>    3) If the scheme component is defined, indicating that the reference
>       starts with a scheme name, then the reference is interpreted as an
>       absolute URI and we are done.  Otherwise, the reference URI's
>       scheme is inherited from the base URI's scheme component.
> =======

The paragraph immediately after that is instructive:

      Due to a loophole in prior specifications [RFC1630], some parsers
      allow the scheme name to be present in a relative URI if it is the
      same as the base URI scheme.  Unfortunately, this can conflict
      with the correct parsing of non-hierarchical URI.  For backwards
      compatibility, an implementation may work around such references
      by removing the scheme if it matches that of the base URI and the
      scheme is known to always use the <hier_part> syntax.  The parser
      can then continue with the steps below for the remainder of the
      reference components.  Validating parsers should mark such a
      misformed relative reference as an error.
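
For what it's worth, the workaround described there could look roughly
like the Python sketch below. The function names, the `validating` flag
and the set of schemes treated as always hierarchical are my own
assumptions for illustration, not anything the RFC prescribes; the extra
check for a leading "//" is also my addition, so that ordinary absolute
URIs are left alone.

```python
import re

# Scheme syntax per the generic URI grammar: a letter, then letters,
# digits, "+", "." or "-", terminated by ":".
_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")

# Assumption: schemes this sketch treats as always using <hier_part>.
HIERARCHICAL_SCHEMES = {"http", "https", "ftp"}


def scheme_of(uri):
    """Return the lower-cased scheme of uri, or None if it has none."""
    match = _SCHEME_RE.match(uri)
    return match.group(1).lower() if match else None


def strip_redundant_scheme(reference, base, validating=False):
    """Drop a scheme that merely repeats the base URI's scheme.

    The result can then go through the normal relative-resolution
    steps. With validating=True such a reference is reported as an
    error instead of being repaired.
    """
    scheme = scheme_of(reference)
    if scheme is None or scheme != scheme_of(base):
        return reference  # no scheme, or a different one: leave as is
    if scheme not in HIERARCHICAL_SCHEMES:
        return reference  # could be a non-hierarchical URI
    rest = reference[len(scheme) + 1:]
    if rest.startswith("//"):
        return reference  # has an authority part: a normal absolute URI
    if validating:
        raise ValueError("scheme present in relative reference: " + reference)
    return rest
```

With a base of http://example.org/dir/page.html, the reference
http:/mypath/index.html comes back as /mypath/index.html, which then
resolves against the base server in the usual way.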

> I guess I haven't been paying attention.  I have been using URLs like the
> example for years, and they have been accepted (by liberal programs?). 
> Furthermore, I don't see why the RFC so specifically declares all URLs with a
> scheme to be absolute (except that Sect. 4 says "[t]he syntax for relative URI
> is a shortened form of that for absolute URI, where some prefix of the URI is
> missing...".  It seems to me that an otherwise relative URL with an explicit
> scheme can be parsed unambiguously.

One of the authors will have to give you the details of the thinking
behind this, but <scheme>:<abs_path> (as in http:/mypath/index.html)
is a valid absolute URI (not for http, but in general for URIs
following the generic syntax), and hence a generic syntax parser would
be unable to distinguish between relative and absolute URIs. There's no
reason to allow relative URIs of the form <scheme>:<abs_path> (other
than backwards compatibility), but there is for absolute URIs (e.g. the
file scheme doesn't need any host part).
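
To illustrate the ambiguity, here is a small Python sketch using the
standard library's urllib.parse.urlsplit as a stand-in for a generic
syntax parser. The helper name generic_shape and the file:/etc/hosts
example are mine, chosen only for illustration.

```python
from urllib.parse import urlsplit


def generic_shape(uri):
    """Return (scheme, has_authority, path) as a generic parser sees it."""
    parts = urlsplit(uri)
    return (parts.scheme, bool(parts.netloc), parts.path)


# Both have the form <scheme>:<abs_path>: a scheme, no authority, and
# an absolute path. Nothing in the syntax marks one of them as relative.
for uri in ("http:/mypath/index.html", "file:/etc/hosts"):
    print(uri, "->", generic_shape(uri))
```

Only knowledge about the particular scheme, which a generic parser does
not have, would let it treat the first differently from the second.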


  Cheers,

  Ronald

Received on Tuesday, 13 March 2001 22:36:55 UTC