Re: URL-Reference / "empty URL" question

Klaus Weide (kweide@tezcat.com)
Wed, 14 May 1997 13:22:19 -0500 (CDT)


Date: Wed, 14 May 1997 13:22:19 -0500 (CDT)
From: Klaus Weide <kweide@tezcat.com>
To: Larry Masinter <masinter@parc.xerox.com>
cc: uri@bunyip.com
Subject: Re: URL-Reference / "empty URL" question
In-Reply-To: <3379DED3.424F@parc.xerox.com>
Message-ID: <Pine.SUN.3.95.970514115036.19055A-100000@huitzilo.tezcat.com>

On Wed, 14 May 1997, Larry Masinter wrote:

> The way I think of it myself is that when you interact with
> a resource on the network ("http://whatever.com/blah")
> or even a local file ("file:///c|/downloaded/blah") and
> receive some content from that resource and are viewing
> it, the viewer has another implicit resource:
> 
>   "the copy of the content that is being viewed now"
> 
> Let's give it its own URL scheme
>    "this:"
> where the scheme-specific part of "this:" is
> empty. Then what we want to assert is that URL references
> of the form "#xxxx" are *not* relative
> to the BASE at all, they're always relative
> to "this:". That is,
> 
>     <A HREF="#blarg">...</A>
> 
> is equivalent to
> 
>     <A HREF="this:#blarg">...</A>
> 
> and different from
> 
>     <A HREF="file://localhost/download/blah#blarg">...</A>

I find this interpretation is also backed by RFC 1866:

7.1. Accessing Resources
   Once the address of the head anchor is determined, the user agent may
   obtain a representation of the resource.
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

7.2. Activation of Hyperlinks
   To activate a link, the user agent obtains a representation of the
   resource identified in the address of the head anchor.  If the
   representation is another HTML document[...]
   ^^^^^^^^^^^^^^^^^^             ^^^^^^^^

7.4. Fragment Identifiers
   Any characters following a `#' character in a hypertext address
   constitute a fragment identifier. In particular, an address of the
   form `#fragment' refers to an anchor in the same document.
                                        ^^^^^^^^^^^^^^^^^^^^

   The meaning of fragment identifiers depends on the media type of the
   representation of the anchor's resource.

   For example, [example of URL reference with non-empty URL and fragment,]
   Then the user agent accesses the resource identified by
   `http://host/x/app1.html'. Assuming the resource is represented using
   the `text/html' media type, the user agent must locate the <A>
   element whose NAME attribute is `bananas' and begin navigation there.

These formulations seem carefully chosen to make a distinction between
"resource" (which is identified by a URI) and "document" (which does
not itself have a URI/URL, but is a "representation" of a resource which
does).  With this understanding, an "anchor in the same document" is not
(necessarily) the same as "an anchor in a document representing the same
resource".

In this sense, the current draft doesn't say anything completely new, but
goes back to HTML 2.0.  But it *is* rather different from RFC 1808...
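
To make the two readings concrete, here is a minimal Python sketch; it is
not taken from the draft or from either RFC.  The standard library's
urllib.parse.urljoin stands in for the base-relative reading, and the
helper names (base_relative, same_document_target) and the example URL
are assumptions chosen for illustration.

```python
from urllib.parse import urljoin

# Example base URL (an assumption, borrowed from the quoted message).
BASE = "http://whatever.com/blah"

def base_relative(ref, base=BASE):
    """Base-relative reading: every reference is combined with the base."""
    return urljoin(base, ref)

def same_document_target(ref):
    """Same-document reading: a reference of the form "#xxxx" names an
    anchor in the copy currently being viewed.  Returns the anchor name,
    or None if the reference is not of that form."""
    if ref.startswith("#"):
        return ref[1:]
    return None

print(base_relative("#blarg"))         # a full URL that names the resource
print(same_document_target("#blarg"))  # just an anchor name, no URL involved
```

Under the first reading the result is again a URL for the resource, which
could be requested anew; under the second there is only an anchor name to
look up in the document already at hand.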

> I know many browsers have built-in URLs for some operations
> ("about:", "globalhistory:") and vaguely remember some
> reference to "back:" and "forward:" somewhere; but what
> we need to explain "#blarg" is "this:".

This "this:" is similar to how I have tried to explain to myself what the
draft is saying.  I think it is useful for putting the difference in an
explicit form so that it becomes obvious, but not for actually "using" it
(showing it to the user, saving it to files, etc.): it can only be an
internal thing within an application.  It is only meaningful in a specific
context: while "this document" is the current document from which the HREF
was taken.  It would have to be translated to a "normal" URL-Reference to
be meaningful outside of that context.
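
A small sketch of such a translation, in Python.  The "this:" scheme is
only the thought experiment from the quoted message, not a registered
scheme, and the function name externalize and the example URL are
assumptions made for illustration.

```python
def externalize(ref, current_url):
    """Translate a hypothetical internal "this:" reference into a normal
    URL reference by substituting the URL of the current document.
    References that do not start with "this:" are returned unchanged."""
    prefix = "this:"
    # Scheme names are case-insensitive, so compare in lower case.
    if ref[:len(prefix)].lower() == prefix:
        return current_url + ref[len(prefix):]
    return ref

print(externalize("this:#blarg", "http://whatever.com/blah"))
```

The substitution only makes sense at the moment the reference leaves the
application; inside it, "this:#blarg" can stay as it is.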

The last paragraph of section 3 has a specific provision for when a
translation to a "normal" URL[-Reference] should occur:

   However, if the URL reference occurs in a context that is always
   intended to result in a new request, as in the cases of HTML's
   FORM "action" attribute and IMG "src" attribute [RFC1866], then
   an empty URL reference represents the URL of the current document [...]

I think this should be broadened so that it applies not just to
certain "contexts" defined by the information provider/page author, but
also to actions taken by the client/requested by a user.  For example, by
adding a sentence:

   An empty URL in a URL reference should also be replaced by the URL of
   the current document when it is used for a purpose that requires a URL
   or when it is transferred out of the context of the current document,
   for example when a user explicitly requests reloading while following a
   link or when a link is added to a bookmark list.

[ Lynx's 'x' key is an example of reloading while following a link. ]   
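
The combined rule (the draft's provision plus the sentence proposed above)
could be sketched roughly as follows in Python.  The context and action
names, the function name interpret, and the return convention are all
assumptions made for illustration; none of them come from the draft.

```python
# Contexts that, per the quoted draft text, always result in a new request.
NEW_REQUEST_CONTEXTS = {"form-action", "img-src"}
# User actions covered by the proposed sentence (hypothetical names).
EXPORTING_ACTIONS = {"reload-while-following", "add-bookmark"}

def interpret(ref, context, current_url):
    """Return ("same-document", fragment) or ("url", url_reference)."""
    url_part, sep, fragment = ref.partition("#")
    if url_part:
        # Non-empty URL: not the case under discussion; this sketch
        # passes it through without resolving it.
        return ("url", ref)
    if context in NEW_REQUEST_CONTEXTS or context in EXPORTING_ACTIONS:
        # The empty URL is replaced by the URL of the current document.
        return ("url", current_url + sep + fragment)
    return ("same-document", fragment)

doc = "http://whatever.com/blah"
print(interpret("#blarg", "a-href", doc))
print(interpret("", "form-action", doc))
print(interpret("#blarg", "add-bookmark", doc))
```

Only the first call stays within the document being viewed; the other two
yield a reference that is meaningful outside of it.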

   Klaus