Re: URL: Background and Requirements from Gomer Thomas on 1998-11-03 (www-tv@w3.org from October to December 1998)

From: Gomer Thomas <gomer@lgerca.com>
Date: Tue, 03 Nov 1998 14:54:38 -0500
To: Warner ten Kate <tenkate@natlab.research.philips.com>
CC: www-tv <www-tv@w3.org>
Message-ID: <363F5F7E.FAFF5E63@lgerca.com>

My understanding of the distinction between a URI and a URL comes primarily from
my reading of RFC 2396. (Of course, I may not have read it correctly!)

In section 1.2 it states "The term 'Uniform Resource Locator' (URL) refers to the
subset of URI that identify resources via a representation of their primary
access mechanism ..." Further down it mentions that "... both DNS and HTTP are
typically used to access an 'http' URL's resource  ..." This implies to me that
the term URL includes the usual form we find in HTML pages, based on an Internet
domain name. As you probably know, this does not necessarily resolve to a unique
location, since a DNS server may return multiple IP addresses for a single
Internet domain name. Sometimes these refer to multiple network interfaces on the
same host, but sometimes they refer to separate hosts, for example in the case of
mirrored servers.
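
As a minimal sketch of this point, assuming Python and its standard socket
module: the function names and the sample addresses below are hypothetical
illustrations, not anything taken from RFC 2396. The sample stands in for a
resolver answer in which one domain name maps to two mirrored hosts.

```python
import socket

def distinct_addresses(infos):
    """Collect the unique IP addresses from getaddrinfo-style result tuples."""
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})

def resolve_all(hostname):
    """Ask the resolver for every address registered under one domain name."""
    return distinct_addresses(socket.getaddrinfo(hostname, None))

# Hypothetical resolver answer for a mirrored site: one name, two hosts.
# The first address appears twice because the resolver reports it once
# per socket type.
sample = [
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.10", 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.20", 0)),
    (socket.AF_INET, socket.SOCK_DGRAM, 17, "", ("192.0.2.10", 0)),
]
print(distinct_addresses(sample))
```

Calling resolve_all() on a real domain name would query the resolver in the
same way; whether it returns one address or several depends on how that name
is set up.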

Further down in section 1.2 it states "A URN differs from a URL in that it's
[sic] primary purpose is persistent labeling of a resource with an identifier."
Thus, there seems to be no intent that a URN (which is a subset of URI) provide
any assistance whatsoever in locating the resource.

Having said all that, I am in complete agreement with your intent that we should
be allowed "to look for solutions which enable a single resource to be retrieved
from multiple locations." I had never intended to exclude this possibility, and
in fact the solutions which I have been considering do allow it. I cheerfully
accept any changes to the wording of the requirements which make that clearer.

I think the solution to the problem of multiple interlinked documents getting
moved to local storage lies in the use of relative URLs, as you suggest, together
with some mechanism to adjust the base URL appropriately.
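
A minimal sketch of that idea, assuming Python's urllib.parse.urljoin as the
resolution mechanism: the document names, the example.com host, and the local
file path are all hypothetical. The links inside document A stay untouched;
only the base URL changes when the documents move to local storage.

```python
from urllib.parse import urljoin

# Relative links as they might appear inside document A (hypothetical names).
RELATIVE_LINKS = ["stories/b1.html", "../weather/b2.html"]

def resolve_links(base_url, links):
    """Resolve each relative link against the base URL of document A."""
    return [urljoin(base_url, link) for link in links]

# Base URL of document A at its original location (assumed).
original = resolve_links("http://www.example.com/news/a.html", RELATIVE_LINKS)

# Base URL of document A after the same subtree was copied to local disk.
local = resolve_links("file:///home/user/saved/news/a.html", RELATIVE_LINKS)

for url in original + local:
    print(url)
```

This only works if the subtree keeps the same shape after the move, which is
the condition Warner notes in his message.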

The solutions I have been thinking of handle the problem of rescheduled events. I
will try to write up my ideas in this area as soon as I get a couple of other
brush fires out of the way -- and as soon as we converge on the requirements
(which seems to be very close to happening).

Gomer Thomas

Warner ten Kate wrote:

> Gomer Thomas wrote:
> >
>
> > I'm not sure I understand your point about changing the term URL to URI. As
> > I understand it, the key difference between a URI and a URL is that a URI
> > is just an identifier of a resource, whereas a URL is a URI which actually
> > allows the location of the resource to be determined. Perhaps I am not
> > understanding this correctly.
>
> I am not very literate either; reading RFC 2396 may help, and
> RFC 1630 also provides some background. I understand a URI
> to be the more general abstraction from URL and URN. All three
> allow the location of the resource to be determined. The URL
> is explicit in that; it points to one single location. The URN
> is indirect; it requires a naming service to resolve the location(s).
>
> My proposed change in the requirements allows us to look for
> solutions which enable a single resource to be retrieved from
> multiple locations. By that change I am not requiring that
> such should be the case, but I am allowing it. Your requirement
> excludes that solution.
>
> We both require that the location must be determined by
> the URI/L, which is formulated in the second sentence of
> the original requirement.
>
> >
> > I don't agree with your comments about home/local servers, if I understand
> > them correctly. When I download a file from an Internet server and store it
> > on my local disk, the URL needed to reference the copy on my disk is
> > different from the URL needed to reference the original copy on the
> > Internet server. Similarly, if I record the 6 o'clock news from channel 5
> > and store it on my local disk, I would expect that the URL needed to
> > reference the local copy on my disk would be different from the URL needed
> > to reference the original broadcast. Aside from the difference in location,
> > the new copy is now available to me at any time, whereas the original
> > broadcast was only available to me at a specific date-time.
> >
>
> That's the more general problem I would like to tackle.
> Likely we have a misunderstanding. Maybe the following
> examples help.
>
> Assume there is an HTML document (document A) containing all
> kinds of links to other HTML documents (B). Some of them (B)
> get downloaded to your local disc, some not. The URLs in
> document A reference the original B Web-site, and some
> mechanism is required to update the URLs in document A.
>
> [Although we are talking about requirements, I guess this can
> partly be solved by using relative URLs; the subtrees are
> required to remain identical, and the issue remains w.r.t.
> the base URL.]
>
> Now consider the case that document A is transmitted in a
> broadcast channel, together with the documents B. The URLs
> in document A contain the information when documents B
> are "on the air". For one reason or another the broadcast
> gets rescheduled. Who takes care that the URL information
> in document A is adapted (and how is that done)?
> And, indeed, when all documents are stored on my local disc,
> how do I get rid of the timing information in the URLs?
>
> I agree that the content stored on your local disc is at
> another location than the original one. How do we inform
> the application document using that content about the change?
>
> I think we need both a URL scheme which points to a specific
> location, and some additional scheme which creates a level of
> indirection and makes it possible to solve the type of problems
> I described.
>
> [Expanding the caching strategy, as described by Glenn Adams,
> to persistent storage (VCR) is interesting, and generates
> a kind of implicit level of indirection: first look at your
> VCR, then in the broadcast stream. There is, however, no way
> to manage that 'indirection' scheme explicitly. For example,
> I am not sure how it extends if multiple storage media (camcorder)
> connected to an in-home network are involved. What are the (implicit)
> precedence rules? Are all devices on the network to be checked
> if they contain the data requested (non-expired)? Does it imply
> that during the original broadcast my user agent also has to check
> all my local storage devices on the in-home network?]
>
> Warner.



--
Gomer Thomas
LGERCA, Inc.
40 Washington Road
Princeton Junction, NJ 08550
phone: 609-716-3513
fax: 609-716-3503
Received on Tuesday, 3 November 1998 14:54:33 UTC