- From: Brian Behlendorf <brian@organic.com>
- Date: Thu, 18 May 1995 00:01:15 -0700 (PDT)
- To: www-talk@w3.org
On Wed, 17 May 1995, Larry Masinter wrote:
> I'm getting this discussion 3 times, http-wg, www-talk, and now on
> uri. I suggest keeping the discussion on www-talk for now.
Done. Are John Franks and Ari aboard?
> The proposal is to add byte ranges to URLs (in general, it seems). I
> don't think it belongs there; at best, byte ranges make sense as an
> addon to the HTTP protocol.
Then how does one build a URL to point to minutes 6 through 8 of a 1-hour
60-megabyte DJ set? Or to message 2064 (byte range 2254322-2257934) in a
4 megabyte mailbox archive? Sure, if I'm using an HTTP-aware mailbox
reader or audio viewer that's a possibility... but then I can only launch
a range request from that type of viewer. Ick.
I do see what you're getting at, though. There is *not* necessarily a
direct mapping between a URL and the representation of the object that
URL refers to as returned by the server. Key definitions (and correct me
if I'm wrong, but at least this is how I think most developers are
conceiving the web):
1) A URL is a pointer to an *object* somewhere. (well, okay, object=resource)
2) An object can be anything - only *representations* of an object are
directly viewable (flat files or program output, for example, are
both representations).
3) A *representation* is what we get back when we perform an action on a URL
(GET, POST, etc) in the context of an HTTP request. You do not "GET" the
object itself.
4) *Representations* can be influenced by any part of the request, not
just the URI. For example, headers which are known to make a difference:
Accept (content negotiation), If-Modified-Since, Authorization, hell,
even User-Agent in certain situations.
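
To make definitions 2 through 4 concrete, here is a minimal sketch in Python of one object with several representations, where the Accept header decides which one comes back. The function name and the sample data are hypothetical, and it ignores q-values and wildcards, so treat it as an illustration rather than how any real server negotiates.

```python
def select_representation(representations, accept_header):
    """Pick a representation of one object based on the Accept header.

    `representations` maps lowercase media types to bodies. This sketch
    assumes a plain comma-separated Accept list and ignores q-values and
    wildcards such as */*.
    """
    for item in accept_header.split(","):
        # Drop any parameters after ';' and normalise case.
        media_type = item.split(";")[0].strip().lower()
        if media_type in representations:
            return media_type, representations[media_type]
    return None  # nothing acceptable; a real server would report an error


# One object (the radio address), two representations of it.
radio_address = {"audio/basic": b"<au bytes>", "audio/x-mpeg": b"<mpeg bytes>"}
print(select_representation(radio_address, "audio/x-mpeg, audio/basic"))
```

The same URL yields different bytes depending on a header, which is the whole point of the object/representation split.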
(quick question - isn't it inherently more scalable to distribute and
cache the objects themselves rather than their (possibly
numerous) representations? Hmm..)
So, does a "byte range" constitute a variation of the object, or a new
object itself, which deserves a unique URL? Compelling cases could be
made on either side, but I think in this situation it truly is a
variation of the object. But now we have a problem - the WWW Link Model
(hi roy!) only lets me link to *objects* (i.e., URL's), not particular
variations/representations of objects, if I understand things correctly.
For example, if I have an object that represents my home page, and my home
page object returns both HTML 2.0 and HTML 3.0 representations of itself,
there's no way for me to *force* an HTML 2.0 browser to see the HTML 3.0
representation without giving the HTML 3.0 representation its own,
un-content-negotiated URL. Feh.
Okay, so here's the problem. A URL must be able, not required, but able,
to *completely* describe the request for an object. In other words, URL's
must be able to point to particular representations of webbable objects.
That includes the protocol "method" used and any additional headers. In fact, in most
situations today URL's are used to point to representations instead of
objects - content providers are simply creating unique URL's to every
representation. So, we're not breaking anything fundamental here, it
seems. Furthermore:
1) There must be a clear distinction between the part of the URL that
describes the *object*, and the part of the URL that describes its
representation.
2) User-agents must be able to deal with the part of the URL that
describes the representation at a higher level - for example, when a user
goes to "bookmark" the object, they are asked to choose whether they want
to bookmark the object in general or the particular representation of
that object.
3) Responses need to indicate which parts of that representation request
influenced the output, so that caches know what to key on (and don't
needlessly key on everything in the request.) I think there's a "vary"
header proposed somewhere....
4) There must be a defined list of "sacrosanct" headers in HTTP, ones
which are always part of the request and are *not* modifiable by the
representation-part of the URL. For example, User-Agent:, or From:.
Likewise, content providers should not vary content based on these headers.
Phew.
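
Point 3 can be sketched roughly as follows: the response names the request headers that influenced it, and the cache keys only on those. This is a minimal Python sketch; `cache_key` and the `varied_on` list are hypothetical names standing in for whatever the proposed "vary" header ends up carrying.

```python
def cache_key(object_part, request_headers, varied_on):
    """Build a cache key from the object plus only the headers that mattered.

    `varied_on` is the list of header names the response says influenced
    the output (an assumption about what a "vary"-style header would carry).
    Header names are compared case-insensitively.
    """
    wanted = {name.lower() for name in varied_on}
    relevant = sorted(
        (name.lower(), value)
        for name, value in request_headers.items()
        if name.lower() in wanted
    )
    return (object_part, tuple(relevant))


a = cache_key("/clinton/week23",
              {"Accept": "audio/x-mpeg", "User-Agent": "Godzilla"}, ["Accept"])
b = cache_key("/clinton/week23",
              {"Accept": "audio/x-mpeg", "User-Agent": "Mothra"}, ["Accept"])
print(a == b)  # same cache entry: User-Agent was not listed as influencing the output
```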
(btw, the CD I'm listening to now seems highly conducive to this kind of
thought process - Air, by Pete Namlook, on FAX)
So, here's how I think things should look. The format:
http://host/path/to/object?object_arguments;request_headers
object_arguments: a url-encoded list of name-value pairs
e.g. name=brian&age=22
request_headers: a url-encoded list of request headers, which only
make sense in the context of the protocol used (in this case HTTP)
This generality is so that URL's aren't hindered by HTTP-only
specifications.
So that the browser's request looks something like
(connect to host port 80)
GET /path/to/object?object_arguments HTTP/1.0
User-Agent: Godzilla
request_header.name1: request_header.value1
request_header.name2: request_header.value2
For the purposes of this exposition, the HTTP header referring to
byteranges would be something like "ByteRange:". Something more general
is needed for other segments of course.
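
A minimal Python sketch of how a user-agent might take the format above apart and build the request. It assumes the first ';' ends the object part, that the request headers are '&'-separated name=value pairs, and that each name maps straight onto an HTTP header name. The function names and the ByteRange value in the example are made up for illustration.

```python
from urllib.parse import parse_qsl


def split_extended_url(url):
    """Split 'http://host/path?object_arguments;request_headers' into parts."""
    _scheme, _, rest = url.partition("://")
    host, _, tail = rest.partition("/")
    # Everything before the first ';' names the object (path plus arguments);
    # everything after it describes the representation.
    object_part, _, header_part = ("/" + tail).partition(";")
    headers = parse_qsl(header_part, keep_blank_values=True)  # url-decodes pairs
    return host, object_part, headers


def build_request(url, user_agent="Godzilla"):
    """Return (host, request text) for the extended URL."""
    host, object_part, headers = split_extended_url(url)
    lines = [f"GET {object_part} HTTP/1.0", f"User-Agent: {user_agent}"]
    lines += [f"{name}: {value}" for name, value in headers]
    return host, "\r\n".join(lines) + "\r\n\r\n"


host, request = build_request(
    "http://host/path/to/object?name=brian&age=22;ByteRange=0-99")
print(host)
print(request)
```

Note that the object arguments stay in the request line untouched; only the part after the ';' is turned into headers.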
Some sample URL's:
a pointer to a sound file of clinton's weekly radio address:
http://www.npr.org/clinton/week23
a pointer to an MPEG version of clinton's weekly radio address:
http://www.npr.org/clinton/week23;Accept=audio/x-mpeg
a pointer to byte range 10234234-13244212 of clinton's weekly radio address:
http://www.npr.org/clinton/week23;Accept=audio/x-mpeg&Byterange=10234234-13244212
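
On the receiving end, honouring the byte range might look something like this Python sketch. The message does not pin down the semantics, so this assumes "start-end" means zero-based offsets with both ends included; the function names are hypothetical.

```python
def parse_byte_range(spec):
    """Parse 'start-end' into two integers (both ends assumed inclusive)."""
    start_text, _, end_text = spec.partition("-")
    start, end = int(start_text), int(end_text)
    if start < 0 or end < start:
        raise ValueError(f"bad byte range: {spec!r}")
    return start, end


def slice_representation(data, spec):
    """Return only the requested bytes of a representation."""
    start, end = parse_byte_range(spec)
    return data[start:end + 1]  # +1 because the end offset is inclusive


start, end = parse_byte_range("10234234-13244212")
print(end - start + 1)  # number of bytes the third sample URL asks for
print(slice_representation(b"0123456789", "2-5"))
```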
I can already sense some problems. Here's an interesting URL:
http://whitehouse.gov:25/;MAIL+FROM=madmad@bomber.org&RCPT+TO=president&DATA\nFrontLawn,2pm,May16th\n.\n
Though I suppose some catches could be put in place for this situation,
can we protect against that for every protocol? At what point does a
sufficiently obfuscated (to the human eye) extended URL become a malicious
virus-ish mechanism for mayhem?
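
As a rough idea of what such catches could look like for HTTP, here is a minimal Python sketch that vets the decoded name/value pairs before they are turned into request headers. The protected list (User-Agent and From, per point 4 above) and the function name are assumptions, and this says nothing about other protocols.

```python
# Headers the representation part of the URL is never allowed to set
# (assumed list, taken from point 4 of the requirements above).
PROTECTED_HEADERS = {"user-agent", "from"}


def check_representation_headers(headers):
    """Reject header pairs that could smuggle commands into the request."""
    for name, value in headers:
        # Allow only letters, digits and '-' in names, so 'MAIL FROM'
        # (decoded from MAIL+FROM) is refused.
        if not name or not name.replace("-", "").isalnum():
            raise ValueError(f"suspicious header name: {name!r}")
        if name.lower() in PROTECTED_HEADERS:
            raise ValueError(f"header may not be set from a URL: {name!r}")
        # Control characters (CR, LF, ...) could start a new line of protocol.
        if any(ord(ch) < 32 or ord(ch) == 127 for ch in value):
            raise ValueError(f"control character in value of {name!r}")
    return headers


check_representation_headers([("Accept", "audio/x-mpeg")])  # passes quietly
try:
    check_representation_headers([("MAIL FROM", "madmad@bomber.org")])
except ValueError as err:
    print("rejected:", err)
```

A user-agent would presumably also want to refuse ports that do not belong to the scheme, which this sketch does not attempt.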
Food for thought, hopefully I'm not too far off base on some of these.
Dan, Roy, let me have it. :)
Brian
--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com brian@hyperreal.com http://www.[hyperreal,organic].com/
Received on Thursday, 18 May 1995 03:01:22 UTC