Re: Byte ranges -- formal spec proposal from Roy T. Fielding on 1995-05-19 (www-talk@w3.org from May to June 1995)

From: Roy T. Fielding <fielding@avron.ics.uci.edu>
Date: Thu, 18 May 1995 19:36:11 -0700
To: Multiple recipients of list <www-talk@www10.w3.org>
Message-Id: <9505181936.aa20333@paris.ics.uci.edu>

Brian writes:

> So, does a "byte range" constitute a variation of the object, or a new
> object itself, which deserves a unique URL?  Compelling cases could be
> made on either side, but I think in this situation it truly is a
> variation of the object.  But now we have a problem - the WWW Link Model
> (hi roy!) only lets me link to *objects* (i.e., URL's), not particular
> variations/representations of objects, if I understand things correctly. 
> For example, if I have an object that represents my home page, and my home
> page object returns both HTML 2.0 and HTML 3.0 representations of itself,
> there's no way for me to *force* an HTML 2.0 browser to see the HTML 3.0
> representation without giving the HTML 3.0 representation its own, 
> un-content-negotiated URL.  Feh. 

That is quite correct -- in fact, I came to the same conclusion Tuesday
night (while lying in bed staring at the ceiling, of course) regarding
the need for URL parameters for versioning and content negotiation.
The same applies to byte ranges, page ranges, etc.

> Okay, so here's the problem.  A URL must be able, not required, but able,
> to *completely* describe the request for an object.  In other words, URL's
> must be able to point to particular representations of webbable objects. 

Yep.

> The protocol "method" used.

Nope.

>                              The additional headers.

Only insofar as they affected the chosen representation.

> In fact, in most
> situations today URL's are used to point to representations instead of
> objects - content providers are simply creating unique URL's to every
> representation.  So, we're not breaking anything fundamental here, it
> seems.  Furthermore: 
> 
> 1) There must be a clear distinction between the part of the URL that 
> describes the *object*, and the part of the URL that describes its 
> representation.

I'll disagree here -- there only needs to be a distinction between variants;
what that distinction is may vary from parameters to file name extensions.

> 2) User-agents must be able to deal with the part of the URL that 
> describes the representation at a higher level - for example, when a user 
> goes to "bookmark" the object, they are asked to choose whether they want 
> to bookmark the object in general or the particular representation of 
> that object.  

UAs need to be able to identify the non-variant part of the URL.

> 3) Responses need to indicate which parts of that representation request 
> influenced the output, so that caches know what to key on (and don't 
> needlessly key on everything in the request.)  I think there's a "vary" 
> header proposed somewhere....

Responses need to include a Location: header which defines the exact
variant chosen, and URI headers which define the available variants
for the resource.

> 4) There must be a defined list of "sanctimonious" headers in HTTP, ones 
> which are always part of the request and are *not* modifiable by the 
> representation-part of the URL.  For example, User-Agent:, or From:.  
> Likewise, content providers should not vary content based on these headers.

Er, well, there's no way to enforce that.  I prefer just requiring that
the UA be informed of what they've received.

> Phew.
> 
> (btw, the CD I'm listening to now seems highly conducive to these kinds of 
> thought processes - Air, by Pete Namlook, on FAX)

A nice comfy bed works well too.

> So, here's how I think things should look.  The format:
> 
>   http://host/path/to/object?object_arguments;request_headers

*BOING*  phooey.  The answer is in the Relative URL draft.
I will define "the http URL" in the HTTP/1.0 draft, and it will
be based on the generic-RL syntax.  We can include an appendix
on http URL conventions if that is acceptable to the HTTP WG.

Question:  Why the verbose names?  I preferred "bytes" over "byterange",
and would prefer just "b" even more.  But, I can see where some names
are best left readable and around 4-5 characters:

     http://site/foo;byte=1-100000
     http://site/foo;line=1-100
     http://site/foo;page=4-7
     http://site/foo;chapter=1-100
     http://site/foo;language=en-us
     http://site/foo;version=1.1
     http://site/foo;type=text/html%3Bversion%3D3

     http://site/foo;chapter=1-100?fred+barney
     http://site/foo;language=mi;chapter=1-100?haka+pakeha
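
To make the convention above concrete, here is a minimal sketch of how a
user agent might separate the non-variant part of such a URL from its
variant parameters.  The function name split_variant is hypothetical, and
the sketch assumes that parameters follow only the final path segment, as
in the examples above; it is not taken from any draft.

```python
from urllib.parse import urlsplit, unquote

def split_variant(url):
    """Split a URL into (non-variant URL, parameter dict, query string).

    Assumption: parameters appear only after the final path segment,
    separated by ';', in name=value form as in the examples above.
    """
    parts = urlsplit(url)
    # Everything before the first ';' in the path names the object itself.
    base_path, _, param_text = parts.path.partition(";")
    params = {}
    for item in param_text.split(";"):
        if not item:
            continue
        name, _, value = item.partition("=")
        # Percent-escapes inside a value are decoded after splitting.
        params[name] = unquote(value)
    base = f"{parts.scheme}://{parts.netloc}{base_path}"
    return base, params, parts.query

base, params, query = split_variant(
    "http://site/foo;language=mi;chapter=1-100?haka+pakeha")
print(base)    # http://site/foo
print(params)  # {'language': 'mi', 'chapter': '1-100'}
print(query)   # haka+pakeha
```

A bookmark of "the object in general" would store only the first value;
a bookmark of a particular representation would keep the parameters too.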

NOTE: the similarity between the URI: header's vary values and
the parameter names is mandatory for caching to work sensibly.

     GET http://site/foo?haka+pakeha HTTP/1.0
     User-Agent: Me
     Accept: text/html;q=1, */*;q=0.5
     Accept-Language: mi;q=1, en;q=0.9

     HTTP/1.0 200 OK
     Content-type: application/pdf
     Location: http://site/foo;language=mi;chapter=1-100?haka+pakeha
     URI: <http://site/foo>;vary="byte,chapter,language",
          <http://site/foo;language="mi">;vary="byte,chapter",
          <http://site/foo;language="en-gb">;vary="byte,chapter",
          <urn:/NZ/treaties/waikato>;vary="byte,chapter,language"
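
As a rough illustration of how a cache could read the URI header in the
response above, the sketch below pulls out each listed URI together with
the dimensions it still varies on.  The helper name parse_uri_header and
the regular expression are assumptions made for this example; they handle
only the <uri>;vary="..." form shown here, not a complete grammar.

```python
import re

# Matches one entry of the form <uri>;vary="a,b,c" (assumed form only).
URI_ENTRY = re.compile(r'<([^>]*)>\s*;\s*vary="([^"]*)"')

def parse_uri_header(value):
    """Return a list of (uri, [vary dimensions]) pairs from a URI header value."""
    entries = []
    for uri, vary in URI_ENTRY.findall(value):
        dims = [d.strip() for d in vary.split(",") if d.strip()]
        entries.append((uri, dims))
    return entries

header = '''<http://site/foo>;vary="byte,chapter,language",
          <http://site/foo;language="mi">;vary="byte,chapter",
          <http://site/foo;language="en-gb">;vary="byte,chapter",
          <urn:/NZ/treaties/waikato>;vary="byte,chapter,language"'''

for uri, dims in parse_uri_header(header):
    print(uri, dims)
```

Because the vary values use the same names as the URL parameters, a cache
can tell that a stored response for ;language=mi no longer varies on
language and needs to key only on byte and chapter.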
     

It is my personal opinion that multiple byte ranges in a single
URL are not useful and only make life difficult -- multiple requests
are more appropriate.
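
For instance, a client that wants two separate ranges could simply build
one URL per range and issue one request for each.  This is only a sketch:
the function name byte_range_urls is hypothetical, and it assumes byte
positions count from 1, as byte=1-100000 above suggests.

```python
def byte_range_urls(base, ranges):
    """Build one ';byte=first-last' URL per (first, last) pair.

    Assumption: positions are 1-based and inclusive.
    """
    urls = []
    for first, last in ranges:
        if first < 1 or last < first:
            raise ValueError(f"invalid byte range: {first}-{last}")
        urls.append(f"{base};byte={first}-{last}")
    return urls

for url in byte_range_urls("http://site/foo", [(1, 100), (501, 600)]):
    print(url)
# http://site/foo;byte=1-100
# http://site/foo;byte=501-600
```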


 ....Roy T. Fielding  Department of ICS, University of California, Irvine USA
                                       <fielding@ics.uci.edu>
                      <URL:http://www.ics.uci.edu/dir/grad/Software/fielding>
Received on Thursday, 18 May 1995 23:22:23 UTC