From: "Ronald E. Daniel" <email@example.com>
Date: Wed, 17 May 1995 16:59:38 -0600
Message-Id: <199505172259.QAA24514@idaknow.acl.lanl.gov>
To: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org
Subject: Re: Byte ranges -- formal spec proposal
Cc: email@example.com

Thus spoke Ari Luotonen <firstname.lastname@example.org> (at least on Wed, 17 May 1995)
> _________________________________________________________________
>
> BYTE RANGES WITH URLS AND HTTP

We have been putting off the problem of fragment identifiers, and this
is a good start on the problem. I have a few reflex objections about
details - such as preferring 0-based addressing to 1-based - but they
are very minor.

My major objection is that I would like to see byterange addressing as
one component in a more general fragment identification architecture.
The "Miscellaneous" section, quoted below, mentions the possibility of
combining different addressing schemes, but does not provide any
specification.

I would be a LOT happier if we could have an overall scheme that
byterange, paragraph, row/col, word, stanza, and other addressing
schemes could fit into. For example, I might want queries such as:

Get the value of the <title> element in an HTML file
    http://host/path;generic-id="title"
Get bytes 1-5 of the second paragraph of a file
    http://host/path;para=2&byterange=1-5
Get a portion of a JPEG
    http://host/path.jpg;rows=37-99&cols=53-200

When we start looking at the addressing needs of a variety of
specifiers (rows/cols, paragraphs, ...) then we may find that we would
prefer different choices of index base, inclusion or exclusion of the
elements at the extremes of the range, etc.

> Miscellaneous
>
> There are other kinds of ranges that can be addressed in a similar
> fashion; this document does not define them, but both the URL
> parameter and the Range: header are defined so that it is possible to
> extend them. This byte range specification applies to any
> content-type.
> There may be range schemes that are meaningful to only
> certain types of documents.
>
> As an example, there might be a linerange URL parameter, with the same
> kind of range specification, and the Range: header would then specify
> the numbers in lines. Example:
>
> http://host/dir/foo;linerange=21-30
>
> The response from a 123 line file would be:
>
> Range: lines 21-30/123
>
> This could be useful for such things as structured text files like
> address lists or digests of mail and news, but isn't meaningful to
> such document types as GIF or PDF.
>
> Other examples might be document format specific ranges, such as
> chapters:
>
> http://host/dir/foo;chapterrange=1-3
>
> Range: chapters 1-3/12
>
> Or just the first chapter:
>
> http://host/dir/foo;chapterrange=1
>
> Range: chapters 1/12
>
> MULTIPLE URL PARAMETERS
>
> If at some point there will be multiple simultaneous URL parameters,
> they should be separated by the ampersand character (just like
> multiple values are encoded in the FORM request).

We need to define more than just the syntax of how multiple parameters
will be separated. We need to define the semantics of
foo=n1-n2&bar=n3-n4. Does the "bar" parameter apply to the result of
the "foo" parameter? Vice versa? Or do we return the two selections
separately, the way you specify with foo=n1-n2,n3-n4?

How are errors to be handled when we specify a range that is longer
than the file? What about when the starting offset of the range is
greater than the length of the file?

Byteranges are pretty nice since they are broadly applicable, but I am
not sure what it means to ask for a byterange of a database. This
problem is even more acute when we get into parameters such as
"paragraph", "row/col", "stanza", etc. How are we to indicate when a
parameter is inappropriate for a URL, such as paragraph for an image?
Usually row/col will be inappropriate for HTML files, but if we have
previously selected a table then it is the natural way to get a table
element.
How do we do that?

If we do not develop a uniform architecture for fragment
identification, we are going to have a slew of partial solutions before
we wise up and develop a uniform treatment. Then everyone will be
pissed because of differing addressing conventions, code bloat, etc.,
and a total inability to make the uniform scheme match the previous
partial solutions.

My understanding is that HyTime can handle this uniform fragment
identification. Can people knowledgeable about HyTime talk about the
good *and bad* points of using HyTime addressing for URI fragment
identification? Is there a way we can start small, with just byterange
selection, then grow our capabilities?

Ron Daniel Jr.                 email: email@example.com
Advanced Computing Lab         voice: (505) 665-0597
MS B-287 TA-3 Bldg. 2011       fax:   (505) 665-4939
Los Alamos National Lab        http://www.acl.lanl.gov/~rdaniel/
Los Alamos, NM, 87545          tautology: "Conformity is very popular"
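
To make the foo=n1-n2&bar=n3-n4 question concrete, here is a minimal
Python sketch of just ONE possible reading: parameters are applied left
to right, each to the result of the previous one. Nothing here is part
of the proposal. The function names, the selector table, 1-based
inclusive ranges, blank-line-separated paragraphs, truncating an
over-long range, and rejecting an out-of-range start or an unknown
parameter are all assumptions made for illustration.

```python
def parse_params(url):
    """Split 'http://host/path;a=1-2&b=3' into (base, [(name, value), ...])."""
    base, sep, params = url.partition(";")
    pairs = []
    if sep:
        for item in params.split("&"):
            name, _, value = item.partition("=")
            pairs.append((name, value))
    return base, pairs


def parse_range(value):
    """Parse 'n' or 'n1-n2' into a (start, end) pair (1-based, inclusive)."""
    start, sep, end = value.partition("-")
    return int(start), (int(end) if sep else int(start))


def select_bytes(data, start, end):
    # Assumption: a start past the end of the data is an error, while an
    # end past the end of the data is silently truncated.
    if start < 1 or start > len(data):
        raise ValueError("byterange start outside data")
    return data[start - 1:end]


def select_paras(data, start, end):
    # Assumption: paragraphs are runs of text separated by a blank line.
    paras = data.split(b"\n\n")
    if start < 1 or start > len(paras):
        raise ValueError("para start outside data")
    return b"\n\n".join(paras[start - 1:end])


# Hypothetical selector table; a real server would key this on content type.
SELECTORS = {"byterange": select_bytes, "para": select_paras}


def apply_fragment(url, data):
    """Apply each URL parameter, left to right, to the previous result."""
    _, pairs = parse_params(url)
    for name, value in pairs:
        if name not in SELECTORS:
            raise ValueError("parameter not applicable: " + name)
        start, end = parse_range(value)
        data = SELECTORS[name](data, start, end)
    return data


DOC = b"first para\n\nsecond para here\n\nthird"
print(apply_fragment("http://host/path;para=2&byterange=1-5", DOC))
```

Under this reading the order of parameters matters: para=2&byterange=1-5
selects bytes within the second paragraph, while byterange=1-5&para=2
fails because the five selected bytes contain only one paragraph.
Returning the selections separately would need a different design.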