- From: Brian Behlendorf <brian@organic.com>
- Date: Sat, 20 May 1995 20:49:20 -0700 (PDT)
- To: www-talk@www10.w3.org
One last thing I'd like to throw into the conversation here, and then I feel all has been said that needs to be said, and we should try and find rough consensus. (BTW, are there any "tools" for finding what can be considered rough consensus? A straight WWW-based poll is probably out of line...)

************************************************************************
SUMMARY: all we really need, standards-wise, is a new 300-level HTTP response, "Contains".
************************************************************************

Several good examples have been brought up, as arguments for this proposal, of files that can be composed of segments, where each of those segments is a valid file of the same data-type. However, in almost all of the examples, there were only *specific* byte ranges which would work, in which the requested object would really be usable. Thus, for most of these examples, you could just ask for "parts 0-3" or "2-5" or "3-end", and the right thing would happen.

In only one of the examples was *true* random access necessary, and that was to resume downloading of a file if it was interrupted part of the way through. Keep this example off to the side for the next few paragraphs.

Instead of thinking about one URL that represents a collection of objects, why not give each object its own unique URL, and devise a way of addressing a collection of URLs? This is similar to byterange, but more general.

Let's say somewhere a mapping takes place that translates URL1 into a container for URL2, URL3, URL4, etc. I have a hunch this is URC/URI territory, but I don't know enough yet about the specific URC proposals floating around to know if this is already being considered.

So, it works like this:

Client asks for URL1.

URL1 gets mapped at a server somewhere into a composite body whose parts are URL2, URL3, and URL4.
* If it doesn't find a place to either inline or link URL3, URL4, etc., it's up to the browser to figure out how to represent that "auxiliary" file. Maybe it just keeps it around until it can be represented later.

Caches work just as they always have. If they can cache that container mapping, so much the better.

The important thing is that URL2, URL3, URL4, etc., can be ANYTHING THEY WANT TO BE - there's no need to give them some sort of formal syntax; caches know from the mapping from URL1 how they assemble together. If the server prefers knowing them as byteranges, it doesn't matter. I.e., we can have

    http://host/path/file
  is-a-container-for
    http://host/path/file;byterange=0-30
    http://host/path/file;byterange=31-60

or

    http://host/path/file
  is-a-container-for
    http://host/path/file?part1
    http://host/path/file?part2

or even

    http://host/path/file
  is-a-container-for
    http://host/path/file2
    http://host2/path/script
    ftp://host3/path/file3

and either way the client or proxy will know when it has the whole object, or just its parts.

Finally, this also allows "parts" to be members of more than one container, something none of the byterange proposals had considered. I think this is a good thing; can anyone think of a situation where it isn't? In fact they can even be on completely separate servers.

Yes, THIS REQUIRES CHANGES TO BROWSERS AND SERVERS. Minimally. Why are we so afraid of that?

There are a couple of really good side effects now that I think about it. For example, right now Netscape's progressive-rendering algorithm has to wait until it recognizes a reference to an inlined image before it can start grabbing it. If it could be told that "URL1 contains this HTML page and these inlined images" then it could possibly be more efficient in what it does.
Additionally, a content provider could "bundle" icons with one page that weren't necessarily inlined on that page, but which are used by subsequent pages, so that when visitors go to that subsequent page, the icons are already loaded.

I can give plenty of examples of how this could work for just about every application discussed so far. It would seem to be pretty straightforward for servers to generate these mappings for a large PDF file, presuming there's some way for it to query the PDF file to know where it can be segmented.

So, now, back to the resume-downloading-at-point-x. This is semantically a much different operation than "give me part x", so let's just give it its own request header:

    Startbyte: 204567

....would mean start the post-response-header transmission at byte 204567 into the response, counting from the end of the response headers (\r\n\r\n, or \n\n). Who cares if this is a CGI script or actual file, eh? :)

********************************************************************

So, I suppose in the end I'm proposing a new 300-level HTTP response, something like

  305 Contains Mapping
    o Following: anything
    o Required Headers: none

The server returns an HTTP object consisting of a newline-delimited list of URIs which this URL is said to "contain". The client is expected to fetch these URLs and plug them together, representing this requested URL as the canonical URL for this collection. The other HTTP headers on this object apply *only* to this object, and this response should be cached where possible.

*******************************************************************

*Feedback*, please. I hate having all these ideas and no time to implement them in a browser (though I'd be happy to implement this on the server side in Apache). Roy? Dan? Henrik?
    Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com  brian@hyperreal.com  http://www.[hyperreal,organic].com/

* - Order is insignificant - a browser first starts rendering URL2 and looks for where to start plugging in URL3, etc., but that should just be an optimization; browsers can plug things together however they wish. Some network-aware file formats like VRML already have the concept of nesting inlines, which HTML doesn't have (yet), so that order could be created by a depth- or breadth-first traversal of the scene to aid rendering, but in a real directed graph that's not necessary.
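For anyone who wants to poke at the idea, here is a rough Python sketch of how a client might handle the "Contains" mapping and the Startbyte header described above. Everything in it is an assumption made for illustration: the function names, the in-memory dictionaries standing in for servers, and especially the choice to assemble parts by plain concatenation, which only makes sense for the byterange-style case. It is not a reference implementation of anything.

```python
from typing import Dict, FrozenSet, List


def parse_contains_body(body: str) -> List[str]:
    """Split the newline-delimited body of a (hypothetical) Contains
    response into the list of URIs it names, skipping blank lines."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def resolve(url: str,
            mappings: Dict[str, str],
            objects: Dict[str, bytes],
            _seen: FrozenSet[str] = frozenset()) -> bytes:
    """Fetch `url`, following container mappings recursively.

    `mappings` stands in for servers answering with a Contains body;
    `objects` stands in for servers answering with ordinary content.
    Assumption: parts are joined by concatenation, in listed order.
    """
    if url in _seen:
        # A part may belong to several containers, but a container
        # must not (transitively) contain itself.
        raise ValueError(f"container cycle at {url}")
    if url not in mappings:
        return objects[url]
    parts = parse_contains_body(mappings[url])
    return b"".join(resolve(p, mappings, objects, _seen | {url})
                    for p in parts)


def apply_startbyte(entity_body: bytes, startbyte: int) -> bytes:
    """Server side of `Startbyte: N`: send the body from offset N,
    counted from the end of the response headers."""
    if startbyte < 0:
        raise ValueError("Startbyte must be non-negative")
    return entity_body[startbyte:]


# Example data modelled on the first mapping in the message.
EXAMPLE_MAPPINGS = {
    "http://host/path/file":
        "http://host/path/file;byterange=0-30\n"
        "http://host/path/file;byterange=31-60\n",
}
EXAMPLE_OBJECTS = {
    "http://host/path/file;byterange=0-30": b"A" * 31,
    "http://host/path/file;byterange=31-60": b"B" * 30,
}
```

Resuming an interrupted transfer is then just a matter of keeping the bytes already received and appending whatever comes back for a request carrying Startbyte set to that length. The cycle check is there because parts are allowed to sit in more than one container, which makes accidental loops possible.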
Received on Saturday, 20 May 1995 23:49:24 UTC