- From: Gavin Nicol <gtn@ebt.com>
- Date: Sun, 21 May 1995 08:46:09 -0400
- To: www-talk@www10.w3.org
>Several good examples have been brought up of files that can be comprised of
>segments, where each of those segments is a valid file of the same data-type,
>as an argument for this proposal. However, in almost all of the examples,
>there were only *specific* byte ranges which would work, in which the
>requested object would really be usable. Thus, for most of these examples,
>you could just ask for "parts 0-3" or "2-5" or "3-end", and the right thing
>would happen. In only one of the examples was *true* random access
>necessary, and that was to resume downloading of a file if it was interrupted
>part of the way through. Keep this example off to the side for the next few
>paragraphs.

I've been meaning to write up an RFC on how DynaWeb handles large files. As I've said, DynaWeb breaks a document into parts based on the structure of the data. In particular, DynaWeb does runtime conversion from SGML to HTML, and the smallest addressable part of a document in DynaWeb is a single SGML element.

As you all probably know, an SGML document basically forms a hierarchy of nested elements, or in other words, a tree. Filesystems, in general, are also trees. It seemed natural to me to use the same *type* of URL for files and for sub-document addressing. As such, DynaWeb actually supports 3 sub-document addressing modes, which are pretty much taken straight from the TEI guidelines:

    http://www.ebt.com/collection/book/doc=1/chap=2/sect=3
    http://www.ebt.com/collection/book/1/2/3
    http://www.ebt.com/collection/book/1

The first form accesses elements in the hierarchy by *typed* child number, the second form accesses elements by child number, irrespective of type, and the last is a direct element address. In practice, because few people ever access the server except by browsing, the last form can be used in most cases.

I would like to argue that such an addressing scheme is applicable to many other types of data as well.
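To make the first two forms concrete, here is a minimal sketch in Python of how such a path might be resolved against an element tree. It is not DynaWeb's code: the Element class, the resolve function, the sample tree, and the choice to count children from 1 are all assumptions made for illustration. The direct element address form is left out, because nothing above says how those addresses are assigned.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """A node in a toy element tree (hypothetical, not DynaWeb's data model)."""
    name: str
    children: List["Element"] = field(default_factory=list)


def resolve(root: Element, path: str) -> Optional[Element]:
    """Walk down from root, one '/'-separated step at a time.

    A step 'name=n' picks the n-th child called 'name' (typed child number).
    A step 'n' picks the n-th child of any type.
    Counting from 1 is an assumption, not something the post specifies.
    Returns None if any step does not match an existing child.
    """
    node = root
    for step in filter(None, path.split("/")):
        if "=" in step:
            name, _, number = step.partition("=")
            candidates = [c for c in node.children if c.name == name]
        else:
            number = step
            candidates = node.children
        try:
            index = int(number)
        except ValueError:
            return None
        if not 1 <= index <= len(candidates):
            return None
        node = candidates[index - 1]
    return node


# Sample tree: book -> (title, doc) ; doc -> (chap, chap) ;
# second chap -> (title, sect, sect, sect)
sect3 = Element("sect")
chap2 = Element("chap", [Element("title"), Element("sect"), Element("sect"), sect3])
doc1 = Element("doc", [Element("chap"), chap2])
book = Element("book", [Element("title"), doc1])

typed = resolve(book, "doc=1/chap=2/sect=3")   # typed child numbers
untyped = resolve(book, "2/2/4")               # child numbers, any type
print(typed is sect3, untyped is sect3)
```

In this sample tree the two paths reach the same element, but the numbers differ because the untyped form also counts the title children. That is the practical difference between the first two forms.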
As I said before, my real problem with byte-ranges is that generally, they don't make sense. Ranges of *parts* do make sense, however. One other problem I have is that the format of a URL should really be application dependent, so why make recommendations for cases where it is meaningless? Let's leave it to the application (i.e. the server), until we are ready to design a far more general linking mechanism.

Look at http://www.ebt.com/ to see how DynaWeb works.

PS. I should note that the above naming scheme is very, very useful in our case, but it drives spiders wild....
Received on Sunday, 21 May 1995 08:43:19 UTC