- From: Terry Allen <terry@ora.com>
- Date: Sun, 29 Jan 1995 11:36:58 PST
- To: uri@bunyip.com, davenport@ora.com
Dan's post is both useful in itself and helps define some of the
issues.  I want first to comment on the "brain dump" section and
then, in a second post, respond to the specifics of the proposal in
Dan's second section.

| Date: Fri, 27 Jan 95 19:04:04 EST
| Message-Id: <9501272343.AA22920@ulua.hal.com>
| From: "Daniel W. Connolly" <connolly@hal.com>
| To: Multiple recipients of list <html-wg@oclc.org>
| Subject: Redundancy in links, Davenport Prososal [long]

| [I copied all these lists because I think there may be interested folks
| on all these lists. I suggest follow-ups be sent only to
| uri@bunyip.com and davenport@ora.com.]

Yes, please.

| In message <199501271917.LAA24883@rock>, Terry Allen writes:
| >Dan says
| >>For example, if there's a postscript file on an FTP server out there
| >called "report_127," you effectively can't link to it given today's
| >web.
| >But doesn't that mean simply that not enough info is being sent
| >about the file by the server, or that the client isn't smart enough?
| >Putting a content-type att on <A> seems like a fragile solution
| >to the problem, as it shifts responsibility to the author of
| >the doc, who is in most cases just a poor dumb human.
|
| Yes, it's fragile, but it's better than completely broken.
| This is _distributed_ hypertext. It spans domains of authority. As an
| author, I have authority over the info I put in the link, but I may
| not have the authority to change the filename on the server. So I'm
| stuck.

I would much rather have the client deal with the situation than
the human author, who, after all, has actually pointed at the right
thing.  When you load a file in some foreign format into Word for
Windows (and does someone know of Word for DOS that will run on a
486?) the program checks out the file, guesses at its format, and
offers you a bunch of options for conversion.  Except for the
conversion step, there's no reason Web clients shouldn't do the
same thing.

...
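The client-side guessing described above can be sketched roughly as
follows.  This is only an illustration, not any existing client's
behavior: the function name sniff_content_type and the small signature
table are assumptions made for the example, and a real client would
need a much longer table.  The idea is that the client trusts the
server's declared type when there is one and otherwise looks at the
first bytes of the object itself, so a PostScript file named
"report_127" is still recognized.

```python
from typing import Optional

# Hypothetical signature table (an assumption for this sketch):
# leading bytes of a file -> content type a client could infer.
MAGIC_SIGNATURES = [
    (b"%!", "application/postscript"),   # PostScript files begin with "%!"
    (b"%PDF-", "application/pdf"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff_content_type(head: bytes,
                       server_type: Optional[str] = None) -> Optional[str]:
    """Prefer the type the server declared; otherwise guess from the bytes."""
    if server_type:
        return server_type
    for magic, content_type in MAGIC_SIGNATURES:
        if head.startswith(magic):
            return content_type
    return None  # unknown: the client would have to ask the user


if __name__ == "__main__":
    # First bytes of a PostScript file fetched as "report_127" (no extension).
    print(sniff_content_type(b"%!PS-Adobe-3.0\n%%Title: report\n"))
```

Nothing here requires the link author to supply anything; the
information comes from the server or from the object.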
| From the evidence that I have studied, the way to make links more
| reliable is not to deploy some new centralized namespace (ala URNs
| with publisher id's), but to put more redundant info in links.
| Rather than looking at the web as documents addressed by an
| identifier, I think we should look at it as a great big
| content-addressable-memory. "Give me the document written by Fred in
| 1992 whose title is 'authentication in distributed systems'."
| I think the same sort of thing that makes for a high-quality citation
| in written materials will make for a reliable link in a distributed
| hypermedia system. A robust _link_ should look like a BibTex entry
| (MARC record, etc.)

Ah, yes, a link that is expected to be robust might well make
reference to an entry in its document's bibliography.  Writers of
long documented papers do well to construct the bibliography and
full footnotes as they go along, rather than having to scurry around
the library getting all that info at the end of the job (he says
priggishly); similarly, if you really want to have a robust reference
to something on the Net you'd do well to collect its URC when you
first link to it, for later reference.  But I wouldn't want to put
the info directly in the link.  And at the other end I think there
has to be some index to the content-addressable-memory space, of
which URNs are only a part.

| Given a system like harvest[2], ...

| So if I as the link author know more than the reader's client can get
| from the FTP server, I should be _able_ to contribute the knowledge
| that I have. Making all the authors put content type info in their
| links is the wrong answer; the optimal solution is for the
| provider to adapt to the .ps convention. But the link author should
| be able to add value and quality despite the poor efforts of the
| FTP server maintainer.

I think the link author shouldn't have to add that info no matter what.
If clients can't handle FTPable .ps files without the name extension,
something's broken between the FTP server and the client.  Fixing it
in the markup of the document is patching the wrong spot.

| "But the link author could just copy that file and put a .ps extension
| on his own machine," you might reply. This doesn't allow for the case

No, I thought of that (and another strategy of mapping locally a
filename.ps to the remote filename) and rejected both for the reason
you now give:

| when the document in question changes daily, and it doesn't provide an
| audit trail, and it violates my #1 engineering principle: never
| maintain the same information in more than one place.

Exactly.  The information about the content type of the file should
be maintained by the server, but is also inherent in the object.  If
my client can't get it from the server it can do some minor work to
deduce that information.  But the link itself is about the worst
place to put that information from the standpoint of human
engineering: entry of the information is prone to error, the
information can't be validated by parsing the document, and the
format of the target may change unbeknownst to the author.

...

| The URN model of publisher ID/local-identifier may be sufficient for
| the applications of moving the traditional publishing model onto the
| web. But that is only one application of the technology that it takes
| to achieve high quality links. Another application may have some other
| idea of what the "critical meta-information" is. For example, for bulk
| file distribution (ala archie/ftp), the MD5 is critical.

But you wouldn't suggest adding an attribute for HTML to allow
putting MD5 info in the link, would you?  (This thread was originally
about HTML markup design.)  That info could easily go in a
bibliographic entry pointed to by the link, to achieve the robustness
you rightly desire.

| OK... so... now that I've a brain dump

On to the next post ...
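The alternative suggested above, keeping the MD5 in a bibliographic
entry that the link points to rather than in the link itself, might
look something like the sketch below.  Everything named here is
hypothetical: the BibEntry record, its fields, the "fred92" key, and
the example.org URL are invented for illustration and correspond to no
existing URC or HTML syntax.  Only hashlib.md5 is a real library call.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class BibEntry:
    """Hypothetical bibliographic record kept apart from the link markup."""
    author: str
    title: str
    year: int
    url: str
    md5: Optional[str] = None  # hex digest noted when the link was made


def matches_entry(entry: BibEntry, data: bytes) -> bool:
    """True if the retrieved bytes agree with the digest in the entry."""
    if entry.md5 is None:
        return True  # nothing recorded, so nothing to contradict
    return hashlib.md5(data).hexdigest() == entry.md5.lower()


# Stand-in for the bytes of the target document (illustrative only).
SAMPLE = b"%!PS-Adobe-3.0\n%%Title: authentication in distributed systems\n"

# The document's bibliography; a link would carry only the key "fred92".
BIBLIOGRAPHY = {
    "fred92": BibEntry(
        author="Fred",
        title="authentication in distributed systems",
        year=1992,
        url="ftp://ftp.example.org/pub/report_127",  # hypothetical location
        md5=hashlib.md5(SAMPLE).hexdigest(),
    ),
}

if __name__ == "__main__":
    entry = BIBLIOGRAPHY["fred92"]
    print(matches_entry(entry, SAMPLE))            # same bytes as recorded
    print(matches_entry(entry, SAMPLE + b"edit"))  # target has changed
```

The redundant information lives in one place per document, and the
markup of the link stays as simple as a pointer to that entry.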
-- 
Terry Allen  (terry@ora.com)
O'Reilly & Associates, Inc.
Editor, Digital Media Group
101 Morris St.
Sebastopol, Calif., 95472

monthly column at:  http://www.ora.com/gnn/meta/imedia/webworks/allen/

A Davenport Group sponsor.  For information on the Davenport Group see
ftp://ftp.ora.com/pub/davenport/README.html
or http://www.ora.com/davenport/README.html
Received on Sunday, 29 January 1995 14:37:15 UTC