- From: Yaron Goland <yarong@microsoft.com>
- Date: Fri, 21 Mar 1997 00:49:09 -0800
- To: "'masinter@parc.xerox.com'" <masinter@parc.xerox.com>
- Cc: w3c-dist-auth@w3.org
As anyone running on a 28.8 modem or less will tell you, this isn't an optimization, this features determines if the user can function. I sat down with the Office people and they showed me some of their user scenarios, they involve having files on servers edited remotely by users sitting behind modems of all speeds and sizes. Office's latency problems are especially bad because they sell somewhere around 40% of their product outside the US. I believe you will find that anyone on this group who ships a commercial distributed authoring system will tell you that partial write isn't an optimization, it is a critical piece of their functionality. I know that is certainly true for Microsoft. But the need for partial PUTs isn't the issue here. With the STRUCTURE method it is possible to do a de facto partial put, regardless of anyone's feelings on the subject. The real question is - should a Range header be used? I had originally thought to use a RANGE method where one would submit a Range header and get back a URI. The problem with that is the performance hit. After talking with the IIS and Office folks we figured that the performance hit is so large that we had to use a Range header rather than a Range method. Again, I strongly suspect that Office and IIS's experience is representative. If not, I'm sure that the other members of the group who ship distributed authoring server products will speak up. I can only speak for Microsoft's experience in this area. The clean is the RANGE method. I think everyone can agree on that much. The problem is, its performance is no where near good enough for anyone who needs to sell a system to implement. So the question we need to ask ourselves is - Are we willing to sacrifice some cleanliness for interoperability? If everyone is going to implement something like a range header and if they all do it separately and thus differently, have we aided the cause of interoperability? I am not saying that we should sacrifice good design on the alter of performance. I am saying that sometimes there are close calls when the clean solution has not so great performance, and the close to clean solution has excellent performance. I believe this is one of those times. As for your observation on range lengths. If I understand your point then you would want to see the range header specify the range to be inserted into and then have the body placed in that range. If the range to be inserted into is say, 15-20 and the body is 10 bytes long, then the first 5 bytes of the range will be overwritten and the remaining 5 bytes of the request entity will be inserted. This is a nice feature but it complicates the implementation of the server. Keeping the operations as insert or overwrite not only makes for simpler code, it also makes it easier to leverage services already present in OS's. As for the issue of entity tags, there are times when one intentionally wants to overwrite a byte range where the resource has been altered since one last used it. However, if someone wants to make sure such a change has not occurred, then an entity tag is a great choice. It is certainly cheaper and easier than a lock. Yaron > -----Original Message----- > From: Larry Masinter [SMTP:masinter@parc.xerox.com] > Sent: Thursday, March 20, 1997 11:23 PM > To: Yaron Goland > Cc: w3c-dist-auth@w3.org > Subject: Re: Partial Puts > > Yaron Goland wrote: > > > > 1 Problem Description > > > > Clients who make small changes to resources do not wish to have to > > upload an entire entity. As such, some sort of partial write > capability > > is needed. > > The nice thing about the separate "problem description" is that it > lets > you consider whether this is a real problem, serious, and worth making > the protocol more complex. Is the "partial write capability" actually > required ("needed") or just an optimization? Is it so important that > interoperable clients cannot be written without it, or is it just a > convenient > optimization? > > > There are two types of partial writes, insertion writes and over > writes. > > Wait, there are writes that insert more, less, or exactly the same as > the range that they're replacing. It's not clear that you would really > need or want > to distinguish these. An insert is just replacing a zero-length range > with one > that isn't zero length. > > Also, there's something really troublesome about partial writes unless > the partial write is against a particular entity (as identified by a > strong > entity tag) rather than against a resource. > > > 2 Proposal > > > > I propose that the range header be used with the write-type header > to > > specify that a partial PUT. > > > > WriteType = "Write-Type" ":" 1#("INSERT" | "OVERWRITE" | Token) CRLF > > > > The INSERT and OVERWRITE values must not be used together. > > > > An INSERT indicates that the included entity should be inserted into > the > > location identified in the Range header, causing content already in > the > > resource to be moved forward. > > > > An OVERWRITE indicates that the request body should overwrite > whatever > > exists in the range specified by the range header. > > > > In both cases the Range header must only identify a single point. > For > > example, to specify that the request body should be inserted at byte > 30 > > one would include "Range: bytes= 30-30". > > > > An insertion at the beginning of a resource causes the entire > resource > > to be shifted forward to make room for the insertion. However an > > insertion must not specify an entry point beyond the end of the > > resource. > > > > An over write may have a range that is just beyond the end of the > > resource to indicate appending. In the case of bytes, the range > should > > specify exactly one byte beyond the end of the resource. > > > > If the content-type of the request body is multipart/byte-ranges > then > > the previous behavior may be generalized across the multipart > entries. > > The server may ignore any entry that does not have both a range and > > write-type header. The response should indicate that a range was > skipped > > due to the lack of either or both headers. > > > > The Write-Type header may only be used in conjunction with the Range > > header. > > > > In addition there are any number of resources where the use of range > and > > write-type make no sense. In such a case the resource should return > a > > 412 Precondition Failed. > > > > 3 Discussion > > > > Clearly having the Write-Type header dropped would be a very bad > thing. > > As such it is necessary to use a PEP extension in order to guarantee > > that the server will not process the method if it does not > understand > > the write-type or range headers. > > Just as range locking might be considered a case of locking a resource > which happens to be a range of another resource, perhaps range > (over)writing > might be instead characterized as PUT-ing to a resource which just > happens > to be a part of another resource. > > That is, for the resource "MyDocument" you ask for the URL for the > resource > which corresponds to "Page1". The server gives you back a URL for > Page1, > which > you PUT. The relationship between MyDocument and the Page1 resources > is > such > that any update to Page1 of course updates the first page of > MyDocument. > > This will work when the resource is stored as a file, as streams in an > SGML database, or with separate files per page without the client > having > to be aware of the server's internal representation of resources. > Perhaps > there's a round trip in URL discovery, but it keeps the semantics of > part/whole independent of the internal representation. >
Received on Friday, 21 March 1997 04:17:43 UTC