Re: Partial Puts from Larry Masinter on 1997-03-21 (w3c-dist-auth@w3.org from January to March 1997)

From: Larry Masinter <masinter@parc.xerox.com>
Date: Thu, 20 Mar 1997 23:22:57 PST
To: Yaron Goland <yarong@microsoft.com>
CC: w3c-dist-auth@w3.org
Message-ID: <33323751.2522@parc.xerox.com>
Yaron Goland wrote:
> 
> 1       Problem Description
> 
> Clients who make small changes to resources do not wish to have to
> upload an entire entity. As such, some sort of partial write capability
> is needed.

The nice thing about the separate "problem description" is that it lets
you consider whether this is a real problem, serious, and worth making
the protocol more complex. Is the "partial write capability" actually
required ("needed") or just an optimization? Is it so important that
interoperable clients cannot be written without it, or is it just a
convenient
optimization?

> There are two types of partial writes, insertion writes and over writes.

Wait, there are writes that insert more, less, or exactly the same as
the range that they're replacing. It's not clear that you would really
need or want
to distinguish these. An insert is just replacing a zero-length range
with one
that isn't zero length.

Also, there's something really troublesome about partial writes unless
the partial write is against a particular entity (as identified by a
strong
entity tag) rather than against a resource.

> 2       Proposal
> 
> I propose that the range header be used with the write-type header to
> specify that a partial PUT.
> 
> WriteType = "Write-Type" ":" 1#("INSERT" | "OVERWRITE" | Token) CRLF
> 
> The INSERT and OVERWRITE values must not be used together.
> 
> An INSERT indicates that the included entity should be inserted into the
> location identified in the Range header, causing content already in the
> resource to be moved forward.
> 
> An OVERWRITE indicates that the request body should overwrite whatever
> exists in the range specified by the range header.
> 
> In both cases the Range header must only identify a single point. For
> example, to specify that the request body should be inserted at byte 30
> one would include "Range: bytes= 30-30".
> 
> An insertion at the beginning of a resource causes the entire resource
> to be shifted forward to make room for the insertion. However an
> insertion must not specify an entry point beyond the end of the
> resource.
> 
> An over write may have a range that is just beyond the end of the
> resource to indicate appending. In the case of bytes, the range should
> specify exactly one byte beyond the end of the resource.
> 
> If the content-type of the request body is multipart/byte-ranges then
> the previous behavior may be generalized across the multipart entries.
> The server may ignore any entry that does not have both a range and
> write-type header. The response should indicate that a range was skipped
> due to the lack of either or both headers.
> 
> The Write-Type header may only be used in conjunction with the Range
> header.
> 
> In addition there are any number of resources where the use of range and
> write-type make no sense. In such a case the resource should return a
> 412 Precondition Failed.
> 
> 3       Discussion
> 
> Clearly having the Write-Type header dropped would be a very bad thing.
> As such it is necessary to use a PEP extension in order to guarantee
> that the server will not process the method if it does not understand
> the write-type or range headers.

Just as range locking might be considered a case of locking a resource
which happens to be a range of another resource, perhaps range
(over)writing
might be instead characterized as PUT-ing to a resource which just
happens
to be a part of another resource.

That is, for the resource "MyDocument" you ask for the URL for the
resource
which corresponds to "Page1". The server gives you back a URL for Page1,
which
you PUT. The relationship between MyDocument and the Page1 resources is
such
that any update to Page1 of course updates the first page of MyDocument.

This will work when the resource is stored as a file, as streams in an
SGML database, or with separate files per page without the client having
to be aware of the server's internal representation of resources.
Perhaps
there's a round trip in URL discovery, but it keeps the semantics of
part/whole independent of the internal representation.
Received on Friday, 21 March 1997 03:03:26 UTC