Re: PATCH, gdiff, and random-access I/O from Julian Reschke on 2004-04-30 (ietf-http-wg@w3.org from April to June 2004)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Fri, 30 Apr 2004 10:52:29 +0200
To: Jamie Lokier <jamie@shareable.org>
Cc: Justin Chapweske <justin@chapweske.com>, Alex Rousskov <rousskov@measurement-factory.com>, HTTP working group <ietf-http-wg@w3.org>
Message-ID: <409213CD.7040705@gmx.de>

Jamie Lokier wrote:

> The gdiff format expressed these operations fairly straightforwardly.
> 
> Gdiff is just a sequence of "COPY n bytes from position m of original"
> and "insert n DATA bytes b0,b1,...bn-1".
> 
> A random access write of n bytes to position p is expressed simply in
> gdiff as:
> 
>     COPY 0,p
>     DATA n,b0,b1,...,bn-1
>     COPY p+n,length-p-n
> 
> The only difficulty for a filesystem driver is that it needs to know
> the length before sending the PATCH command.
> 
> If the driver and server are using Etags anyway, to ensure the PATCH
> applies to known content, then the driver will know the length.  Same
> if it's using WebDAV-style locking.
> 
> However, Etags are often based on things like the MD5 of the file's
> contents, which we don't want the server to have to recompute each time.
> 
> We should bear in mind that if we're trying to design a _good_
> read-write file server protocol, a bit more thought needs to go into
> the cacheing model.

Jamie,

my concerns are

- no standards-track documentation of GDIFF (as far as I understand)

and

- it seems to do more than we need.

Therefore the proposal to use a *very* simple format (pick one if it 
exists, otherwise invent one), put it into the PATCH specification and 
make that one (and only that one) mandatory.

Optimally, a reliable implementation of that format should be doable in 
one day.

Regards, Julian

-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760

Received on Friday, 30 April 2004 04:53:10 UTC