Re: HTTP Spec: PUT without data transfer, since hash of data is known to server

On 2015-10-07 16:26, Ed McClanahan wrote:
> Hmm... HTTP PATCH sounds like a problem then. Imagine that a previous
> PUT of some other resource included said hash. A later PATCH modifies a
> portion of that old resource. In order to be able to reference the new
> content of that old resource, a new hash for the entire resource needs
> to be recalculated. Not very practical for small PATCHes to large
> resources....
>
> Still, it seems HTTP PATCH also provides an elegant solution. Using
> PATCH, the payload could be a simple "the data for my new resource has
> this hash" rather than the data itself. The HTTP server could accept or
> reject the PATCH request based upon whether or not it has seen this hash
> before. If rejected, the client just does the normal PUT with unique
> data anyway.

Right, that's one way to do it, and it is easier to implement than
extending PUT. (Essentially, it would be a new Internet Media Type with
PATCH-specific semantics.)
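
To make the flow concrete, here is a minimal sketch of the "offer the hash first, fall back to a normal PUT" exchange described in the quoted text. It is an in-memory toy, not an HTTP implementation: the class DedupStore, the method patch_by_hash, the function upload, the choice of SHA-256, and the 204/409 status codes are all assumptions made for illustration and are not defined by any spec.

```python
import hashlib


class DedupStore:
    """Toy in-memory stand-in for the server side (hypothetical)."""

    def __init__(self):
        self.blobs = {}      # hex digest -> stored bytes
        self.resources = {}  # resource path -> hex digest

    def put(self, path, data):
        # Normal PUT: the full payload is transferred and stored.
        digest = hashlib.sha256(data).hexdigest()
        self.blobs[digest] = data
        self.resources[path] = digest
        return 201

    def patch_by_hash(self, path, digest):
        # Hash-only PATCH: accept only if this hash has been seen before.
        if digest not in self.blobs:
            return 409  # assumed rejection status, purely illustrative
        self.resources[path] = digest
        return 204

    def get(self, path):
        return self.blobs[self.resources[path]]


def upload(store, path, data):
    """Client side: try the hash-only PATCH, then fall back to PUT."""
    digest = hashlib.sha256(data).hexdigest()
    if store.patch_by_hash(path, digest) == 204:
        return "deduplicated"
    store.put(path, data)
    return "uploaded"
```

In this sketch, the first upload of a given payload goes through put(), and any later upload of identical bytes to another path succeeds through patch_by_hash() without the data being sent again.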

Another approach would be the use of a new Content-Coding...

> Going further, some sort of rsync like HTTP PATCH payload could be used
> where blocks of the resource to be loaded are individually hashed. The
> PATCH response could be "OK, I have these blocks but not those". A
> subsequent PATCH could upload only those blocks that contain new data.
>
> I would like to add that hashes aren't perfect - most notably MD5. False
> positives would seemingly be a problem. Some scheme might be needed to
> be able to detect false positives.
>
> Finally, there is definitely a security question. The best example of it
> was once described to me this way:

Right. Google for deduplication + security.

> 1) I work at a company that archives the form letters containing all job
> offers differing only by the employee's name and salary.
>
> 2) I want to know John Smith's salary (i.e. I know his name but not his
> salary).
>
> 3) I compose a series of form letter offers each with John Smith's name
> but with varying salaries.
>
> 4) I try this dedupe-able PUT/PATCH operation for each such offer letter.
>
> 5) My HTTP client reports which one is dedupe-able.
>
> The result of #5 reveals John Smith's salary. Oops!
>
> Just wanted to throw out there my PATCH alternative.
> ...
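
Steps 1) to 5) above can be sketched as a toy probe against a server that reveals whether a hash is already known. Everything here is hypothetical: the letter template, the names server_has and probe_salary, the use of SHA-256, and the salary figure, which is made-up sample data.

```python
import hashlib

# Assumed form-letter template; only name and salary vary.
TEMPLATE = "Dear {name}, we are pleased to offer you a salary of {salary} USD."


def digest(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hashes the server already holds (made-up sample data).
known_hashes = {digest(TEMPLATE.format(name="John Smith", salary=83000))}


def server_has(h):
    """Stand-in for the accept/reject answer to a hash-only request."""
    return h in known_hashes


def probe_salary(name, candidates):
    """Try one letter per candidate salary; a dedupe hit leaks the value."""
    for salary in candidates:
        if server_has(digest(TEMPLATE.format(name=name, salary=salary))):
            return salary
    return None
```

The point of the sketch is that the accept/reject answer by itself acts as an oracle: no stored content is ever returned, yet a small candidate space is enough to recover the secret field.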


Best regards, Julian

Received on Wednesday, 7 October 2015 18:26:44 UTC