Re: Comments on draft-dusseault-http-patch-06 from Roy T. Fielding on 2004-10-18 (ietf-http-wg@w3.org from October to December 2004)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Mon, 18 Oct 2004 16:18:50 -0700
To: Lisa Dusseault <lisa@osafoundation.org>
Cc: HTTP working group <ietf-http-wg@w3.org>
Message-Id: <120FCC4A-215C-11D9-8A83-000393753936@gbiv.com>

On Oct 18, 2004, at 10:21 AM, Lisa Dusseault wrote:
> Why is this redefining MIME?  I understand how it's a different 
> messaging model of HTTP from the one you prefer, but I don't 
> understand how it redefines MIME.  As I understand it, RFC3229 uses 
> MIME types to describe the entity or instance without redefining MIME.

Content-Type is a MIME header field.  It is defined by MIME.
MIME and HTTP share a header field space, which was discussed
extensively in 1994.  The Area Directors (at that time) absolutely
forbade HTTP to redefine the meaning of existing e-mail header fields
because there was general concern that such messages might leak out
into the e-mail space and cause rampant confusion.

> As for redefining the messaging model of HTTP: RFC2616 has other 
> headers that refer to the entity or the instance, and not to the 
> message body.  For example, "The MD5 digest is computed based on the 
> content of the entity-body, including any content-coding that has been 
> applied, but not including any transfer-encoding applied to the 
> message-body."   That means that Content-MD5 is calculated from what 
> RFC3229 labels an "instance".

That is incorrect.  3229 defines instance as what would be transferred
in a 200 response.  (I had been calling that "representation" since
1996; the term doesn't appear in 2616 because Jeff objected to the name
and the other editors were unwilling to choose one name over another,
but that is a much longer story.)  The entity-body, in contrast, is the
sequence of octets inside the message-body after decoding the
transfer-encoding (if any).  Thus, Content-MD5 is calculated based on
what is in the message, not the "instance".  See section 9 of RFC 3229.
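
To make the distinction concrete, here is a minimal sketch, assuming a
message-body that arrived with the chunked transfer-coding.  The helper
names (decode_chunked, content_md5) and the sample octets are
illustrative only and come from no specification; trailers are ignored.
The digest is taken over the octets left after transfer decoding, which
is whatever the message carries (in a 226 response that would be the
delta, not the full instance).

```python
import base64
import hashlib


def decode_chunked(message_body: bytes) -> bytes:
    """Undo the chunked transfer-coding, returning the entity-body octets."""
    entity_body = b""
    pos = 0
    while True:
        line_end = message_body.index(b"\r\n", pos)
        # Chunk size is hexadecimal; drop any chunk extension after ';'.
        size = int(message_body[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            break  # last-chunk reached; trailers are ignored in this sketch
        entity_body += message_body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return entity_body


def content_md5(entity_body: bytes) -> str:
    """Base64 of the 128-bit MD5 digest, the form Content-MD5 carries."""
    digest = hashlib.md5(entity_body).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical message-body as seen on the wire.
chunked = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
body = decode_chunked(chunked)
md5_header_value = content_md5(body)
```

Note that the digest input is `body`, not `chunked`: the transfer-coding
is removed first, but nothing is done to reconstruct an "instance".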

> It seems to me that RFC3229 only defined a name, "instance", for 
> something which already existed.

Yeah, for "representation", which is only the same as entity-body when
it is a 200 response to a GET request.  RFC 3229 defines 226 responses.

> I do realize that RFC2616 is ambiguous, and I don't mean to justify 
> one theory based on legalistic readings of the spec.  On the one hand, 
> RFC2616 says "Content-Type specifies the media type of the underlying 
> data.".  On the other hand, RFC2616 says "Any HTTP/1.1 message 
> containing an entity-body SHOULD include a Content-Type header field 
> defining the media type of that body."  Both these sentences are in 
> section 7.2.1! The first sentence seems to support RFC3229's model; 
> the second sentence seems to support your (Roy's) preferred model.  
> Standards are rarely perfect; ambiguous language like this leaves lots 
> of room for interpretation of how the model can be refined and how 
> HTTP can be extended.  So I tend to devalue arguments based only on 
> selected text from the standard; I'd like to hear more arguments based 
> on harmful intermediary interactions or implementation concerns.

There is nothing ambiguous about those two statements, and certainly
not the rest of 7.2.1.  Content-Type is an optional field that defaults
to application/octet-stream.  That is the only reason it says SHOULD.
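
As a rough illustration of that default, a recipient's logic could be
sketched as below.  The function name and the dict-of-headers input are
assumptions made for this example, and the sketch skips the optional
content sniffing that 7.2.1 permits when the field is absent.

```python
def effective_media_type(headers: dict[str, str]) -> str:
    """Media type of the entity-body as carried in this message."""
    for name, value in headers.items():
        # Header field names are case-insensitive.
        if name.lower() == "content-type":
            # Drop parameters such as charset; media types are case-insensitive.
            return value.split(";")[0].strip().lower()
    # No Content-Type field: treat the body as opaque octets.
    return "application/octet-stream"


with_type = effective_media_type({"Content-Type": "text/calendar; component=VEVENT"})
without_type = effective_media_type({"Content-Length": "100"})
```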

>> There is no way to do that in HTTP
>> because there is no way to check that all servers between the client
>> and the origin are aware of this abuse before sending the message,
>> nor is there any technical reason for redefining an existing header
>> field to mean something other than what is in two Draft Standards.
>
> I'm definitely sympathetic to concerns about intermediaries (that's 
> why I'm doing PATCH -- because I'm worried about the way XCAP is 
> redefining PUT and GET).  However, I don't yet understand what an 
> intermediary might do with a new method it hasn't seen before (PATCH) 
> that has a header it has seen before (Content-Type) that could be 
> harmful.  An intermediary can't cache the body of a method it hasn't 
> seen before (e.g. PROPPATCH request and response bodies are similarly 
> uncached in practice).  An intermediary can apply transfer-encodings 
> but I don't see how that could be harmful.  For example, an 
> intermediary could apply chunked transfer encoding to a PATCH body in 
> either the PATCH -05 proposal or the PATCH -06 proposal without harm.  
> Or am I missing something?  What else can an intermediary do and how 
> might it be harmful?

It is harmful because it adds yet another silly exception to a
standard that is already burdened by haphazard design add-ons,
because it would violate what is an unambiguous Draft Standard
protocol definition for how to interpret *any* message as described
in section 7.2.1, and because an entirely new set of status codes
would have to be defined to mean "IM not acceptable" instead of
"media type not acceptable".

> I did realize one advantage of the PATCH with IM approach, and that is 
> to create a new resource of a specified type.  For example:
>
>        PATCH /file.ics HTTP/1.1
>        Host: www.example.com
>        Content-Type: text/calendar;component=VEVENT
>        IM: vcdiff
>        If-Match: "e0023aa4e"
>        Content-Length: 100
>
>        [vcdiff-bytes]
>
> >>Response:
>
>        HTTP/1.1 201 Created
>        ...
>
> This would create a new file of type text/calendar.  With the PATCH 
> -05 proposal, there wasn't a way to use PATCH to create a new resource 
> and assign the correct MIME type to begin with.

Sure there was.  First, it may already be set by virtue of the URI
chosen for the PATCH.  Second, it can be set by defining a patch
format that has a section for assigning metadata like Content-Type,
along with any other dead properties.  That way, we don't have to
create special-purpose request header fields for every property that
a client might want to set while creating/modifying a resource.
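
A sketch of the second option, under loudly stated assumptions: the
PatchDocument layout below is entirely hypothetical (no such format is
defined in the draft or in this thread), and "append_body" merely stands
in for a real delta encoding.  The only point is that metadata travels
inside the patch body rather than in request header fields.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    body: bytes = b""
    properties: dict[str, str] = field(default_factory=dict)


@dataclass
class PatchDocument:
    # Hypothetical format: one section for metadata, one for the body change.
    set_properties: dict[str, str] = field(default_factory=dict)
    append_body: bytes = b""  # stand-in for a real delta such as vcdiff


def apply_patch(resource: Resource, patch: PatchDocument) -> Resource:
    """Return a new resource with the patch's metadata and body change applied."""
    merged = {**resource.properties, **patch.set_properties}
    return Resource(body=resource.body + patch.append_body, properties=merged)


# Creating a new calendar resource: the media type is set by the patch itself.
patch = PatchDocument(
    set_properties={"Content-Type": "text/calendar"},
    append_body=b"BEGIN:VCALENDAR\r\n",
)
created = apply_patch(Resource(), patch)
```

With this shape, the request's own Content-Type would name the patch
format, and any number of properties could be set without new header
fields.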

> So what's the consensus here or do we need more light on these issues 
> before we can reach a consensus?  Do any implementors have 
> implementation considerations related to this issue, and are there 
> realistic intermediary problems as I already asked?  How much of a 
> religious war is it?    I'm not personally religious on this, I'm 
> caught between the duelling theories and I'd just like to understand 
> the practical considerations better.

It isn't duelling theories.  Look at RFC 3229 again, please, and you
will note that Content-Type is NEVER sent in a 226 response.  It can't
be sent in a 226 response because that would violate the messaging model
of HTTP, which overrides anything less than a standards-track document
that explicitly Updates RFC 2616.  RFC 3229 is not being implemented
because it is too complex, almost entirely because of its misinformed
interpretation of what content-codings mean in HTTP.  If it hadn't
tried to play file-games with "instance" encodings and simply stuck
to the already defined representation model, delta-encoding would
have been a trivial extension to HTTP (just as PATCH should be a
trivial variation on PUT).

....Roy
Received on Monday, 18 October 2004 23:18:59 UTC