Re: draft-ietf-httpbis-safe-method-w-body-11 early Httpdir review

> On Jun 21, 2025, at 9:57 PM, Mark Nottingham <mnot@mnot.net> wrote:
> 
> Responding to a couple of points as an individual (and note that I'm travelling and so can't send a complete, considered response) --
> 
>> On 21 Jun 2025, at 12:08 am, Roy Fielding via Datatracker <noreply@ietf.org> wrote:
>> 
>> However, the technology being described fails to meet the
>> basic architectural requirements for the Web and HTTP.
>> "All important resources are identified by a URI" is the
>> primary design principle of the Web. The entire system depends
>> on it for linkability and scale.
> 
> "Important" is carrying a lot of weight in that sentence. QUERY is intended to address cases where POST and GET-with-body are being used, so on the face of it this property isn't at risk as compared to current usage. If GETable resources suddenly switched to using QUERY I would be concerned, but there isn't an immediately apparent reason for that to start happening.

I don't think the property is at risk. It's just a dependency for the rest of the design,
in the sense that identification promotes "more Web", more uniformity of interface,
and more caching. Hence, a URI is encouraged instead of promoting less effective
interfaces that are resource-specific and hidden within request bodies.

This does not imply that people aren't going to build systems with HTTP that
don't work well with the Web. They do that already. We just don't encourage
them to do so as standards.

IOW, I am not objecting to the idea of supporting QUERY as a "GET with a body"
or a "POST with idempotence". That's fine. What I object to is the suggestion
that QUERY might be cacheable without producing a URI, or that intermediaries
might be encouraged to read an entire request body and canonicalize it
on the off-chance that it might be a reusable, previously cached query.

It's one step too far into the abyss.

In contrast, we already have a safe and Web-positive mechanism to achieve
caching with a Location-provided URI.
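
As a rough illustration of that mechanism (a sketch only; the function
names, the "/results/" path layout, the in-memory store, and the
max-age value are all assumptions made for the example, not anything
taken from the draft): the origin answers QUERY with a 303 and a
Location, and the cacheable exchange is the subsequent GET on that URI.

```python
import hashlib

# Hypothetical in-memory store: result URI path -> stored query content.
STORED_QUERIES = {}


def run_query(content):
    # Placeholder for whatever the origin actually does with the query.
    return b"results for " + content


def handle_query(target_path, content):
    """Handle a QUERY request and return (status, headers, body).

    The origin mints a URI for this query and points the client at it,
    so that the query results become an identified resource.
    """
    query_id = hashlib.sha256(content).hexdigest()[:16]
    location = f"{target_path}/results/{query_id}"
    STORED_QUERIES[location] = content
    return 303, {"Location": location}, b""


def handle_get(path):
    """Handle the follow-up GET; ordinary GET caching applies here."""
    content = STORED_QUERIES.get(path)
    if content is None:
        return 404, {}, b""
    return 200, {"Cache-Control": "max-age=60"}, run_query(content)
```

A cache on the path never has to look at the QUERY content in this
arrangement; it only sees a GET on the URI the origin chose.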

>>> 2.4.  Caching
>>> 
>>>  The response to a QUERY method is cacheable; a cache MAY use it to
>>>  satisfy subsequent QUERY requests as per Section 4 of
>>>  [HTTP-CACHING]).
>> 
>> No, just no. A cache does not have access to the request content when
>> making a hit/miss decision. Use the 303 response, as designed.
> 
> That's arbitrary -- HTTP caches can (and have) been written to buffer the request until all content is received. E.g., people are already doing non-standard POST caching now.

I've seen that in origin servers and gateways, not in caches. For example,
cache lookup and content handling are different phases in typical request
handling for Apache or VCL, and the request data is streamed, not buffered.
It can certainly be
implemented that way by an origin (or on a CDN by origin config).

>> The reason why this is not allowed in HTTP is because routing decisions
>> are based on the connection context, host, and entire target URI.
> 
> This is caching, not routing, and caching with content negotiation already requires access to the request headers.

Sorry, I forgot to include request header fields as well. I meant that the
content within the request message body does not participate in routing
because it comes too late. Routing determines what servers (and caches)
see the request. And you can ignore this bit because I was talking about
the wrong thing (see below).

>> A cache cannot know what parts may apply. The origin doesn't know either.
> 
> Parts?

The parts above (context, host, entire target URI, and header fields).
An upstream might route the request to a different node/server/CDN/whatever
based on any of those things along the path of request processing,
according to whatever security configuration it might have been given.

Each routing decision is based on the request characteristics.
If a recipient changes those characteristics, for example by moving
query parameters from the content to the target URI and responding
as if that was the request received, then it bypasses whatever potential
configuration might have been associated with those parameters had
they been in the original target URI.
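
A toy illustration of that bypass, assuming a made-up edge policy keyed
on an "export" parameter in the target URI and form-encoded request
content (the pool names and both functions are hypothetical):

```python
from urllib.parse import parse_qs, urlsplit


def route(target_uri):
    """Hypothetical edge policy: a target URI carrying an 'export'
    parameter is sent to a restricted pool, everything else is not."""
    params = parse_qs(urlsplit(target_uri).query)
    return "restricted-pool" if "export" in params else "default-pool"


def move_content_to_uri(target_uri, form_content):
    """What a downstream recipient might do: fold form-encoded request
    content into the target URI and answer as if that had been sent."""
    separator = "&" if urlsplit(target_uri).query else "?"
    return target_uri + separator + form_content


# The edge made its decision on the URI it actually received.
received = "/reports"
decided = route(received)

# Downstream, the parameters are moved from the content to the URI.
effective = move_content_to_uri(received, "export=all")
would_have_been = route(effective)
```

Here the edge decides on "/reports", while the request that is
effectively answered is one the same policy would have routed elsewhere.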

Anyway, I apparently lost track of the context at this point because
I was arguing against moving the query information into the cache key
while thinking that would be reused as a cache for GET with query,
whereas the draft clearly specifies it would be a cache for subsequent
QUERY requests having the same (normalized) key.
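
For concreteness, a minimal sketch of such a key. The normalization
shown (re-serializing JSON with sorted keys) and the function names are
assumptions for illustration; the draft's actual normalization rules
are not reproduced here.

```python
import hashlib
import json


def normalize_content(content_type, content):
    """Assumed normalization: JSON is re-serialized with sorted keys and
    no insignificant whitespace; any other content is left untouched."""
    if content_type == "application/json":
        parsed = json.loads(content)
        return json.dumps(
            parsed, sort_keys=True, separators=(",", ":")
        ).encode("utf-8")
    return content


def query_cache_key(method, target_uri, content_type, content):
    """Build a cache key for a QUERY request.

    The method is part of the key, so an entry stored for QUERY is never
    reused for a GET with a query component. Note that the entire
    request content must have been received before this can be called.
    """
    if method != "QUERY":
        raise ValueError("this sketch only keys QUERY requests")
    digest = hashlib.sha256(
        normalize_content(content_type, content)
    ).hexdigest()
    return (method, target_uri, content_type, digest)
```

Two QUERY requests whose JSON content differs only in member order or
whitespace map to the same key under this assumed normalization.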

I've got no idea how I jumped the tracks on that one.

While I was trying to figure that out, I also noticed

> 2.5.  Range Requests
> 
>    The semantics of Range Requests for QUERY are identical to those for
>    GET, as defined in Section 14 of [HTTP].

which I think should be rephrased as "QUERY defines range handling ..."
in a specific way that differs only slightly from how it is defined
for GET. I can convince myself that 9110's description of selected
representation for content negotiation is broad enough to support
Range on QUERY, but 14.2 is very specific about only defining GET.

Likewise, if we assume the query results are the selected representation
then ETag and Last-Modified can sensibly supply values for If-Range.
However, the other conditionals would have to be redefined for QUERY in
the same way that they are specially defined for GET (i.e., defined as
a precondition for sending the response rather than as a precondition
on performing the method). This would be an update to RFC9110, sec 13.2.

OTOH, this is again duplicating the role of GET in HTTP, and IMO is
simply not worth the interface cost. If QUERY can just define the
simplest exchange with an optional Location, then all of this
complexity regarding caches and ranges is deferred to the subsequent
GET (already defined and implemented for HTTP).

....Roy

Received on Thursday, 26 June 2025 07:50:11 UTC