- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Thu, 26 Jun 2025 00:49:52 -0700
- To: Mark Nottingham <mnot@mnot.net>
- Cc: ietf-http-wg@w3.org, draft-ietf-httpbis-safe-method-w-body.all@ietf.org
> On Jun 21, 2025, at 9:57 PM, Mark Nottingham <mnot@mnot.net> wrote:
>
> Responding to a couple of points as an individual (and note that I'm travelling and so can't send a complete, considered response) --
>
>> On 21 Jun 2025, at 12:08 am, Roy Fielding via Datatracker <noreply@ietf.org> wrote:
>>
>> However, the technology being described fails to meet the
>> basic architectural requirements for the Web and HTTP.
>> "All important resources are identified by a URI" is the
>> primary design principle of the Web. The entire system depends
>> on it for linkability and scale.
>
> "Important" is carrying a lot of weight in that sentence. QUERY is intended to address cases where POST and GET-with-body are being used, so on the face of it this property isn't at risk as compared to current usage. If GETable resources suddenly switched to using QUERY I would be concerned, but there isn't an immediately apparent reason for that to start happening.

I don't think the property is at risk. It's just a dependency for the rest of the design, in the sense that identification promotes "more Web", more uniformity of interface, and more caching. Hence, a URI is encouraged instead of promoting less effective interfaces that are resource-specific and hidden within request bodies.

This does not imply that people aren't going to build systems with HTTP that don't work well with the Web. They do that already. We just don't encourage them to do so as standards.

IOW, I am not objecting to the idea of supporting QUERY as a "GET with a body" or a "POST with idempotence". That's fine. What I object to is the suggestion that QUERY might be cacheable without producing a URI, or that intermediaries might be encouraged to read an entire request body and canonicalize it on the off-chance that it might be a reusable, previously cached query. It's one step too far into the abyss.

In contrast, we already have a safe and Web-positive mechanism to achieve caching with a Location-provided URI.

>>> 2.4. Caching
>>>
>>> The response to a QUERY method is cacheable; a cache MAY use it to
>>> satisfy subsequent QUERY requests as per Section 4 of
>>> [HTTP-CACHING]).
>>
>> No, just no. A cache does not have access to the request content when
>> making a hit/miss decision. Use the 303 response, as designed.
>
> That's arbitrary -- HTTP caches can (and have) been written to buffer the request until all content is received. E.g., people are already doing non-standard POST caching now.

I've seen that in origin servers and gateways, not in caches. For example, they are different phases in typical request handling for Apache or VCL and the request data is streamed, not buffered. It can certainly be implemented that way by an origin (or on a CDN by origin config).

>> The reason why this is not allowed in HTTP is because routing decisions
>> are based on the connection context, host, and entire target URI.
>
> This is caching, not routing, and caching with content negotiation already requires access to the request headers.

Sorry, I forgot to include request header fields as well. I meant that the content within the request message body does not participate in routing because it comes too late. Routing determines what servers (and caches) see the request. And you can ignore this bit because I was talking about the wrong thing (see below).

>> A cache cannot know what parts may apply. The origin doesn't know either.
>
> Parts?

The parts above (context, host, entire target URI, and header fields). An upstream might route the request to a different node/server/CDN/whatever based on any of those things along the path of request processing, according to whatever security configuration it might have been given. Each routing decision is based on the request characteristics. If a recipient changes those characteristics, for example by moving query parameters from the content to the target URI and responding as if that was the request received, then it bypasses whatever potential configuration might have been associated with those parameters had they been in the original target URI.

Anyway, I apparently lost track of the context at this point because I was arguing against moving the query information into the cache key while thinking that would be reused as a cache for GET with query, whereas the draft clearly specifies it would be a cache for subsequent QUERY requests having the same (normalized) key. I've got no idea how I jumped the tracks on that one.

While I was trying to figure that out, I also noticed

> 2.5. Range Requests
>
> The semantics of Range Requests for QUERY are identical to those for
> GET, as defined in Section 14 of [HTTP].

which I think should be rephrased as "QUERY defines range handling ..." in a specific way that differs only slightly from how it is defined for GET. I can convince myself that 9110's description of selected representation for content negotiation is broad enough to support Range on QUERY, but 14.2 is very specific about only defining GET.

Likewise, if we assume the query results are the selected representation then ETag and Last-Modified can make sense to supply values for If-Range. However, the other conditionals would have to be redefined for QUERY in the same way that they are specially defined for GET (i.e., defined as a precondition for sending the response rather than as a precondition on performing the method). This would be an update to RFC9110, sec 13.2.

OTOH, this is again duplicating the role of GET in HTTP, and IMO is simply not worth the interface cost. If QUERY can just define the simplest exchange with an optional Location, then all of this complexity regarding caches and ranges is deferred to the subsequent GET (already defined and implemented for HTTP).

....Roy
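
As a rough illustration of the two cache-key strategies contrasted in this message, the Python sketch below compares a key built from the method and target URI alone (what a cache can use for the GET that follows a Location-provided URI) with a key that also depends on the buffered, normalized request content (what caching QUERY responses directly would seem to require). The function names, the `/search` and `/results/42` paths, and the JSON normalization step are assumptions made for illustration; neither the draft nor this thread specifies them.

```python
import hashlib
import json


def uri_cache_key(method: str, target_uri: str) -> str:
    """Key computable as soon as the request line has been parsed."""
    return f"{method} {target_uri}"


def content_cache_key(method: str, target_uri: str,
                      content_type: str, body: bytes) -> str:
    """Key that needs the complete request body before a hit/miss decision.

    The normalization here (re-serializing JSON with sorted keys and no
    insignificant whitespace) is a hypothetical stand-in for whatever
    media-type-specific canonicalization a cache might apply.
    """
    if content_type == "application/json":
        parsed = json.loads(body)  # requires the whole body to be buffered
        body = json.dumps(parsed, sort_keys=True,
                          separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(body).hexdigest()
    return f"{method} {target_uri} {content_type} {digest}"


# Two QUERY bodies that differ only in key order and whitespace map to the
# same key, but only after the cache has read and parsed each body in full.
key_a = content_cache_key("QUERY", "/search", "application/json",
                          b'{"q": "http", "limit": 10}')
key_b = content_cache_key("QUERY", "/search", "application/json",
                          b'{"limit":10,"q":"http"}')

# A GET on a Location-provided URI needs nothing beyond the request line.
key_get = uri_cache_key("GET", "/results/42")
```

In this sketch the first function never touches the request content, while the second cannot return until the entire body has arrived and been parsed, which is the buffering and canonicalization cost the message objects to placing on intermediaries.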
Received on Thursday, 26 June 2025 07:50:11 UTC