Re: Proposed HTTP SEARCH method update from Phil Hunt on 2015-05-17 (ietf-http-wg@w3.org from April to June 2015)

From: Phil Hunt <phil.hunt@oracle.com>
Date: Sun, 17 May 2015 08:46:44 -0700
To: "henry.story@bblfish.net" <henry.story@bblfish.net>
Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-Id: <26446D48-EA4C-4765-89C6-3D586118D009@oracle.com>
The outcomes are somewhat different or at least critical. In this case, if a header or body is stripped, bad things happen. Neither the server or the client are aware. The servers default action is to process the get (now plain). 

For example if a proxy strips the header, the server never applies the filter and the client assumes everything returned is a match. 

This would be bad if the search filter was a security filter for example. Does resource henry123 have scope=update or role=admin? 

Phil

> On May 17, 2015, at 01:37, "henry.story@bblfish.net" <henry.story@bblfish.net> wrote:
> 
> 
>> On 11 May 2015, at 23:16, Phil Hunt <phil.hunt@oracle.com> wrote:
>> 
>> Henry,
>> 
>> Authors of the HTTP specs are advising passing a body with GET is a very bad idea. This is mostly for historical reasons.
>> 
>> Julian Reschke also felt that the use of headers for queries was also ill-advised given that a lot of firewalls may manipulate or block headers. The concern is unanticipated consequences.  E.g. the nature of the GET changes because the header or the body is missing and the server mistakenly releases information because the operation becomes treated as a plain get.  E.g. the return of a resource was to be subject to some filter condition and the client is fooled into thinking the condition was true.
> 
> This is not an issue for the proposal. The semantics I am proposing allows the server to ignore the range request and the body without harm to the request.
> 
> That can be addressed the same way HTTP 1.1 (RFC 2616 ) addresses partial GET.
> In §9.3
> 
> [[
>   The semantics of the GET method change to a "partial GET" if the
>   request message includes a Range header field. A partial GET requests
>   that only part of the entity be transferred, as described in 
>   section 14.35. The partial GET method is intended to reduce unnecessary
>   network usage by allowing partially-retrieved entities to be
>   completed without transferring data already held by the client.
> ]]
> 
> So we need to include a Range header field or equivalent.
> And is written clearly in §14.35.2 "A server MAY ignore the Range header. "
> In the same section we also see that:
> 
> [[
>   If the server supports the Range header and the specified range or
>   ranges are appropriate for the entity:
> 
>      - The presence of a Range header in an unconditional GET modifies
>        what is returned if the GET is otherwise successful. In other
>        words, the response carries a status code of 206 (Partial
>        Content) instead of 200 (OK).
> 
>      - The presence of a Range header in a conditional GET (a request
>        using one or both of If-Modified-Since and If-None-Match, or
>        one or both of If-Unmodified-Since and If-Match) modifies what
>        is returned if the GET is otherwise successful and the
>        condition is true. It does not affect the 304 (Not Modified)
>        response returned if the conditional is false.
> 
>   In some cases, it might be more appropriate to use the If-Range
>   header (see section 14.27) in addition to the Range header.
> ]]
> 
> What I am proposing is intended to be consistent with this.
> I am proposing to be guided by the same constraints imposed by HTTP here:
> 
> A request with a body that is ignored should return the full content - and
> it does on all correctly implemented servers. You can try it :-)
> 
> For servers that understand the query language range request, that still needs
> to be defined more clearly, the server would also return a 206.
> 
> This has the advantage of imposing very RESTful constraints on the 
> query language. It makes clear that the relation between the GET with the
> query and the normal GET is just that the query is a partial representation
> of the first. 
> 
> What I am proposing is just an extension of partial content, with a Query 
> language to be able to make more precise selections in the content.
> 
> I think this can satisfy a very large number of use cases for which the 
> SEARCH verb was proposed.
> 
>> 
>> There are also (whether founded or not) concerns about space limitations in headers and URLs. Though I have to wonder, if your filter is that long, should the filter be processed?  E.g. why pass a public key when you can pass its thumbprint.
> 
> passing the query in the header was a fallback position, over putting one in the body.
> 
>> By using a new HTTP method, then the failure due to lack of support for the method becomes guaranteed. A server that does not support or does not get the request (because of a firewall) will not unintentionally continue processing the request.  
> 
> The HTTP GET query proposal is not exclusive of a good SEARCH or QUERY verb proposal.
> 
> For example SEARCH/QUERY would still useful in case the server wants to make clear that the resource is too big to do a GET on, and that only partial content queries should be used.
> Here the server has a good reason to make failure of a GET the default.
> 
>> 
>> Phil
>> 
>> @independentid
>> www.independentid.com
>> phil.hunt@oracle.com
>> 
>>> On May 10, 2015, at 5:04 AM, henry.story@bblfish.net wrote:
>>> 
>>> 
>>>> On 1 May 2015, at 18:12, Phil Hunt <phil.hunt@oracle.com> wrote:
>>>> 
>>>> One of the use cases I am having difficulty is what I will call “MATCH”.  This is a bit different from doing a system wide search as much of the discussion has focused on.  In this case, we want to test if a certain filter condition is true for a given resource and optionally return some attributes (or nothing at all).
>>> 
>>> yep .
>>> 
>>>> 
>>>> We talked about this in developing the SCIM protocol.  We decided to go with the POST option since a SEARCH method would be long term (C below).
>>>> 
>>>> I’ve looked at a number of approaches on this case:
>>>> 
>>>> A. A GET is inappropriate because the filter condition cannot be passed on the URL.  It would be ok if it could be passed in the BODY, but I hear that’s bad.
>>> 
>>> There are two possibilities here, both allowed by the current spec.
>>> 1) GET with Query: and Query-Type: header ( the second one giving the mime type of the header
>>> 2) GET with Content-Type header and body.
>>> 
>>> There have been no criticisms of (1), only of (2) . 
>>> (2) does not go against any of the more recent specs - it is even allowed, and probably works for most servers. It should be left open for the longer term after studies if it really is problematic and in what circumstances. But 1) is a good fallback position.
>>> 
>>> The fallback position of both is that if the server does not understand the headers or body, it returns the value it would have returned on a plain GET. This actually works: you can try it on any web server to see :-)
>>> 
>>> The main problem with a GET is when the resource is to all intents and purposes infinitely long and one really never wants to do a GET on the resource. In that case it is important to be able to run SEARCH ( or some other verb such as QUERY). Could it be that some resource just offer SEARCH/QUERY and not GET in that case?
>>> 
>>> 
>>>> B. A POST is problematic because it should be to create or replace the resource.  How does the server detect the intended operation?
>>>> 
>>>> C. A POST + a path extension (e.g., “/.search” giving   https://example.com/Users/<id>/.search) is a work around works, but seems kludgy.  Access control systems have to know that https://example.com/Users/<id>/.search is the same resource as https://example.com/Users/<id>
>>>> 
>>>> D. A POST to /search can also work, but makes for a more complex operation as now we are not talking about an operation based on a URL.  It also may complicate access control since the resource being queried isn’t part of the URL.
>>>> 
>>>> I recall talking to Julian about if-match headers…I can’t remember the pros/cons we talked about. But I do think that the semantics shouldn’t change. I can see adding if-match conditions to a SEARCH command as being useful. I think overloading these might be more complexity.
>>>> 
>>>> I think this is a case that more clearly shows why “SEARCH” is useful. 
>>>> 
>>>> Phil
>>>> 
>>>> @independentid
>>>> www.independentid.com
>>>> phil.hunt@oracle.com
> 
> Social Web Architect
> http://bblfish.net/
>
Received on Sunday, 17 May 2015 15:47:22 UTC