Re: HTTP request validation guidelines for implementers from Rafal Pietrak on 2021-07-09 (ietf-http-wg@w3.org from July to September 2021)

From: Rafal Pietrak <cookie.rp@ztk-rp.eu>
Date: Fri, 9 Jul 2021 09:18:23 +0200
To: ietf-http-wg@w3.org
Message-ID: <de67ac60-b09b-3bec-e7b2-2bdbe94254ac@ztk-rp.eu>
Hi,

I'm also very new to the list, and on one occasion I was kindly
reminded that my "wisdom" is pointless to share.

Despite that, IMHO:
1. Form validation belongs at the application level, not the HTTP level.
A "form" is a payload of HTTP.
2. HTTP 4xx error codes are exactly that: HTTP-level error codes.
Mixing design levels is not a good thing.
3. Within the form, the decision of what is valid and what is not is
very application dependent. In particular, fields that go into an SQL
RDBMS will be validated against cooked SQL syntax. But should an
application use another database with a different query syntax,
form fields may require quite different validation.
4. Then, an application-level programmer may choose to gather all
possible inputs (for attack vector detection), or he/she may choose to
record only the "meaningful" ones. So what SHOULD or MUST be validated
is hard to determine for the general case.

Then again, should you have a clear view of what should be defined as a
standard SHOULD/MUST regarding field validation, you may put that in
writing as a draft proposal, for everybody to see how such
recommendations could fit their work.

With best regards,

-R

On 08.07.2021 at 20:18, João Penteado wrote:
> Greetings everyone,
> 
> This is my first time posting on an IETF mailing list, so I apologize in advance
> if this isn't the proper channel for this discussion.
> 
> Input validation is a critical security component of most information systems,
> and this is especially true for public services available over an untrusted
> network such as the Internet.
> 
> In the context of HTTP requests, there are several opportunities for early
> request validation, and many systems employ a dedicated initial validation layer
> either internally, as "request middleware", or externally in the form of layer 7
> firewalls, AKA Web Application Firewalls (WAFs), or reverse proxies.
> 
> It is usually in both the server's and the client's best interest that an
> invalid request be rejected as soon as possible and that meaningful error
> information be sent back to the client. For this very purpose, RFC7231 defines a
> whole class of HTTP error status codes (4XX).
> 
> However, when implementing an HTTP server, there is no standardization or
> consolidated guideline that I could find that addresses these two points:
> 
> 1. Which components of the request MUST and SHOULD be validated.
> 2. In which order SHOULD these components be validated.
> 
> There are various flowcharts[1][2] available online that propose different
> approaches to which components of the request should be validated first and
> which HTTP status should have a "higher priority" when a request has multiple
> errors. For instance, if both the body and the URI exceed the maximum size
> allowed by the server, and the client also exceeds its agreed rate limit, should
> the response code be 413 (Payload Too Large), 414 (URI Too Long) or 429
> (Too Many Requests)?
> 
> As I mentioned earlier, currently, as it is in the server's best interest to
> reject an invalid request as soon as possible, the "higher priority" status code
> usually is one whose error is found earlier in the validation chain
> implementation.
> 
> Should a standard or best practice recommendation for the points above be
> adopted, clients that receive a client error response code will be able to tell
> which checks were successful and servers will have a clear recommendation to
> validate certain aspects of the requests they receive, which is still
> unfortunately overlooked by many in the wild. It would also reduce inconsistent
> behavior among different implementations of the same service that may return
> different error codes for the same given request.
> 
> As someone who is currently implementing such a validator, I would like to start
> this discussion by proposing that the order of such an evaluation chain take into
> consideration the following factors, in priority order:
> 
> 1. The computational complexity of the validation check.
> 2. The specificity of the returned status code (i.e. checks that, when failed,
> return a generic "400 Bad Request" should be later in the validation chain).
> 3. The order in which the (sub)component appears in the request (start-line,
> headers and body). Null, size, length and boundary range checks should probably
> come first.
> 
> The same factors should also apply when evaluating subcomponents of the request,
> such as individual headers or body parts.
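
For illustration only, here is a minimal sketch of how the three ordering factors proposed above could be turned into a sort key for a validation chain. Everything in it is an assumption made for the example and not part of any RFC or of the original proposal: the `Request` and `Check` types, the size and rate limits, the numeric cost ranks, and the choice to treat the rate-limit check as slightly more expensive than a length check.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical limits, chosen only for this example.
MAX_URI_LENGTH = 2048
MAX_BODY_BYTES = 1_000_000
MAX_REQUESTS_PER_WINDOW = 100


@dataclass(frozen=True)
class Request:
    uri: str
    headers: Dict[str, str]
    body: bytes
    requests_in_window: int  # assumed to be supplied by some rate limiter


@dataclass(frozen=True)
class Check:
    name: str
    cost: int      # relative computational cost; lower runs first
    status: int    # status code returned when the check fails
    position: int  # 0 = start-line, 1 = headers, 2 = body
    passes: Callable[[Request], bool]


def body_is_json(req: Request) -> bool:
    """Return True for an empty body or a body that parses as JSON."""
    if not req.body:
        return True
    try:
        json.loads(req.body)
        return True
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return False


CHECKS: List[Check] = [
    Check("json-body", 3, 400, 2, body_is_json),
    Check("rate-limit", 2, 429, 0,
          lambda r: r.requests_in_window <= MAX_REQUESTS_PER_WINDOW),
    Check("body-size", 1, 413, 2, lambda r: len(r.body) <= MAX_BODY_BYTES),
    Check("uri-length", 1, 414, 0, lambda r: len(r.uri) <= MAX_URI_LENGTH),
]


def sort_key(check: Check):
    # Cost first, then specific statuses before the generic 400,
    # then the position of the component within the request.
    return (check.cost, check.status == 400, check.position)


def validate(req: Request, checks: List[Check] = CHECKS) -> Optional[int]:
    """Return the status of the first failing check, or None if all pass."""
    for check in sorted(checks, key=sort_key):
        if not check.passes(req):
            return check.status
    return None
```

Under these assumed ranks, a request that is over the URI limit, the body limit and the rate limit at the same time would be answered with 414, because the URI length check is as cheap as the body size check and the start-line comes first. A different cost assignment would give a different answer, which is exactly the inconsistency the proposal describes.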
> 
> Finally, additional security concerns, like the classic 404 vs 403 unintentional
> information disclosure dilemma, should also probably be addressed in such a
> document.
> 
> I'd love to get the input of this working group on the relevance of the topics I
> raised above.
> 
> Best regards,
> 
> João Penteado
> 
> 
> External links
> 
> [1] https://www.loggly.com/blog/http-status-code-diagram/
> [2] https://www.codetinkerer.com/2015/12/04/choosing-an-http-status-code.html
> 
> 

-- 
Rafał Pietrak
Received on Friday, 9 July 2021 08:11:17 UTC