Re: HTTP request validation guidelines for implementers

On Friday, July 9th, 2021 at 04:18, Rafal Pietrak <cookie.rp@ztk-rp.eu> wrote:
> Hi,
>
> I'm also very-very new to the list, and on one occasion I was kindly
> reminded, that my "wisdom" is pointless to share.
>
> Despite that, IMHO:
>
> 1.  form validation belongs to an application level, not HTTP level. A
>     "form" is a payload of HTTP.
> 2.  HTTP error codes 4xx are exactly that - HTTP level error codes.
>     Mixing design levels is not a good thing.
> 3.  within the form, the decision of what is valid and what is not is
>     very application dependent. In particular, fields that go into the SQL
>     RDBMS will be validated against cooked SQL syntax. But should an
>     application use another database with different query syntax,
>     form-fields may require quite different validation.
> 4.  then, an application level programmer may choose to gather all
>     possible inputs (for attack vector detection) or he/she may choose to
>     file just those "meaningful". So recommending what SHOULD or what MUST
>     be validated can be hard to determine for general purpose.
>
> Then again. Should you have a clear view of that should be defined as
> standard SHOULD/MUST regarding field validation, you may put that in
> writing as "draft-proposal", for everybody to see how such
> recommendations could fit their work.
>
> With best regards,
>
> -R
>
> Rafał Pietrak

I mostly agree with you regarding form validation, but for the most part I am
not referring to form key and value validation here. I am talking about
validating HTTP request (sub)components like:

0. Connection parameters
- Remote address
- TLS version, cipher suite, etc.

1. Start-line
- Method
- Path
- Protocol major and minor version

2. Headers
- Header key
- Header value

3. Body

4. Trailers
- Trailer key
- Trailer value

For each request (sub)component you can perform _at least_ the following
validation checks:
- Nullity checks: presence or absence of certain request (sub)components
- Numeric comparison checks: equality and range checks on the count and
length/size of request (sub)components.
- String comparison checks: equality, prefix, suffix, substring, character
boundary, regex, etc.
- Rate-limiting checks (429)
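
To make these check kinds concrete, here is a minimal sketch with one small
helper per kind. Everything in it is an assumption made for illustration: the
helper names, the 64-character limit, and the sliding-window limiter are
hypothetical and not taken from any spec or server. The only borrowed detail is
the character set in TOKEN, which follows the "token" rule from RFC 7230.

```python
import re
from typing import Dict, List, Optional

# Hypothetical helpers, one per kind of check; names and limits are illustrative.

def is_present(value: Optional[str]) -> bool:
    """Nullity check: the (sub)component exists."""
    return value is not None

def in_range(n: int, lo: int, hi: int) -> bool:
    """Numeric comparison check on a count or a length/size."""
    return lo <= n <= hi

def matches(value: str, pattern: str) -> bool:
    """String comparison check using a full regex match."""
    return re.fullmatch(pattern, value) is not None

class SlidingWindowLimiter:
    """Rate-limiting check: at most `limit` requests per key per window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self.hits: Dict[str, List[float]] = {}

    def allow(self, key: str, now: float) -> bool:
        # Keep only the timestamps that still fall inside the window.
        recent = [t for t in self.hits.get(key, []) if now - t < self.window]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self.hits[key] = recent
        return allowed

# Characters allowed in a header field name ("token" in RFC 7230).
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"

def header_key_ok(key: Optional[str]) -> bool:
    """Combine the three stateless checks on a header key (assumed 64-char limit)."""
    return is_present(key) and in_range(len(key), 1, 64) and matches(key, TOKEN)
```

A server would answer a failed `allow()` call with 429; which status the other
failures map to is exactly the kind of thing I'd like to see recommended.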

For instance, today a well-configured web server may check:
- TLS settings (e.g. ensure you're using at least TLS 1.2)
- Request rate of your IP and/or the current method+path
- Request URI length
- Payload size
- If the current method is supported at all in any path of the service
(before request routing).
- If the path exists
- If the Host header is unique and matches the host itself
- If you're using trailers which the server does not support
- If you're authenticated and authorized for this action
- ...

Today, the problem is that there's no clear recommendation on which validation
checks MUST/SHOULD be made on which request (sub)components, or in what order
those checks SHOULD be made (e.g. if both your body size and request URI length
are larger than what the server is willing to process, which of the two should
the server check first, returning the corresponding HTTP error status code?).
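
A small sketch of that ambiguity, assuming a hypothetical Request type, made-up
limits, and two illustrative check functions (none of this comes from a spec or
a real server). The same oversized request yields a different status code
depending only on the order in which the checks run:

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Request:
    method: str
    target: str
    body: bytes

# Assumed limits, chosen only to keep the example small.
MAX_TARGET_LEN = 16
MAX_BODY_LEN = 8

# A check returns an HTTP status code on failure, or None if it passes.
Check = Callable[[Request], Optional[int]]

def check_target_length(req: Request) -> Optional[int]:
    return 414 if len(req.target) > MAX_TARGET_LEN else None

def check_body_size(req: Request) -> Optional[int]:
    return 413 if len(req.body) > MAX_BODY_LEN else None

def validate(req: Request, checks: List[Check]) -> Optional[int]:
    """Run the checks in order and return the first failing status, if any."""
    for check in checks:
        status = check(req)
        if status is not None:
            return status
    return None

# A request that violates both limits at once.
req = Request("POST", "/" + "a" * 32, b"x" * 64)

uri_first = validate(req, [check_target_length, check_body_size])   # 414
body_first = validate(req, [check_body_size, check_target_length])  # 413
```

Both answers are defensible today, which is the point: two conforming servers
can disagree on the response to an identical request.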


On Friday, July 9th, 2021 at 07:42, Julian Reschke <julian.reschke@gmx.de> wrote:

> On 08.07.2021 at 20:18, João Penteado wrote:
>
> I agree that (1) is an interesting question, and in a perfect world the
> spec will state that. If it does not, we may want to improve it.
>
> With respect to (2) I'm very sceptical: ordering of validation steps
> often depends on implementation details. Why does it matter in practice?
> (I understand that it would help in comparing validators, but besides
> that???)
>
> Best regards, Julian

There are two main benefits to recommending a validation step order in the spec:

1. With appropriate research and testing of each type of check, we could
determine the optimal check order, both performance- and security-wise, making
the implementer's job much easier.

2. If most servers out there adopt the same validation order, clients will
gain information that was unavailable before. If, for instance, every server
checks URI length before checking payload size, and I get a "413 Request Entity
Too Large" error, I would know for sure that my URI length is fine and all the
previous checks passed successfully.
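
As a rough sketch of what a client could do with that guarantee: the order
below is purely hypothetical (no such order is standardized, and the check
names are made up), but if servers did agree on one, the status code alone
would tell the client which earlier checks passed.

```python
from typing import List, Tuple

# Hypothetical agreed-upon order of checks and the status each one returns.
ASSUMED_ORDER: List[Tuple[str, int]] = [
    ("request rate", 429),
    ("URI length", 414),
    ("payload size", 413),
]

def checks_known_to_pass(status: int) -> List[str]:
    """List the checks that must have passed before `status` was returned."""
    passed: List[str] = []
    for name, code in ASSUMED_ORDER:
        if code == status:
            return passed
        passed.append(name)
    raise ValueError(f"status {status} is not part of the assumed order")

# A 413 implies the rate and URI length checks already succeeded.
known = checks_known_to_pass(413)  # ["request rate", "URI length"]
```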

Best regards,

João Penteado

Received on Friday, 9 July 2021 17:52:18 UTC