Re: draft-montenegro-httpbis-uri-encoding

On 2014-03-21 11:55, Nicolas Mailhot wrote:
>
> On Fri, 21 March 2014 09:13, Julian Reschke wrote:
>> On 2014-03-21 08:59, Nicolas Mailhot wrote:
>>> ...
>>>> My concerns are the same as when this was presented first: how does
>>>> this help?
>>>>
>>>> I hear that it makes security checks more reliable, but then, you can't
>>>> rely on the header field being accurate
>>>
>>> There is a difference between working in heuristics mode all the time
>>> with crossed fingers and rabbit legs and working in deterministic mode
>>> with simple error handling (and error handling can be abort when what
>>> the other node declares and what you receive are different – much more
>>> secure than generalized guesswork)
>>
>> So how *exactly* does the header field help you in deciding whether to
>> be in heuristics mode or not?
>
> If I know the encoding is supposed to be UTF-8 I can fail anything that
> does not pass the usual UTF-8 sanity rules instead of iterating through
> encodings hoping I find the right one before hitting a security bug

So what do you do once you actually get requests that claim to use UTF-8 
but don't? Investigate what's wrong?

> If I know my app is open to European users only I can block user agents
> that try to use a CJK encoding before they manage to hit a CJK processing
> bug

I don't think anything CJK-related is on the table.

> I don't have to log random ascii-ified percentage garbage in the hope that
> the day it will need to be analysed and interpreted I'll have enough human
> beings to review it line by line and assign the encoding necessary to
> interpret it line by line.

That seems to be the same use case as #1.

Why don't you just try to UTF-8 decode, and if that works, assume that 
it indeed is UTF-8?
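
To make that concrete, here is a rough sketch of what I mean, in Python. 
The function name decode_if_utf8 is made up for illustration, and I am 
assuming the input is the percent-encoded path or query string as received 
on the wire. It is only an illustration of the idea, not a proposal for 
how any particular server should do it.

```python
from urllib.parse import unquote_to_bytes


def decode_if_utf8(percent_encoded):
    """Return the decoded text if the octets are valid UTF-8, else None.

    Hypothetical helper: percent-decode to raw octets first, then apply
    a strict UTF-8 decode. No fallback encodings are tried.
    """
    raw = unquote_to_bytes(percent_encoded)
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # Not valid UTF-8: the caller decides whether to reject or log.
        return None


if __name__ == "__main__":
    print(decode_if_utf8("/caf%C3%A9"))  # octets C3 A9 form valid UTF-8
    print(decode_if_utf8("/caf%E9"))     # lone E9 is not valid UTF-8
```

A strict decode accepts or rejects the octets without needing any 
declaration from the sender, which is why I am asking what the header 
field adds on top of it.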

Best regards, Julian

Received on Friday, 21 March 2014 11:02:24 UTC