Re: Invalid Characters in URLs

On Sat, 21 Sept 2024 at 02:24, Carsten Bormann <cabo@tzi.org> wrote:

> On 2024-09-20, at 17:25, Daniel Stenberg <daniel@haxx.se> wrote:
> >
> > On Fri, 20 Sep 2024, Tim Bray wrote:
> >
> > So, when we write a parser today, do we want to parse the URLs that are
> in active use out there, or do we want to be a purist and tell the users
> they are wrong when they provide URLs that the browsers are fine with?
>
> Again, you can write a tool that happily accepts http:\\ “URLs” etc.
> But you can’t impose that lenience on other tools, and we are not obliged
> to spend the same amount of energy in our tools that a browser does on
> assigning interpretations to invalid URIs.


As a server developer, a situation that is happening with increasing
frequency is that we receive CVEs against our server because we
interpret URIs differently from the common browsers. We implement the
RFC as written, but the browsers mostly follow the WHATWG URL spec, so
there are differences, especially around invalid or deprecated aspects of
URIs (e.g. an authority component with user info). If a server interprets
a URI flexibly and differently from the browsers, then security
researchers ping you with potential security-bypass vulnerabilities.
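As a minimal sketch of the kind of divergence I mean, here is Python's
RFC 3986-style urllib.parse on two such inputs (the WHATWG behaviour noted
in the comments is what a browser does, not what this code produces):

```python
from urllib.parse import urlsplit

# Backslashes: RFC 3986 treats "\" as an ordinary character, so this
# URL has no authority at all and everything lands in the path.
# A WHATWG-following browser normalizes "\" to "/" for http(s) and
# would instead see "evil.com" as the host.
u = urlsplit("http:\\\\evil.com\\path")
print(u.netloc)  # empty: no authority under RFC-style parsing
print(u.path)    # the whole remainder, backslashes intact

# User info: deprecated by RFC 3986's security considerations but
# still parsed. A server that routes or filters on the raw authority
# may see something different from the host a browser acts on.
u2 = urlsplit("http://trusted.example@evil.example/")
print(u2.username)  # "trusted.example"
print(u2.hostname)  # "evil.example"
```

(The host names here are hypothetical; the point is only that the two
specs can disagree on which part of the string is the host.)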

We are thus forced either to ignore the RFC and adopt the WHATWG spec, or
to follow the RFC strictly and show no flexibility at all (400 Bad
Request for the slightest violation).

So I do not think leniency is a way out of this mess.

cheers

-- 
Greg Wilkins <gregw@webtide.com> CTO http://webtide.com

Received on Friday, 20 September 2024 23:00:53 UTC