Re: [whatwg/url] Review slashes after file:/// and file path normalization (#405)

This turns out to be a very complicated problem. 

The only viable approach I've been able to come up with is to not ignore multiple slashes in file URLs. At least this agrees with Safari, Firefox and the RFCs, and it prevents messy interactions with #302 that break idempotency. (It even agrees with IE/Edge in some cases.)

Then for #302 I would propose to preserve the host in file URLs (also in the presence of drive letters) except for localhost, which would be replaced with the empty string. That then disagrees with Firefox but is more in line with the others (considering just the host). 
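To make the two proposed rules concrete, here is a minimal sketch. It is not the state machine from the standard, and `serialise_file_url` is a hypothetical helper that assumes the host and path have already been separated. It also assumes that "localhost" is matched case-insensitively, which I have not checked against the spec text.

```python
def serialise_file_url(host: str, path: str) -> str:
    """Hypothetical helper illustrating the proposal, not the WHATWG algorithm.

    Assumes `host` and `path` were already split out by a parser,
    and that `path` starts with a slash.
    """
    # Proposed rule for #302: keep the host, even when the path starts
    # with a drive letter, except that localhost becomes the empty host.
    # Assumption: the comparison is case-insensitive.
    if host.lower() == "localhost":
        host = ""
    # Proposed rule for this issue: repeated slashes in the path are
    # significant, so the path is emitted unchanged.
    return "file://" + host + path


print(serialise_file_url("localhost", "/C:/dir/file"))
print(serialise_file_url("server", "/C:/dir//file"))
```

With these rules a host other than localhost is never dropped and no slash is ever removed, so this step on its own has nothing left to change on a second pass.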

Together, these changes will cause quite a bit of disagreement with current browser behaviour, but I don't see any better alternative.

Is this clear enough? I can have a look at the state machine to see where to make the changes, but I'd like to be sure that there's some consensus about the approach first. 

* * *

Somewhat off-topic, and I'm hesitant to mention this, but I decided to go ahead because it could be useful for the standardisation process. 

I have tried to build a URL library that is compatible with the standard, but has separate parsing, resolving and normalisation phases. It uses a simple 'theory' of URLs, modelling them as a sequence of tokens. These token sequences are similar to, but different from, the parse trees as per the RFCs. I wrote it down in the Theory section of the [README][1].

It can be useful to describe the different browser behaviours as operations on such token lists. In general it could also help with points 4 and 5 that @domenic mentions above, amongst other things. Just in case.

[1]: https://github.com/alwinb/reurl




Received on Friday, 18 September 2020 10:21:56 UTC