Re: [whatwg/url] How should "everything after the scheme" URLs work? (#385)

I've been puzzling over my characterisation of WHATWG URL resolution, and this issue came to mind.
Some observations, in case they help.

Let's look at the properties of parsed/resolved URLs:

- File URLs have an authority (may be empty) and an absolute path (may be just `/` or just a drive letter).
- Other special URLs have a non-empty authority and an absolute path (may be just `/`).

These properties are natural consequences of the protocols. 

For non-special URLs, the parser/resolver uses the 'cannot-be-a-base-URL' flag to decide whether the URL can serve as a base URL. This amounts to the following:

- If a non-special URL has an authority, or a path that starts with `/`, then it can be used as a base URL.

So `javascript:foo` is not considered a base URL, but `javascript:/foo` and `javascript://` are. 
Note that resolving `foo` against `javascript:` throws an error, whereas resolving `foo` against `javascript:/` results in `javascript:/foo`.
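
As a rough sketch of that rule (Python, purely illustrative): the function names here are my own and not part of any spec or library, the list of special schemes is the one from the URL Standard, and scheme validation is skipped. `resolve_segment` only handles the path-only case discussed above, with no query or fragment on the base.

```python
from typing import Tuple

# Special schemes as listed in the URL Standard.
SPECIAL_SCHEMES = {"ftp", "file", "http", "https", "ws", "wss"}


def split_scheme(url: str) -> Tuple[str, str]:
    """Split 'scheme:rest' at the first colon; the scheme is lowercased.

    Simplification: the scheme's characters are not validated.
    """
    scheme, sep, rest = url.partition(":")
    if not sep or not scheme:
        raise ValueError(f"no scheme in {url!r}")
    return scheme.lower(), rest


def can_be_base(url: str) -> bool:
    """Special URLs are always bases. A non-special URL is a base only if
    what follows the scheme starts with '/', which covers both an
    authority ('//...') and an absolute path ('/...')."""
    scheme, rest = split_scheme(url)
    if scheme in SPECIAL_SCHEMES:
        return True
    return rest.startswith("/")


def resolve_segment(segment: str, base: str) -> str:
    """Resolve a plain path segment against a path-only, non-special base.

    Hypothetical helper: authorities and special schemes are out of scope.
    """
    if not can_be_base(base):
        raise ValueError(f"{base!r} cannot be used as a base URL")
    scheme, rest = split_scheme(base)
    if scheme in SPECIAL_SCHEMES or rest.startswith("//"):
        raise NotImplementedError("only path-only, non-special bases are sketched")
    # Drop the last path segment of the base, then append the new one.
    directory = rest[: rest.rfind("/") + 1]
    return f"{scheme}:{directory}{segment}"
```

With this, `can_be_base("javascript:foo")` is false while `can_be_base("javascript:/foo")` and `can_be_base("javascript://")` are true, and `resolve_segment("foo", "javascript:/")` gives `javascript:/foo`.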


I think it makes sense to define what is and is not a base URL based on the protocol alone.
The protocol would then select one of the following options:

1. An authority and an absolute path (file URLs)
2. A nonempty authority and an absolute path (other special URLs)
3. Either (a) an absolute path, or (b) an absolute path or an authority (some subset of current non-special URLs)
4. No authority and an opaque path (such as `javascript:` URLs)

That requires a hardcoded list of protocols and their associated URL 'type' (i.e. parsing/resolving behaviour), though.
It could also be useful to provide a way to manually register protocols to map to a certain parsing/resolving behaviour. 
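
To make that concrete, here is a minimal sketch of what such a table plus a registration hook might look like. Everything in it is an assumption for illustration: the `UrlType` names, the `register_scheme` / `url_type_for` functions, the choice to refuse overriding built-in entries, and returning `None` for unknown schemes (what to do with those is left open).

```python
from enum import Enum
from typing import Dict, Optional


class UrlType(Enum):
    """The options listed above, one member per parsing/resolving behaviour."""
    FILE = "1"                # authority (may be empty) and absolute path
    SPECIAL = "2"             # non-empty authority and absolute path
    PATH_ONLY = "3a"          # absolute path
    PATH_OR_AUTHORITY = "3b"  # absolute path or authority
    OPAQUE = "4"              # no authority, opaque path


# Hardcoded part of the table (illustrative, not exhaustive).
_BUILTIN: Dict[str, UrlType] = {
    "file": UrlType.FILE,
    "ftp": UrlType.SPECIAL,
    "http": UrlType.SPECIAL,
    "https": UrlType.SPECIAL,
    "ws": UrlType.SPECIAL,
    "wss": UrlType.SPECIAL,
    "javascript": UrlType.OPAQUE,
}

# Manually registered protocols live in a separate table.
_registered: Dict[str, UrlType] = {}


def register_scheme(scheme: str, url_type: UrlType) -> None:
    """Map a protocol to a behaviour; built-in entries cannot be overridden
    (a design choice of this sketch)."""
    key = scheme.lower()
    if key in _BUILTIN:
        raise ValueError(f"{key!r} has a fixed URL type")
    _registered[key] = url_type


def url_type_for(scheme: str) -> Optional[UrlType]:
    """Look up the behaviour for a protocol; None if it is unknown."""
    key = scheme.lower()
    if key in _BUILTIN:
        return _BUILTIN[key]
    return _registered.get(key)


def is_base_type(url_type: UrlType) -> bool:
    """Every type except the opaque-path one can act as a base URL."""
    return url_type is not UrlType.OPAQUE
```

For example, after `register_scheme("example-app", UrlType.PATH_OR_AUTHORITY)` (a made-up scheme), `url_type_for("example-app")` returns that type, and whether such a URL is a base URL follows from the protocol alone.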

Just some ideas. 

-- 
View it on GitHub:
https://github.com/whatwg/url/issues/385#issuecomment-870649874

Received on Tuesday, 29 June 2021 14:27:21 UTC