[whatwg/url] [Editorial] Replace the term 'cannot-be-a-base' with hierarchical/non-hierarchical (#634)

It is important that users understand the difference between hierarchical and non-hierarchical URLs, as many operations will fail if a URL is non-hierarchical.

However, the term currently used in the spec, 'cannot-be-a-base', is not particularly intuitive. Consider trying to document something like the `hostname` setter:

> This operation fails if the URL cannot be a base

> This operation is only valid if the URL's cannot-be-a-base flag is false

It doesn't read well, and it raises obvious questions: which URLs cannot be a base? Does this affect me? How can I know? Pointing to the slashes after the scheme delimiter isn't really helpful at a semantic level: that describes how the parser decides whether a URL is hierarchical, but it doesn't help users predict whether they should expect to deal with non-hierarchical URLs in their applications.

Or something like a `pathComponents` view:

> cannot-be-a-base URLs do not have path components

> Only base URLs have path components

It's all quite awkward and opaque, IMO.

However, if we rephrase these in terms of hierarchical/non-hierarchical URLs, it all becomes a lot clearer:

`hostname` setter:

> This operation fails if the URL is non-hierarchical

> Only hierarchical URLs may have hostnames

`pathComponents`:

> Non-hierarchical URLs do not have path components

> Only hierarchical URLs have path components
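To make that concrete, here is a rough sketch of how a library might surface these rules. Everything in it is hypothetical: `SketchURL`, `is_hierarchical`, `set_hostname` and `path_components` are names I made up for illustration, and raising an exception is just one way to model "fails" (the setters in the standard silently do nothing instead).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchURL:
    """Hypothetical URL record for illustration; not a real library type."""

    scheme: str
    path: str
    hostname: Optional[str] = None
    is_hierarchical: bool = True

    def set_hostname(self, new_hostname: str) -> None:
        """This operation fails if the URL is non-hierarchical."""
        if not self.is_hierarchical:
            raise ValueError("only hierarchical URLs may have hostnames")
        self.hostname = new_hostname

    @property
    def path_components(self) -> List[str]:
        """Non-hierarchical URLs do not have path components."""
        if not self.is_hierarchical:
            return []
        # Assumes a hierarchical path starts with "/", so drop the
        # empty string that precedes the first slash.
        return self.path.split("/")[1:]


web = SketchURL("https", "/a/b", hostname="example.com")
web.set_hostname("example.org")
print(web.hostname, web.path_components)

mail = SketchURL("mailto", "user@example.com", is_hierarchical=False)
print(mail.path_components)
try:
    mail.set_hostname("example.org")
except ValueError as error:
    print("hostname setter failed:", error)
```

The point is that the doc comments read naturally, and the error message explains itself without the user needing to know what a "base" is.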

It also makes it a lot easier for libraries to expose the cannot-be-a-base property to users, and explain what it means: a hierarchical URL has an authority component or hierarchical path (i.e. a path beginning with a "/"), also known as a [`hier-part`](https://datatracker.ietf.org/doc/html/rfc3986#section-3) in RFC 3986 terminology. Suddenly it makes sense why the setters of authority components fail on a non-hierarchical URL.
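As a rough illustration of that definition, the following Python sketch classifies a URL string by looking only at what follows the scheme delimiter. It is a deliberate simplification, not the parsing algorithm from the standard: `is_hierarchical` and `SPECIAL_SCHEMES` are names invented here, and the sketch assumes that special schemes (`http`, `file`, and so on) are always hierarchical while any other scheme is hierarchical only if the remainder starts with a "/".

```python
# Illustrative only: a simplified check, not the URL Standard's parser.
SPECIAL_SCHEMES = {"http", "https", "ws", "wss", "ftp", "file"}


def is_hierarchical(url: str) -> bool:
    """Return True if `url` has an authority or a path beginning with "/"."""
    scheme, delimiter, rest = url.partition(":")
    if not delimiter:
        raise ValueError("not an absolute URL: missing scheme delimiter")
    # Assumption: special schemes never produce a non-hierarchical URL.
    if scheme.lower() in SPECIAL_SCHEMES:
        return True
    # Otherwise the URL is hierarchical only if a "/" follows the scheme.
    return rest.startswith("/")


examples = [
    "https://example.com/a/b",
    "file:///tmp/x",
    "foo:/bar",
    "mailto:user@example.com",
    "data:text/plain,hi",
]
for example in examples:
    kind = "hierarchical" if is_hierarchical(example) else "non-hierarchical"
    print(f"{example}: {kind}")
```

The first three examples are classified as hierarchical, while the `mailto:` and `data:` URLs are non-hierarchical, which matches the intuition that they have no host or path segments to manipulate.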

Personally, I think it's better when implementations don't invent their own concepts and are able to refer directly to ideas described in the standard. So, since I think this is something implementations may want to expose to users, I'd like to incorporate this terminology into the standard. We can keep the term 'cannot-be-a-base' around and simply define it as a synonym for a non-hierarchical URL.

What do people think? This will likely involve a bunch of churn without changes to functionality, so I'd like to get some feedback before working on a PR.


Received on Monday, 30 August 2021 15:20:32 UTC