- From: Karl <notifications@github.com>
- Date: Mon, 30 Aug 2021 08:20:19 -0700
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/issues/634@github.com>
It is important that users understand the difference between hierarchical and non-hierarchical URLs, as many operations will fail if a URL is non-hierarchical. However, the term currently used in the spec, 'cannot-be-a-base', is not particularly intuitive. Consider trying to document something like the `hostname` setter:

> This operation fails if the URL cannot be a base

> This operation is only valid if the URL's cannot-be-a-base flag is false

It doesn't read well, and it raises some obvious questions: which URLs cannot be a base? Does this affect me? How can I know? Pointing at the slashes after the scheme delimiter isn't really helpful at a semantic level - that describes how the parser decides whether a URL is hierarchical, but doesn't help users predict whether they should expect to deal with non-hierarchical URLs in their applications.

Or something like a `pathComponents` view:

> cannot-be-a-base URLs do not have path components

> Only base URLs have path components

It's all quite awkward and opaque, IMO. However, if we rephrase these in terms of hierarchical/non-hierarchical URLs, it all becomes a lot clearer:

`hostname` setter:

> This operation fails if the URL is non-hierarchical

> Only hierarchical URLs may have hostnames

`pathComponents`:

> Non-hierarchical URLs do not have path components

> Only hierarchical URLs have path components

It also makes it a lot easier for libraries to expose the cannot-be-a-base property to users, and to explain what it means: a hierarchical URL has an authority component or a hierarchical path (i.e. a path beginning with a "/"), also known as a [`hier-part`](https://datatracker.ietf.org/doc/html/rfc3986#section-3) in RFC 3986 terminology. Suddenly it makes sense why the setters of authority components fail on a non-hierarchical URL. Personally, I think it's better when implementations don't invent their own concepts and are able to refer directly to ideas described in the standard.
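To make the distinction concrete, here is a minimal sketch of how a library might surface it on top of the JavaScript `URL` API. The helper name `isHierarchical` is hypothetical (it is not part of the URL Standard or of any library), and the check assumes a WHATWG-conformant global `URL`. It relies on the parser rule mentioned above: a hierarchical URL serializes with a "/" immediately after the scheme delimiter.

```typescript
// Hypothetical helper for illustration only; not defined by the URL Standard.
// Assumes a WHATWG-conformant global `URL` (browsers, Node.js, Deno).
function isHierarchical(url: URL): boolean {
  // `protocol` is the scheme plus the trailing ":", so slicing it off the
  // serialized URL leaves everything after the scheme delimiter.
  const afterScheme = url.href.slice(url.protocol.length);
  return afterScheme.startsWith("/");
}

const web = new URL("https://example.com/a/b");
const mail = new URL("mailto:karl@example.com");

console.log(isHierarchical(web));  // true
console.log(isHierarchical(mail)); // false

// In the JavaScript API, the "failure" of an authority setter on a
// non-hierarchical URL shows up as a silent no-op rather than an exception.
mail.hostname = "example.org";
console.log(mail.href); // mailto:karl@example.com
```

A library with its own URL type would more likely read the parser's flag directly; inspecting the serialization is only a workaround for APIs that do not expose it.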
So, since I think this is something implementations may want to expose to users, I'd like to incorporate this terminology into the standard. We can keep the term 'cannot-be-a-base' around and simply define it as a synonym for a non-hierarchical URL.

What do people think? This will likely involve a bunch of churn without changes to functionality, so I'd like to get some feedback before working on a PR.

--
View it on GitHub: https://github.com/whatwg/url/issues/634
Received on Monday, 30 August 2021 15:20:32 UTC