Re: [whatwg/url] Addressing HTTP servers over Unix domain sockets (#577)

@robin-aws
> At the same time, extending it to only support socket paths as well (or at least a path-like value) feels like it fails to prevent future instances of the same problem for other host schemes.

Please provide or contrive an example.

Also, please clarify - the words "host" and "scheme" are specific and distinct terms-of-art as used in the RFC 3986 definition of a "URI", where "host" is a component of the "authority".  By "host scheme", did you mean a "host", or a "scheme" - or something else?

You can find a list of registered "schemes" under "Uniform Resource Identifier (URI) Schemes" at: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml

Many of the "schemes" registered there, as seen in their description templates, define no "authority" component and consist of only a "scheme" and a "path".  Without an "authority", there can be no "port".  Even an AF_UNIX URI for the local machine would have to specify a "host" component, for instance the literal "localhost", because the "host" is not optional once the URI has an "authority".  But then, either a "scheme" needs an "authority" - or it doesn't.

Nonetheless, any RFC 3986 defined URI is going to use the exact same definition of "authority", as defined in the RFC.  The "authority" does not change with - is not a function of - the type of "scheme" selected.  These are two distinct things, the "scheme" and the "authority", within the RFC.
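
To illustrate that separation, here is a minimal Python sketch that splits a URI with the generic regular expression given in RFC 3986, Appendix B.  The helper name `split_uri` and the sample URIs are assumptions made for this example only; the point is that the same pattern isolates the "authority" whatever the "scheme" happens to be.

```python
import re

# Generic URI-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_uri(uri):
    """Return the scheme, authority and path of a URI (hypothetical helper)."""
    m = URI_RE.match(uri)
    # Group 4 is None when the URI has no "//authority" part at all.
    return {"scheme": m.group(2), "authority": m.group(4), "path": m.group(5)}

# Sample URIs chosen for illustration only.
for uri in ("http://localhost:8080/index.html", "mailto:user@example.com"):
    print(split_uri(uri))
```

The `http` URI yields an "authority" of `localhost:8080`, while the `mailto` URI yields none at all: only a "scheme" and a "path".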

I have not even proposed changing the definition of the "authority".  I have only proposed extending the definition of the optional "port" component of the "authority", to allow for an AF_UNIX "path", as distinct from the "url-path" or "urlpath", as it was termed originally in RFC 1738:
```
url-path
        The rest of the locator consists of data specific to the
        scheme, and is known as the "url-path". It supplies the
        details of how the specified resource can be accessed. Note
        that the "/" between the host (or port) and the url-path is
        NOT part of the url-path.

   The url-path syntax depends on the scheme being used, as does the
   manner in which it is interpreted.
```
@randomstuff 
> Something like name.system.uds.localhost and name.username.users.uds.localhost would work nicely...

Hmm - rhetorically, how does a hostname of the form you suggest, with "localhost" as its top-level domain, get routed around the Internet?  I do not understand what you are describing there.  Can you be more specific?

You could use the "userinfo" component of the "authority", preceding the "@" delimiter, to customize the effect of the URI.  From the RFC, Section "3.2.1.  User Information":
```
   The userinfo subcomponent may consist of a user name and, optionally,
   scheme-specific information about how to gain authorization to access
   the resource.  The user information, if present, is followed by a
   commercial at-sign ("@") that delimits it from the host.
```
Did you mean something like that?
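
As a rough sketch of what that would look like to a parser, Python's standard `urllib.parse.urlsplit` already separates the "userinfo" from the "host" and "port".  The URI below, with `name` in the "userinfo" position, is a made-up example and not a proposal for any concrete syntax.

```python
from urllib.parse import urlsplit

# Hypothetical URI: "name" sits in the userinfo slot, before the "@" delimiter.
parts = urlsplit("http://name@localhost:8080/index.html")

print(parts.username)  # userinfo (user name part)
print(parts.hostname)  # host
print(parts.port)      # port, parsed as an integer
print(parts.path)      # path
```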

@yrro
> For comparison, OpenLDAP uses a different scheme for ldap-over-UNIX-sockets: ldapi://%2Fusr%2Flocal%2Fvar%2Fldapi connects to the socket at /usr/local/var/ldapi.

In terms of RFC 3986, that format also places the UDS socketpath into the URI "authority".  But, where is the "host" component?  The URI "host" is *required* by the RFC when any "authority" is given.  So this OpenLDAP ldapi URI scheme does not comply with RFC 3986, and neither is it registered with the IANA, as is.
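
A quick way to see where the socketpath lands is to run the quoted URI through a generic splitter.  This is only a sketch using Python's `urllib.parse`; other parsers may treat an unregistered scheme differently.

```python
from urllib.parse import urlsplit, unquote

# The OpenLDAP example quoted above.
parts = urlsplit("ldapi://%2Fusr%2Flocal%2Fvar%2Fldapi")

authority = parts.netloc          # everything between "//" and the next "/"
socket_path = unquote(authority)  # percent-decoding recovers the filesystem path

print(parts.scheme)        # the scheme
print(repr(authority))     # the percent-encoded socketpath, as the authority
print(repr(parts.path))    # the URI path, which is empty here
print(socket_path)         # the decoded socketpath
```

The entire percent-encoded socketpath comes back as the "authority", and the URI "path" is empty.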

But, as above, "@ host" could be appended to that socketpath to make it RFC compliant.  The URI "userinfo" would then be consumed by the socketpath, however.  Alternatively, the socketpath could be carried as the "password" following the "user", a format which the RFC deprecates anyway.  But, strictly speaking, the RFC says:
```
   ... Applications should not render as clear text any data
   after the first colon (":") character found within a userinfo
   subcomponent unless the data after the colon is the empty string
   (indicating no password).
```
So, the user could not visually discover or verify the actual UDS socketpath in an RFC-compliant application using "socketpath as password".
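
To make the consequence concrete, here is one possible reading of that rendering rule as a Python sketch.  The function name `render_userinfo`, the `***` mask, and the socketpath `/run/app.sock` are all assumptions for illustration; the RFC does not prescribe how the hidden data should be displayed.

```python
def render_userinfo(userinfo):
    """Hide any non-empty data after the first ":" in a userinfo string.

    Hypothetical helper: one way an application might follow the
    RFC 3986 advice quoted above.
    """
    name, sep, secret = userinfo.partition(":")
    if sep and secret:
        # Non-empty data after the first colon is not shown as clear text.
        return name + ":***"
    # No colon, or an empty string after it: nothing to hide.
    return userinfo

# Hypothetical "socketpath as password" userinfo, percent-encoded.
print(render_userinfo("user:%2Frun%2Fapp.sock"))
print(render_userinfo("user:"))
print(render_userinfo("user"))
```

In the first case the socketpath is masked, so the user never sees it.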

@annevk
> We can't change the URL parser for each new use case that comes along. As Robin suggests above (and mnot explained earlier) there's plenty of room to innovate within the constraints of the existing syntax.

"Plenty of room to innovate" here seems like "code words" for "stuffing a square peg into a round hole."  And, please explain how "room to innovate" is something that does *not* also mean "change the URL parser"?

The classic method used to make an easy problem seem difficult and complicated is to have more parameters than abstract variables.  This then becomes a game of "musical chairs", and something always gets left-out, no matter how the parts are rearranged.  That's a "fool's errand".


-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/577#issuecomment-2605514357

Message ID: <whatwg/url/issues/577/2605514357@github.com>

Received on Tuesday, 21 January 2025 18:59:30 UTC