Re: [whatwg/url] Addressing HTTP servers over Unix domain sockets (#577)

Hmm - reading back through this all:

@mnot
> What's different here is that unix domain sockets have a completely different authority ...

As I discussed above, my reading of RFC 3986 is that the Unix domain socket (UDS) must be "the 'port' subcomponent of authority of an Address Family AF_UNIX socket" within the meaning of that RFC.  Given that, I have no idea what you mean by "unix domain sockets have a completely different authority".  Please explain.  Are you arguing that the UDS is *not* an "authority"?  If so, how so?  And what do you mean by "have" an authority, rather than *being* an authority?  And "different" from what?

The simplest revision to RFC 3986, as I have suggested here, is to generalize the definition of "port" to include a UDS filesystem path, rather than restricting it to an AF_INET port number.  The port ":" delimiter would also be extended into a kind of "toggle" delimiter by adding an optional trailing ":", as in ":some/path:" or ":/some/path:".  That trailing ":" closes the socket path, and so *explicitly* precedes the "path" component that follows the "authority" component in the "hier-part" of the URI in RFC 3986.  Thus, revising the original:

```
 URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

 hier-part   = "//" authority path-abempty
             / path-absolute
             / path-rootless
             / path-empty
```
```
authority   = [ userinfo "@" ] host [ ":" port ]
```
```
port        = *DIGIT
```
to express the URI in a plainer and more explicit form:
```
 URI         = scheme ":" authority path [ "?" query ] [ "#" fragment ]
```
where "path" is *already* defined explicitly and separately in Section "3.3. Path":
```
      path          = path-abempty    ; begins with "/" or is empty
                    / path-absolute   ; begins with "/" but not "//"
                    / path-noscheme   ; begins with a non-colon segment
                    / path-rootless   ; begins with a segment
                    / path-empty      ; zero characters
...
```
and extend the description of the authority, changing only the prose and not the ABNF rule itself, to become instead:
```
The authority component is preceded by a double slash ("//") and is
terminated by the next question mark ("?") or number sign ("#")
character, or by the next slash ("/") character after any socketpath,
or by the end of the URI.

authority   = [ userinfo "@" ] host [ ":" port ]
```
and extend the definition of "port", keeping in mind that "path" is already defined in the RFC, to become instead:
```
port        = *DIGIT / socketpath
socketpath  =  path ":"
```
Note that this approach still allows a ":" in the "hier-part path", as distinct from the "hier-part authority", though it prohibits a ":" in the socketpath itself.  For example, in "https://example.com:socket/path:/some:path/with/a:colon", the second ":" in the authority terminates the socketpath "socket/path", and the path that follows is "/some:path/with/a:colon".  Of course, the same prohibition would apply to any delimiter that is used for the socketpath.  Still, a literal "::" would terminate any "hier-part authority" without actually specifying a port.
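To make the splitting rule concrete, here is a minimal Python sketch of how a parser might separate the authority from the path under the revised "port" rule. The names `Authority` and `split_authority` are hypothetical, the input is assumed to be the text that follows "scheme://", and userinfo and bracketed IPv6 literals are deliberately left out. The order in which the rules are tried is also an assumption, since the grammar above does not state one.

```python
import re
from typing import NamedTuple, Optional


class Authority(NamedTuple):
    host: str
    port: Optional[int]        # numeric port, if one was given
    socketpath: Optional[str]  # proposed UDS path, if one was given
    rest: str                  # whatever follows the authority


def split_authority(text: str) -> Authority:
    """Split the text that follows "scheme://" (hypothetical helper)."""
    # The host runs up to the first ":" or authority terminator.
    host = re.match(r"[^:/?#]*", text).group(0)
    tail = text[len(host):]
    if not tail.startswith(":"):
        return Authority(host, None, None, tail)
    tail = tail[1:]

    # Rule 1: one or more digits, then "/", "?", "#" or end of input.
    m = re.match(r"[0-9]+(?=[/?#]|\Z)", tail)
    if m:
        return Authority(host, int(m.group(0)), None, tail[m.end():])

    # Rule 2: anything else up to the next ":" is the socketpath, as long
    # as no "?" or "#" comes first.
    end = tail.find(":")
    if end != -1 and not re.search(r"[?#]", tail[:end]):
        return Authority(host, None, tail[:end], tail[end + 1:])

    # Rule 3: an empty port, which RFC 3986 already permits.
    if tail == "" or tail[0] in "/?#":
        return Authority(host, None, None, tail)

    raise ValueError(f"unterminated socketpath in {text!r}")


print(split_authority("example.com:socket/path:/some:path/with/a:colon"))
print(split_authority("example.com:/run/app.sock:/api"))
print(split_authority("example.com:8080/index"))
```

One consequence of trying rule 2 before rule 3: because RFC 3986 allows an empty port, a URI such as "//example.com:/a:b" is valid today with the path "/a:b", whereas this sketch would read "/a" as a socketpath. Any real revision would have to say which reading wins.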

Incidentally, from Section "3.3. Path":
```
   If a URI contains an authority component, then the path component
   must either be empty or begin with a slash ("/") character.  If a URI
   does not contain an authority component, then the path cannot begin
   with two slash characters ("//").  In addition, a URI reference
   (Section 4.1) may be a relative-path reference, in which case the
   first path segment cannot contain a colon (":") character.
```
I also note, following my rant about the "://", that the reference to 'a double slash ("//")' in RFC 3986 Section "3.2. Authority" must not be confused with Section "4.2. Relative Reference", which refers to a URI "hier-part path" *following* a URI "hier-part authority":
```
   A relative reference takes advantage of the hierarchical syntax
   (Section 1.2.3) to express a URI reference relative to the name space
   of another hierarchical URI.
...
   A relative reference that begins with two slash characters is termed
   a network-path reference; such references are rarely used.  A
   relative reference that begins with a single slash character is
   termed an absolute-path reference.  A relative reference that does
   not begin with a slash character is termed a relative-path reference.
```
This "relative reference" idea just seems to me to be a needless complication of the URI.  I don't know that I've ever seen anyone actually use a "relative reference".  And, while most browsers seem to accept a URL with the "scheme" and the "//" authority delimiter omitted, which is to say, with the "https://" prefix deleted, that is *not* an example of a "relative reference" as defined in the RFC.
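For concreteness, the three kinds of relative reference quoted above can be told apart mechanically. The sketch below uses Python's standard `urllib.parse.urlsplit`, and `classify` is a hypothetical helper name. It also shows that a generic splitter reads a schemeless "example.com/path" as a plain path with no authority, so treating that string as a host is browser address-bar behaviour rather than relative-reference handling.

```python
from urllib.parse import urlsplit


def classify(ref: str) -> str:
    """Label a reference using the terms of RFC 3986 Section 4.2."""
    if urlsplit(ref).scheme:
        return "URI with scheme"
    if ref.startswith("//"):
        return "network-path reference"
    if ref.startswith("/"):
        return "absolute-path reference"
    return "relative-path reference"


for ref in ("https://example.com/a", "//example.com/a", "/a/b", "example.com/a"):
    parts = urlsplit(ref)
    print(f"{ref!r}: {classify(ref)}, authority={parts.netloc!r}, path={parts.path!r}")
```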

It would be useful if people would argue whether this "socketpath as port" interpretation of the UDS, as outlined here, is or is not effective, "minimally invasive" to existing standards, and intuitive.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/577#issuecomment-2600336950

Message ID: <whatwg/url/issues/577/2600336950@github.com>

Received on Sunday, 19 January 2025 00:24:01 UTC