- From: sleevi <notifications@github.com>
- Date: Wed, 04 Jan 2017 11:24:12 -0800
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/issues/97/270461963@github.com>
@annevk File URLs and ports/authority have a storied history in Chrome. Attached are my notes from the last time I dug into this, to at least explain the current behaviour so we can figure out where to align:
### Can "file" have a host?
RFC 3986 Section 3.2.2 notes (in passing) that
> For example, the "file" URI
scheme is defined so that no authority, an empty host, and
"localhost" all mean the end-user's machine, whereas the "http"
scheme considers a missing authority or empty host invalid.
If we dig back to RFC 1630, Page 18:
> There is clearly a danger of confusion that a link made to a local
file should be followed by someone on a different system, with
unexpected and possibly harmful results. Therefore, the convention
is that even a "file" URL is provided with a host part. This allows
a client on another system to know that it cannot access the file
system, or perhaps to use some other local mecahnism to access the
file.
and
> A void host field is equivalent to "localhost".
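As a rough illustration of the equivalence those two RFCs describe, here is a minimal Python sketch. The function name `refers_to_local_machine` is hypothetical, and it leans on `urllib.parse.urlsplit` rather than on any browser's parser, so treat it as an approximation of the rule, not of Chrome's implementation.

```python
from urllib.parse import urlsplit


def refers_to_local_machine(url: str) -> bool:
    """Return True if a file URL names the end-user's own machine.

    Per RFC 3986 Section 3.2.2, no authority, an empty host, and
    "localhost" are all treated as equivalent for the "file" scheme.
    """
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file URL: %r" % url)
    # urlsplit reports hostname as None when the authority is missing
    # or empty, and lowercases it otherwise.
    host = parts.hostname
    return host is None or host == "localhost"
```

Under these assumptions, `file:/etc/hosts`, `file:///etc/hosts` and `file://localhost/etc/hosts` are all treated as local, while `file://example.com/etc/hosts` is not.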
### Can "file" have a port?
RFC 1738, Section 3.10, which updates RFC 1630 (and became the basis for RFC 3986), describes the file scheme as:
> A file URL takes the form:
file://<host>/<path>
Unlike other schemes (such as prospero or wais), which explicitly list the `:<port>` construction in their ABNF, file:// lacks this.
So, on to what Chrome's behaviour actually is:
- When canonicalizing a `file://` URL, our canonicalizer constructs it in the form `file://<host>/<path>?<query>#<ref>` ( https://cs.chromium.org/chromium/src/url/url_canon_fileurl.cc?rcl=0&l=88 ), meaning the port (and its colon) is always omitted when reserializing a file URL. If the host is empty, the authority component is empty too, resulting in the expected `file:///` triple-slash.
- When parsing a `file://` URL, setting aside the 'windows' special logic (and the UNC path logic), our parser always ignores the port ( https://cs.chromium.org/chromium/src/url/url_parse_file.cc?rcl=1483532378&l=45 )
- `file:` schemed URLs always end up with `PORT_UNSPECIFIED` as the effective port, meaning a port should never end up serialized
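The three points above can be approximated in a few lines of Python. This is only a sketch of the described behaviour, not a port of Chrome's canonicalizer: `canonicalize_file_url` is a hypothetical name, parsing is delegated to `urllib.parse.urlsplit`, and the Windows drive-letter and UNC special cases are deliberately left out.

```python
from urllib.parse import urlsplit


def canonicalize_file_url(url: str) -> str:
    """Reserialize a file URL as file://<host>/<path>?<query>#<ref>.

    Any ":<port>" in the authority is ignored and never written back.
    """
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file URL: %r" % url)
    # .hostname strips the ":<port>" suffix (and lowercases the host);
    # it is None for a missing or empty authority.
    host = parts.hostname or ""
    path = parts.path or "/"
    result = "file://" + host + path
    if parts.query:
        result += "?" + parts.query
    if parts.fragment:
        result += "#" + parts.fragment
    return result
```

For example, `canonicalize_file_url("file://host:8080/a/b?q#r")` gives `file://host/a/b?q#r`, and an empty host keeps the triple-slash form, as in `file:///tmp/x`.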
To your question about what's the right behaviour: I suspect failing on `:` would probably be ideal, but I wouldn't be in a position to change it anytime soon, simply because I don't have the time to own any fallout/regressions that it might cause (however unlikely). It may be that one of my colleagues can own this, if it's believed to be important for compat. Not allowing a port is definitely a good thing (... and seems like it'd require no work on Chrome's side, since that's what we do).
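For comparison, the stricter "fail on `:`" option could look roughly like the following Python sketch. It is an assumption about what such a check might be, not anything Chrome or the URL Standard does today; `file_host_strict` is a hypothetical name, and Windows drive letters in the authority position (e.g. `file://C:/foo`) would be rejected by it, which is exactly the kind of fallout mentioned above.

```python
from urllib.parse import urlsplit


def file_host_strict(url: str) -> str:
    """Return the host of a file URL, rejecting any ':' in the authority."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file URL: %r" % url)
    netloc = parts.netloc
    # Colons inside a bracketed IPv6 literal are part of the host, so
    # only inspect what follows the closing bracket in that case.
    tail = netloc.rpartition("]")[2] if netloc.startswith("[") else netloc
    if ":" in tail:
        raise ValueError("file URLs must not carry a port: %r" % url)
    return netloc
```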
Allowing ports on file URLs seems to have the largest back-compat issues, at least re: spec precedent - it's seemingly long been forbidden - and it's also something probably unlikely for Chrome, if only because that would require a lot more monkey-ing about with the UNC & drive-letter sniffing logic.
Received on Wednesday, 4 January 2017 19:24:47 UTC