- From: sleevi <notifications@github.com>
- Date: Wed, 04 Jan 2017 11:24:12 -0800
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/issues/97/270461963@github.com>
@annevk File URLs & ports/authority have a storied history in Chrome... attached are my notes from the last time I dug into this, to at least hopefully explain the behaviour to figure out where to align:

### Can "file" have a host?

RFC 3986 Section 3.2.2 notes (in passing) that

> For example, the "file" URI scheme is defined so that no authority, an empty host, and "localhost" all mean the end-user's machine, whereas the "http" scheme considers a missing authority or empty host invalid.

If we dig back to RFC 1630, Page 18:

> There is clearly a danger of confusion that a link made to a local file should be followed by someone on a different system, with unexpected and possibly harmful results. Therefore, the convention is that even a "file" URL is provided with a host part. This allows a client on another system to know that it cannot access the file system, or perhaps to use some other local mecahnism to access the file.

and

> A void host field is equivalent to "localhost".

### Can "file" have a port?

RFC 1738, Section 3.10, which updates RFC 1630 (and became the basis for RFC 3986) notes the file scheme as:

> A file URL takes the form: file://<host>/<path>

Unlike other schemes (such as prospero or wais), which explicitly list the `:<port>` construction in their ABNF, file:// lacks this.
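To make the two rules above concrete, here is a minimal Python sketch of a parser for the RFC 1738 `file://<host>/<path>` form. It treats an empty host and "localhost" as the local machine and rejects anything that looks like a port. The function name `parse_file_url` and the regular expression are hypothetical, written only for illustration; this is not Chrome's parser, and it deliberately ignores drive letters, UNC paths, IPv6 literals, and the authority-less `file:/path` form.

```python
import re

# Hypothetical illustration of the RFC 1738 form file://<host>/<path>.
# Simplification: IPv6 literals, drive letters and UNC paths are not handled.
_FILE_URL = re.compile(
    r"^file://(?P<host>[^/:]*)(?P<port>:[^/]*)?(?P<path>/.*)$",
    re.IGNORECASE,
)


def parse_file_url(url):
    """Return (host, path, is_local) for a file URL, rejecting any port."""
    match = _FILE_URL.match(url)
    if match is None:
        raise ValueError("not of the form file://<host>/<path>: %r" % url)
    if match.group("port") is not None:
        # The RFC 1738 grammar for "file" has no :<port> production.
        raise ValueError("file URLs do not take a port: %r" % url)
    host = match.group("host").lower()
    # RFC 1630: a void host field is equivalent to "localhost".
    is_local = host in ("", "localhost")
    return host, match.group("path"), is_local


if __name__ == "__main__":
    for candidate in ("file:///etc/hosts",
                      "file://localhost/etc/hosts",
                      "file://example.com/share/a.txt"):
        print(candidate, "->", parse_file_url(candidate))
```

Failing hard on `:` is only one possible policy; a parser could instead accept and discard the port.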
So, as to what Chrome's behaviour is:

- When canonicalizing a `file://` URL, our canonicalizer constructs it in the form of `file://<host>/<path>?<query>#<ref>` ( https://cs.chromium.org/chromium/src/url/url_canon_fileurl.cc?rcl=0&l=88 ), meaning ports (and the colon) are always omitted when reserializing a file URL (and if the host is empty, it emits an empty authority component, resulting in the expected `file:///` triple-slash)
- When parsing a `file://` URL, setting aside the 'windows' special logic (and the UNC path logic), our parser always ignores the port ( https://cs.chromium.org/chromium/src/url/url_parse_file.cc?rcl=1483532378&l=45 )
- `file:` schemed URLs always result in a PORT_UNSPECIFIED for the effective port, meaning it should not end up serialized

To your question about what's the right behaviour: I suspect failing on `:` would probably be ideal, but I wouldn't be in a place to change it anytime soon, simply because I don't have the time to own any fallout/regressions that it might cause (however unlikely). It might be that one of my colleagues can own this, if it's believed to be important for compat.

Not allowing a port is definitely a good thing (... and seems like it'd require no work on Chrome's side, since that's what we do).

Allowing ports on file URLs seems to have the largest back-compat issues, at least re: spec precedent (it's seemingly long been forbidden), and it's also something probably unlikely for Chrome, if only because that would require a lot more monkey-ing about with the UNC & drive-letter sniffing logic.

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/97#issuecomment-270461963
Received on Wednesday, 4 January 2017 19:24:47 UTC