- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Sun, 28 Oct 2012 13:51:38 -0400
- To: Anne van Kesteren <annevk@annevk.nl>
- Cc: whatwg@lists.whatwg.org
On 10/27/12 3:35 PM, Anne van Kesteren wrote:
> This is covered as we do this for all URLs currently with a "relative
> scheme" (http/ws/...). I know you indicated this as potentially
> problematic

Let's have that fight separately. ;)

>> 2) file:// URIs are parsed as a "no authority" URL in Gecko. Quoting the
>> IDL comment: ...
> The parser in the specification should handle these in the same way.

Same as the comment I quoted? Or the same as something else?

> I have not introduced a "no authority" concept however. The parser in
> the specification also preserves the host as other user agents seem to
> preserve it.

Well, the Gecko parser preserves the host at this stage, assuming the URI was correctly formatted with a host. Again:

  blah://foo/bar => blah://foo/bar

The interesting things happen when you have 0, 1, or 3 slashes between ':' and "foo". The handling of "foo" after this point is a separate issue.

>> 4) For "no authority" URLs, including file://, on Windows and OS/2 only, if
>> what looks like authority section looks like a drive letter, it's treated as
>> part of the path. For example, "file://c:/" is treated as the filename
>> "c:\". "Looks like a drive letter" is defined as "ASCII letter (any case),
>> followed by a ':' or '|' and then followed by end of string or '/' or '\\'".
>> I'm not sure why this is checking for '\\' again, honestly. ;)
>
> Is this part of URL parsing or part of doing something with the
> resulting URL?

In Gecko, it's part of URL parsing. More precisely, it's part of the normalization performed as part of constructing a "URL" object from a string. Since this is also how we parse URLs, it's effectively all part of the package.

But note that it would be a bit odd if file://c:/ claimed to have a host of "c" with a default port or some such...
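For illustration, the "looks like a drive letter" rule quoted in item 4 could be sketched roughly as below. This is not Gecko's actual code: the function name is made up, the assumption that it receives the text following "file://" is mine, and the Windows/OS/2 platform check is left out.

```python
import string


def looks_like_drive_letter(after_slashes: str) -> bool:
    """Hypothetical sketch of the rule quoted in item 4.

    `after_slashes` is assumed to be whatever follows "file://".
    Rule: ASCII letter (any case), then ':' or '|', then either
    end of string or '/' or '\\'.
    """
    if len(after_slashes) < 2:
        return False
    letter, separator = after_slashes[0], after_slashes[1]
    if letter not in string.ascii_letters:
        return False
    if separator not in (":", "|"):
        return False
    rest = after_slashes[2:]
    # Either nothing follows the separator, or a slash/backslash does.
    return rest == "" or rest[0] in ("/", "\\")
```

Under this sketch, the "c:/" in "file://c:/" matches and would be treated as part of the path, while the "foo/bar" in "blah://foo/bar" does not.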

>> 5) When parsing a "no authority" URL (including file://), and when item 4
>> above does not apply, it looks like Gecko skips everything after "file://"
>> up until the next '/', '?', or '#' char before parsing path stuff.
>
> So the host is dropped?

In Gecko, I believe so, yes. I'm not saying this is desirable; just what Gecko does.

>> 6) On Windows and OS/2, when dynamically parsing a path for a "no
>> authority" URL (not sure whether this is actually web-exposed, fwiw...)
>> Gecko will do something involving looking for a path that's only an ASCII
>> letter followed by ':' or '|' followed by end of string.
...
>> 7) When doing URI equality comparisons
...
>> 8) When actually resolving a file:// URL
> These points do not seem to be about parsing, correct?

Well, point 6 is about parsing, sort of. 7 and 8 are not, though at some point we'll need to define equality comparisons anyway.

-Boris
Received on Sunday, 28 October 2012 17:52:22 UTC