- From: Adam Barth <ietf@adambarth.com>
- Date: Sat, 4 Sep 2010 17:01:38 -0700
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- Cc: public-iri@w3.org, Peter Saint-Andre <stpeter@stpeter.im>
On Sat, Sep 4, 2010 at 4:28 PM, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:
> * Adam Barth wrote:
>>I've started by trying to separate the concerns of parsing absolute
>>URLs and resolving relative URLs. We might come to find that such a
>>distinction is foolish, but it seems plausible at this time.
>
> I don't think there is anything plausible about defining how to parse
> an absolute reference that contains no colon and thus isn't absolute,
> much like it is not plausible to define that the scheme in "#:" is "#".

Plausible? I don't understand what you mean by that term.

>>As for the parsing definition in RFC 3986 Appendix B, is this the
>>regular expression that you're referring to?
>>
>> ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
>>
>>This doesn't appear to get even simple examples correct. For example,
>>that regular expression doesn't produce a match for the following
>>string, but browsers do, in fact, behave as if this string represents
>>a particular URL:
>>
>>http:///example.com/
>
> That's a perfectly valid reference per the generic syntax and it has a
> scheme of 'http', undefined query and fragment parts, an empty authority
> and a path of '/example.com/' as mandated by RFC 3986 and as the regular
> expression matches [1].

Unfortunately, Firefox, Chrome, and Safari interpret that string as if
it were a URL with an authority of "example.com".

> Neither IE6 nor Opera will treat the string as
> if the third slash had been omitted; if any browser does, that is a bug.

Rather, I'd say that there's an interoperability problem to solve,
which is the motivation for this work. Now, how to resolve the
difference in behavior is an interesting question. What matters in
resolving this question, at least to browser vendors, is what existing
content on the web expects browsers to do. That's a question we can
answer with data, not with opinion. Do you have data to support which
behavior, if implemented by a browser, would result in greater
compatibility with existing web content?

> That's one reason for my remark about the correctness of your algorithm.

Thanks. If you have further examples of interesting input strings,
that's appreciated. Blanket statements about "plausibility" are not
appreciated.

> [1] As the specification notes, the expression matches all strings

Great. That's an important first step in defining behavior
unambiguously, which, itself, is an important step in producing
interoperable implementations.

Adam
Received on Sunday, 5 September 2010 00:02:40 UTC
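
For readers who want to check the disputed claim themselves, here is a minimal sketch that applies the RFC 3986 Appendix B regular expression quoted in the message to the two strings under discussion. It assumes Python's standard re module; the names URI_REFERENCE, Components, and split_reference are hypothetical and introduced only for illustration. The group numbers follow Appendix B (2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment).

```python
import re
from typing import NamedTuple, Optional

# Regular expression from RFC 3986, Appendix B, as quoted in the message.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


class Components(NamedTuple):
    """Hypothetical container for the five generic-syntax components."""
    scheme: Optional[str]
    authority: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def split_reference(reference: str) -> Components:
    """Split a URI reference using the Appendix B expression.

    None marks an undefined component; "" marks one that is present but empty.
    """
    match = URI_REFERENCE.match(reference)
    # Every part of the pattern is optional or may match the empty string,
    # so the expression matches at the start of any input.
    assert match is not None
    return Components(
        scheme=match.group(2),
        authority=match.group(4),
        path=match.group(5),
        query=match.group(7),
        fragment=match.group(9),
    )


print(split_reference("http:///example.com/"))
# Components(scheme='http', authority='', path='/example.com/', query=None, fragment=None)
print(split_reference("#:"))
# Components(scheme=None, authority=None, path='', query=None, fragment=':')
```

This only shows what the generic syntax yields for these inputs: the expression does match "http:///example.com/", with an empty authority, and it does not assign a scheme to "#:". It says nothing about what browsers do with the same strings, which is the interoperability question raised in the message.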