- From: Philip Jägenstedt <philipj@opera.com>
- Date: Sun, 30 Oct 2011 10:19:24 +0100
- To: public-html-data-tf@w3.org
On Sun, 30 Oct 2011 07:50:50 +0100, Jeni Tennison <jeni@jenitennison.com> wrote:

> Henri, Ted, Philip,
>
> I wonder if you could help here. Do you know of examples where the HTML
> URL resolution algorithm produces different results from the RFC-3987
> resolution algorithm? Is there a publicly available test suite that you
> know of or a tool that you know does HTML URL resolution correctly that
> could be used to generate accurate tests?
>
> Thanks,
>
> Jeni

URL parsing [1] is modified to be more forgiving, e.g. it seems like the
following would be invalid per RFC3986 but still parse using the modified
rules:

http://example.com/%
http://example.com/##

This is just a qualified guess. Python's urlparse still parses these just
fine, so either Python also doesn't follow RFC3986 or I fail at reading
specs.

This is a willful violation, and was probably part of HTML WG ISSUE-56 [2],
so anyone who takes offense ought to look through that first.

As for resolving [3], I think the main difference is that the base URL can
come from a <base> element.

[1] http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#parse-a-url
[2] http://www.w3.org/html/wg/tracker/issues/56
[3] http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#resolving-urls

--
Philip Jägenstedt
Core Developer
Opera Software
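
For anyone who wants to reproduce the urlparse observation, here is a minimal sketch. It assumes Python 3, where the old urlparse module lives in urllib.parse. The `BARE_PERCENT` pattern and the `lenient_parse` helper are hypothetical names introduced only for this illustration; the pattern encodes the RFC 3986 rule that "%" must be followed by two hex digits, and is not a full validator.

```python
import re
from urllib.parse import urlparse

# Hypothetical check (an assumption for illustration, not a full validator):
# RFC 3986 defines pct-encoded as "%" HEXDIG HEXDIG, so a "%" that is not
# followed by two hex digits falls outside the grammar.
BARE_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def lenient_parse(url):
    """Return (path, fragment, has_bare_percent) for the given URL string."""
    parts = urlparse(url)  # urlparse splits the string; it does not validate
    return parts.path, parts.fragment, bool(BARE_PERCENT.search(url))


if __name__ == "__main__":
    for url in ("http://example.com/%", "http://example.com/##"):
        print(url, lenient_parse(url))
```

urlparse splits on the first "#" and does no percent-encoding validation, which would explain why both examples go through without complaint even if they are not valid per the RFC grammar.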
Received on Sunday, 30 October 2011 09:20:04 UTC