Re: Fwd: Parsing Microdata into RDF Graphs: URI Comparison

On Sun, 30 Oct 2011 07:50:50 +0100, Jeni Tennison <jeni@jenitennison.com>  
wrote:

> Henri, Ted, Philip,
>
> I wonder if you could help here. Do you know of examples where the HTML  
> URL resolution algorithm produces different results from the RFC-3987  
> resolution algorithm? Is there a publicly available test suite that you  
> know of or a tool that you know does HTML URL resolution correctly that  
> could be used to generate accurate tests?
>
> Thanks,
>
> Jeni

In the HTML spec, URL parsing [1] is modified to be more forgiving. For  
example, it seems like the following would be invalid per RFC3986 but  
still parse using the modified rules:

http://example.com/%
http://example.com/##

This is just an educated guess. Python's urlparse still parses these just  
fine, so either Python also doesn't follow RFC3986 or I fail at reading  
specs. The modified parsing is a willful violation of the RFC, and was  
probably part of HTML WG ISSUE-56, [2] so anyone who takes offense ought  
to look through that first.
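
To make the Python observation concrete, here is a rough sketch, assuming  
Python 3 (where the urlparse module lives in urllib.parse). The helper  
strictly_valid is hypothetical, not taken from either spec; it only checks  
the path and fragment against my reading of the RFC3986 character rules  
and ignores scheme, authority and query.

```python
import re
from urllib.parse import urlsplit

# Characters RFC 3986 allows in a path besides percent-escapes:
# unreserved / sub-delims / ":" / "@", plus "/" as the segment separator.
_CHARS = r"[A-Za-z0-9\-._~!$&'()*+,;=:@/]"
# A percent sign must be followed by exactly two hex digits.
_PCT = r"%[0-9A-Fa-f]{2}"

_PATH_RE = re.compile(r"(?:%s|%s)*\Z" % (_CHARS, _PCT))
# The fragment additionally allows "?", but never a literal "#".
_FRAG_RE = re.compile(r"(?:%s|%s|[?])*\Z" % (_CHARS, _PCT))


def strictly_valid(url):
    """Hypothetical helper: True if path and fragment pass the strict check."""
    parts = urlsplit(url)
    return bool(_PATH_RE.match(parts.path)) and bool(_FRAG_RE.match(parts.fragment))


for url in ("http://example.com/%", "http://example.com/##"):
    parts = urlsplit(url)  # the lenient parser accepts both without complaint
    print(url, repr(parts.path), repr(parts.fragment), strictly_valid(url))
```

Both URLs split without an error, yet neither passes the strict check: the  
first has a bare "%" in the path and the second ends up with "#" as its  
fragment.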

As for resolving, [3] I think the main difference is that the base URL can  
come from a <base> element.
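
A small sketch of what that means in practice, again assuming Python 3.  
Note that urljoin does RFC-style resolution, not the HTML algorithm in [3],  
and the function resolve and its parameter names are made up for  
illustration. The only point is that the href of a <base> element, when  
there is one, replaces the document's own URL as the base.

```python
from urllib.parse import urljoin


def resolve(document_url, base_href, relative):
    """Hypothetical helper: resolve `relative` for a document at `document_url`.

    `base_href` is the href of the document's <base> element, or None if the
    document has no such element.
    """
    if base_href:
        # The <base> href may itself be relative to the document URL.
        base = urljoin(document_url, base_href)
    else:
        base = document_url
    return urljoin(base, relative)


# Without <base>, the document URL is the base.
print(resolve("http://example.com/a/page.html", None, "c.html"))
# With <base href="/b/">, the same relative URL lands somewhere else.
print(resolve("http://example.com/a/page.html", "/b/", "c.html"))
```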

[1]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#parse-a-url
[2] http://www.w3.org/html/wg/tracker/issues/56
[3]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#resolving-urls

-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Sunday, 30 October 2011 09:20:04 UTC