Re: Parsing Microdata into RDF Graphs: URI Comparison

From: Adam Barth <w3c@adambarth.com>
Date: Wed, 16 Nov 2011 10:42:56 -0800
Message-ID: <CAJE5ia8yPmTvQhV1PVV4k_tL7Kwuyv9T2_wLmzbfPnvLwUQb8w@mail.gmail.com>
To: Henri Sivonen <hsivonen@iki.fi>
Cc: public-html-data-tf@w3.org

On Wed, Nov 16, 2011 at 5:27 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
> On Sun, Oct 30, 2011 at 8:50 AM, Jeni Tennison <jeni@jenitennison.com> wrote:
>> I wonder if you could help here. Do you know of examples where the HTML URL resolution algorithm produces different results from the RFC 3987 resolution algorithm?
>
> The RFC's algorithm doesn't take the encoding of the document that
> contains the URL as an input. The HTML algorithm does. So if the URL
> being resolved contains non-ASCII characters and the document it
> appears in is not encoded as UTF-8 or UTF-16, the HTML algorithm and
> the RFC algorithm produce different results.
>
> There might be other differences around edge cases that the RFC
> considers invalid.
>
>> Is there a publicly available test suite that you know of or a tool that you know does HTML URL resolution correctly that could be used to generate accurate tests?
>
> I don't know. Adam Barth (CCed) might know.

Here's one test suite that you can run in browsers:

http://trac.webkit.org/browser/trunk/LayoutTests/fast/url/
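
As a rough illustration of the encoding difference Henri describes above, here is a minimal Python sketch. It is not the HTML algorithm or the RFC 3987 algorithm; it only shows how the same non-ASCII query string percent-encodes differently depending on which character encoding supplies the bytes. The helper name encode_query, the sample query, and the choice of windows-1252 as the document encoding are all assumptions made for the example, as is the premise that the document encoding is applied to the query part.

```python
from urllib.parse import quote


def encode_query(query: str, document_encoding: str) -> str:
    """Percent-encode non-ASCII characters in a query string.

    Hypothetical helper for illustration only: the bytes that get
    percent-encoded come from `document_encoding`, mimicking the idea
    that the encoding of the containing document is an input.
    """
    # Keep "=" and "&" literal so the query structure stays readable.
    return quote(query, safe="=&", encoding=document_encoding)


# Sample query containing one non-ASCII character (assumed example data).
query = "q=caf\u00e9"

# UTF-8 bytes, which is what an encoding-unaware algorithm would use.
utf8_result = encode_query(query, "utf-8")

# Bytes from a legacy document encoding (windows-1252 assumed here).
legacy_result = encode_query(query, "windows-1252")

print(utf8_result)
print(legacy_result)
```

The two printed strings differ only in the escape sequence for the non-ASCII character, which is the kind of mismatch a comparison test would need to cover.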

Adam
Received on Wednesday, 16 November 2011 18:44:10 GMT
