Re: Parsing Microdata into RDF Graphs: URI Comparison

From: Henri Sivonen <hsivonen@iki.fi>
Date: Wed, 16 Nov 2011 15:27:44 +0200
Message-ID: <CAJQvAudbB55EDCGXF5anHdXeSdLKv06rfBs=7DzxdYXi7kH5WA@mail.gmail.com>
To: public-html-data-tf@w3.org
Cc: w3c@adambarth.com

On Sun, Oct 30, 2011 at 8:50 AM, Jeni Tennison <jeni@jenitennison.com> wrote:
> I wonder if you could help here. Do you know of examples where the HTML URL resolution algorithm produces different results from the RFC-3987 resolution algorithm?

The RFC's algorithm doesn't take the character encoding of the
document containing the URL as an input. The HTML algorithm does. So
if the URL being resolved contains non-ASCII characters and the
document it appears in is not encoded as UTF-8 or UTF-16, the HTML
algorithm and the RFC algorithm produce different results.
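
To make the difference concrete, here is a rough Python sketch. It is
not an implementation of either algorithm. It assumes that the
encoding-dependent step is the percent-encoding of non-ASCII characters
in the query component, with the path always encoded as UTF-8, and it
treats the RFC 3987 side as resolution followed by the UTF-8 based
IRI-to-URI mapping. The helper name resolve(), the example.org URLs and
the choice of windows-1252 are made up for illustration.

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit


def resolve(base, ref, query_encoding="utf-8"):
    """Join ref against base, then percent-encode non-ASCII characters.

    Hypothetical helper: the path is always encoded as UTF-8, and only
    the query component uses query_encoding. Host names and characters
    that query_encoding cannot represent are not handled here.
    """
    parts = urlsplit(urljoin(base, ref))
    # Leave reserved characters and existing %-escapes untouched.
    safe = "/:@!$&'()*+,;=%?"
    path = quote(parts.path, safe=safe, encoding="utf-8")
    query = quote(parts.query, safe=safe, encoding=query_encoding)
    return urlunsplit(
        (parts.scheme, parts.netloc, path, query, parts.fragment)
    )


base = "http://example.org/dir/page.html"
ref = "search?q=caf\u00e9"  # "café" in the query

# RFC-style: the document encoding plays no role, UTF-8 is used.
rfc_style = resolve(base, ref)

# HTML-style: the document is assumed to be windows-1252.
html_style = resolve(base, ref, query_encoding="windows-1252")

print(rfc_style)
print(html_style)
```

Under these assumptions the first result ends in q=caf%C3%A9 and the
second in q=caf%E9, so a byte-wise comparison of the two resolved URLs
fails even though both started from the same markup.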

There might be other differences around edge cases that the RFC
considers invalid.

> Is there a publicly available test suite that you know of or a tool that you know does HTML URL resolution correctly that could be used to generate accurate tests?

I don't know. Adam Barth (CCed) might know.

Henri Sivonen
Received on Wednesday, 16 November 2011 13:28:13 UTC
