
Re: Change definition of URL to normatively reference IRI specification using a well-defined interface

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 9 Apr 2010 09:10:39 +0000 (UTC)
To: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Cc: Ted Hardie <ted.ietf@gmail.com>, Maciej Stachowiak <mjs@apple.com>, Larry Masinter <LMM@acm.org>, Julian Reschke <julian.reschke@gmx.de>, Marc Blanchet <Marc.Blanchet@viagenie.ca>, Sam Ruby <rubys@intertwingly.net>, Paul Cotton <Paul.Cotton@microsoft.com>, Michel SUIGNARD <Michel@suignard.com>, public-html <public-html@w3.org>, "public-iri@w3.org" <public-iri@w3.org>
Message-ID: <Pine.LNX.4.64.1004090845180.10192@ps20323.dreamhostps.com>
On Fri, 9 Apr 2010, "Martin J. Dürst" wrote:
> > 
> > Issue 1:
> > ========================================================================
> > Update the IRI specification to define an algorithm with the following
> > characteristics:
> 
> In order to make it easier to understand this for people who are not deeply
> involved in the HTML5 effort, I'd like to confirm that this is the algorithm
> that HTML5 uses to split a URI/IRI into various components, each of which is
> then accessible via a (JavaScript) DOM API function. So I guess the title of
> our issue should be something like:
> "Ensure that the IRI spec defines how to split an IRI into components in a way
> that's referencable by the HTML5 spec" or some such.

Right.
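
To make "split into components" concrete, here is a minimal sketch using 
Python's standard urllib.parse. It follows RFC 3986-style splitting, not 
the browser-derived algorithm this issue asks for, and the helper name 
split_components and the example URL are made up for illustration.

```python
from urllib.parse import urlsplit

def split_components(url: str) -> dict:
    """Split a URL into named parts, roughly like the per-component
    accessors a DOM API exposes (hypothetical helper, not from HTML5)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,   # lowercased host, without port or userinfo
        "port": parts.port,       # int, or None when no port is given
        "path": parts.path,
        "query": parts.query,     # without the leading "?"
        "fragment": parts.fragment,  # without the leading "#"
    }

components = split_components("http://example.org:8080/a/b?x=1#frag")
```

Browsers are known to differ from this kind of splitting on malformed 
input, which is exactly the behaviour that would need reverse-engineering.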


> > Exactly what this algorithm must do is a matter that will need careful 
> > research, reverse-engineering existing UAs.
> 
> My understanding was that a lot of this research had already been done, 
> and that we would basically try to match whatever was in the HTML5 spec 
> before Dan Connolly and Michael Sperberg-McQueen extracted it into a 
> separate draft. Of course, we should always be open to new information 
> coming up, but your sentence above sounds much more like we have to 
> start anew. Can you clarify?

Since the text was written, so many problems have been shown to exist with 
the existing text that frankly I think it would be significantly less work 
to just start over and reverse-engineer the algorithms from scratch than 
to try to first attempt to match what HTML5 used to say and then verify it 
for correctness.

(Personally, if the working groups were to decide that HTML5 is where 
these algorithms should be, I'd probably just throw out the old text and 
start again from nothing, working closely with the relevant engineers at 
the various major browser vendors to check what they consider important 
and what they don't, trying to reconcile the various behaviours with each 
other, with legacy content requirements, and with the intent of the URI and 
IRI specs. I almost certainly wouldn't start from the old algorithms.)


> > Issue 2:
> > ========================================================================
> > Update the IRI specification to define an algorithm with the following
> > characteristics:
> 
> Again to clarify here, if I understand correctly, the HTML5 spec needs 
> such an algorithm to resolve relative references with respect to a base 
> URI

Right. This algorithm is used for resolving URLs relative to a base URL, 
and also to convert URLs into a more canonical (if not always valid) form.


> (my wild guess is that B is the base, and A is the relative URI below, 
> can you confirm)?

Right. Of course, A need not be relative, it could be itself an absolute 
URL, or it could be something unparseable.
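
The three cases for A (relative, absolute, unparseable) can be sketched 
with Python's standard urllib.parse. This is only an illustration under 
RFC 3986-style rules, not the algorithm being proposed; the helper name 
resolve and the example URLs are assumptions, and returning None for 
unparseable input is one possible design choice among several.

```python
from typing import Optional
from urllib.parse import urljoin

def resolve(base: str, ref: str) -> Optional[str]:
    """Resolve ref (A) against base (B); return None if ref cannot be parsed.

    Hypothetical helper: urljoin applies RFC 3986-style resolution and
    removes dot segments, which is not necessarily what browsers do.
    """
    try:
        return urljoin(base, ref)
    except ValueError:
        # urllib.parse raises ValueError for inputs such as an
        # unterminated IPv6 literal in the authority.
        return None

relative = resolve("http://example.org/a/b/c", "../d")
absolute = resolve("http://example.org/a/b", "https://other.example/x")
broken = resolve("http://example.org/", "http://[bad")
```

Here an absolute A simply replaces B, while a relative A is merged into 
B's path and the ".." segment is collapsed.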

HTH,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 9 April 2010 09:11:11 UTC
