Re: FW: New Version Notification for draft-duerst-iri-bis-06

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 28 Jul 2009 21:25:12 +0000 (UTC)
To: Larry Masinter <masinter@adobe.com>
Cc: "public-iri@w3.org" <public-iri@w3.org>, public-html@w3.org
Message-ID: <Pine.LNX.4.62.0907281952410.3189@hixie.dreamhostps.com>

On Mon, 13 Jul 2009, Larry Masinter wrote:
>
> This version attempts to integrate the "web address" concept (called 
> Hypertext Reference or HREF) into the main IRI specification.  The text 
> has gone through sufficient transformations that I don't have confidence 
> in its accuracy, but at least it indicates to me that the many specs are 
> mergable.

   http://tools.ietf.org/html/draft-duerst-iri-bis

It appears that in the process of merging this:

   http://www.w3.org/html/wg/href/draft-ietf.html

...into the above ID, the following key parts were lost:

 * The definition of what is a "valid URL", defined such that an IRI is 
   only a "valid URL" if the URL parsing algorithm interprets its query 
   part the same way the IRI spec does.

 * The definition of what is an "absolute URL", defined such that even 
   strings that are not "valid URLs" can be "absolute URLs". For example, 
   the following:

       http://example.com/%

   ...is not a "valid URL" but needs to be an "absolute URL".

 * The definition of how to determine whether the following components are 
   present in, and how to obtain their value from, a string that may not 
   be a "valid URL":

     <scheme>
     <host>
     <port>
     <hostport>
     <path>
     <query>
     <fragment>
     <host-specific>

 * The definitions for how to resolve a string to an "absolute URL" when 
   the original string is not necessarily a "valid URL".
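
As a rough illustration of the distinction drawn above between checking 
validity and extracting components, here is a minimal Python sketch. It 
uses the standard library's generic splitter, which is not the HTML5 
parsing algorithm and will differ from it on many inputs. The helper names 
`has_stray_percent` and `components` are hypothetical, and the "hostport" 
entry is only approximated by `netloc` (which would also carry any 
userinfo).

```python
import re
from urllib.parse import urlsplit


def has_stray_percent(s):
    """True if s contains a '%' not followed by two hex digits.

    This checks only one way a string can fail to be valid; it is not a
    full validity test.
    """
    return re.search(r"%(?![0-9A-Fa-f]{2})", s) is not None


def components(s):
    """Split s into components without first requiring it to be valid."""
    parts = urlsplit(s)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "hostport": parts.netloc,  # approximation, see lead-in
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


example = "http://example.com/%"
stray = has_stray_percent(example)
result = components(example)
print(stray, result)
```

The point of the sketch is only that the malformed escape does not stop 
the scheme, host and path from being recovered, which is the behaviour 
the lost definitions were meant to pin down.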

(I use the term "URL" here in the HTML5 sense, which has variously been 
called a Web Address or an HRef in related work.)


The following issues also exist in the draft:

 * "is the script's character. encoding" has a typo (misplaced ".")

 * Step 8 in the algorithm for parsing HRefs appears to be a corrupted 
   form of the definition of <hostport> from the old HTML5 text. The new 
   text as phrased appears to be meaningless.

 * It appears that the parsing algorithm is destructive, in that the 
   results will not be isomorphic with the input. For example, the 
   following:

      http://example.com/##

   ...will turn into:

      http://example.com/#%23

   ...which, once the "resolving" algorithm is reintroduced, will be 
   incompatible with implemented practice.
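
The round-trip problem described above can be sketched in a few lines of 
Python. This is not the draft's algorithm. It assumes only that the 
fragment starts at the first "#" and that the fragment is then 
percent-encoded, which is enough to reproduce the example.

```python
from urllib.parse import quote

original = "http://example.com/##"

# Assumption: the fragment is everything after the first "#".
base, sep, fragment = original.partition("#")

# Assumption: every reserved character in the fragment is escaped.
escaped = quote(fragment, safe="")

rebuilt = base + sep + escaped
print(rebuilt)
```

Here `rebuilt` differs from `original`, so a consumer that compares or 
re-serializes the parsed result would not see the string that was 
actually written.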

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 28 July 2009 21:26:10 GMT