Re: Updating the IRI spec to include "web addresses"

From: Roy T. Fielding <fielding@gbiv.com>
Date: Sun, 31 May 2009 14:32:25 -0700
Message-Id: <EDE54C70-6881-436E-95A3-4DBF3966740E@gbiv.com>
Cc: HTML WG <public-html@w3.org>, "public-iri@w3.org" <public-iri@w3.org>
To: Larry Masinter <masinter@adobe.com>

On May 31, 2009, at 10:12 AM, Larry Masinter wrote:

> (Please reply on public-iri mailing list):
>
> I started working on trying to merge the “Web Address” concept into  
> the IRIbis document, using the text edited by Dan Connolly and M.  
> Sperberg-McQueen.
>
> The biggest question I see is that an IRI is defined as a sequence  
> of CHARACTERS which are independent of the ENCODING – whether  
> UTF-8, UTF-16, or shift-jis or something else.
>
> However, “Web Address” (or “Hypertext Reference”, as has been  
> suggested) is defined as a sequence of BYTES which in turn have a  
> CHARACTER ENCODING which is taken from the DOCUMENT or SCRIPT in
> which it is embedded.
>
> I don’t think this is a difficulty, it’s just an observation about  
> the layering.
>
> My intent is to use “Hypertext Reference” rather than “Web Address”  
> as the name of the concept being introduced, and to introduce a  
> “href” BNF.  At this point, I’m planning on adding this as an  
> appendix, and I’m considering moving the LEIRI section to an  
> appendix as well.
>
> Any problems with this direction? Special things to be concerned  
> about?

Is it the same direction as the following?

The thing between the quotes in an HTML href/src/... attribute is called
a hypertext reference.  A hypertext reference is converted first into an
infoset string in the document encoding (replacing entity references)
and then into a URI reference (replacing the document encoding with
some form of URI encoding).  Both of those conversions are defined by
HTML5.  The latter is either done according to the IRI proposed standard
or by some other character-replacement algorithm cooked up by HTML5.
Once the attribute is in URI reference form, RFC3986 applies.  The only
thing called a Web Address is what RFC3986 defines as a URI.
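
A rough sketch of those two conversions, for illustration only.  The
function name href_to_uri_reference is made up here, and the second step
assumes the IRI-style mapping (encode as UTF-8, then percent-encode
whatever a URI does not allow) rather than whatever algorithm HTML5 ends
up specifying; host name (IDNA) handling is left out entirely.

```python
import html
from urllib.parse import quote

# Characters a URI reference may carry unescaped (RFC 3986 reserved
# set plus "%", so that existing percent-escapes are not re-encoded).
# Unreserved characters (letters, digits, "-._~") are never quoted.
URI_SAFE = "!#$%&'()*+,/:;=?@[]"


def href_to_uri_reference(attr_value: str) -> str:
    """Hypothetical helper: hypertext reference -> URI reference."""
    # Step 1: replace entity references, giving the infoset string.
    infoset_string = html.unescape(attr_value)
    # Step 2 (assumption): IRI-style mapping, UTF-8 then percent-encode.
    return quote(infoset_string, safe=URI_SAFE, encoding="utf-8")


print(href_to_uri_reference("caf&eacute;?q=a&amp;b"))
```

Only after the second step does RFC3986 parsing apply; a real user agent
may pick a different encoding for some components (for example, the
document encoding for the query), which this sketch does not model.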

I suppose hypertext reference is easier to say than the more technically
accurate designation of document-encoded resource identifier reference.

....Roy
Received on Sunday, 31 May 2009 21:32:50 UTC