- From: Larry Masinter <masinter@adobe.com>
- Date: Sun, 31 May 2009 19:06:06 -0700
- To: "Roy T. Fielding" <fielding@gbiv.com>
- CC: HTML WG <public-html@w3.org>, "public-iri@w3.org" <public-iri@w3.org>
I've found it convenient to use "HRef" as a shorthand in the document. What I'm not sure of is whether I can get away with just *replacing* the IRI -> URI algorithm, or if I should leave both HRef -> URI and IRI -> URI.

Right now, the HTML5/"Web Address" draft is written as "how to parse" and "how to resolve relative to absolute". I'm not sure if it's possible to recast it as HRef => URI, but it's certainly worth a try.

Larry
--
http://larry.masinter.net

-----Original Message-----
From: Roy T. Fielding [mailto:fielding@gbiv.com]
Sent: Sunday, May 31, 2009 2:32 PM
To: Larry Masinter
Cc: HTML WG; public-iri@w3.org
Subject: Re: Updating the IRI spec to include "web addresses"

On May 31, 2009, at 10:12 AM, Larry Masinter wrote:

> (Please reply on public-iri mailing list):
>
> I started working on trying to merge the "Web Address" concept into
> the IRIbis document, using the text edited by Dan Connolly and M.
> Sperberg-McQueen.
>
> The biggest question I see is that an IRI is defined as a sequence
> of CHARACTERS which are independent of the ENCODING - whether
> UTF-8, UTF-16, or shift-jis or something else.
>
> However, "Web Address" (or "Hypertext Reference", as has been
> suggested) is defined as a sequence of BYTES which in turn have a
> CHARACTER ENCODING which is taken from the DOCUMENT or SCRIPT in
> which it is embedded.
>
> I don't think this is a difficulty, it's just an observation about
> the layering.
>
> My intent is to use "Hypertext Reference" rather than "Web Address"
> as the name of the concept being introduced, and to introduce a
> "href" BNF. At this point, I'm planning on adding this as an
> appendix, and I'm considering moving the LEIRI section to an
> appendix as well.
>
> Any problems with this direction? Special things to be concerned
> about?

Is it the same direction as the following?

The thing between the quotes in an HTML href/src/... attribute is called a hypertext reference. A hypertext reference is converted first into an infoset string in the document encoding (replacing entity references) and then into a URI reference (replacing the document encoding with some form of URI encoding). Both of those conversions are defined by HTML5. The latter is either done according to the IRI proposed standard or by some other character-replacement algorithm cooked up by HTML5. Once the attribute is in URI reference form, RFC3986 applies.

The only thing called a Web Address is what RFC3986 defines as a URI.

I suppose hypertext reference is easier to say than the more technically accurate designation of document-encoded resource identifier reference.

....Roy
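
[Illustration] The two-step conversion described above (entity references replaced first, then non-URI characters percent-encoded) might be sketched roughly as follows. This is only a sketch under stated assumptions, not the algorithm from HTML5 or the IRI spec: the function name `href_to_uri_reference`, the `URI_SAFE` set, and the `encoding` parameter are all hypothetical, and the choice of which encoding feeds the percent-encoding step (UTF-8, as in the IRI mapping, or the document encoding) is exactly the open question in this thread.

```python
import html
from urllib.parse import quote

# Assumption: leave RFC 3986 reserved and unreserved characters, plus "%"
# (so existing escapes survive), untouched; percent-encode everything else.
URI_SAFE = "!#$&'()*+,/:;=?@[]-._~%"


def href_to_uri_reference(attr_value: str, encoding: str = "utf-8") -> str:
    """Hypothetical sketch: raw attribute text -> URI reference."""
    # Step 1: replace entity/character references to get the plain string.
    text = html.unescape(attr_value)
    # Step 2: encode remaining characters to bytes in the chosen encoding
    # and percent-encode every byte that is not in URI_SAFE.
    return quote(text, safe=URI_SAFE, encoding=encoding)


print(href_to_uri_reference("/caf&eacute;?q=a&amp;b"))
# /caf%C3%A9?q=a&b
print(href_to_uri_reference("/caf&eacute;", encoding="latin-1"))
# /caf%E9
```

The same attribute text yields different URI references depending on the encoding passed to step 2, which is the layering point made above: the result is only well defined once the character encoding is fixed.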
Received on Monday, 1 June 2009 02:06:49 UTC