Larry Masinter wrote:
> I've found it convenient to use "HRef" as a shorthand
> in the document.
>
> What I'm not sure of is whether I can get away with
> just *replacing* the IRI -> URI algorithm, or if
> I should leave both HRef -> URI and IRI -> URI.

I think the IRI -> URI algorithm should not change (except for the bit about normalization discussed previously).

What should be added is HRef -> IRI (which implies that in some cases, that mapping would need to map query parameters to plain ASCII).

LEIRIs then could become a special case of the thing described above.

> Right now, the HTML5/"Web Address" draft is written as
> "how to parse" and "how to resolve relative to absolute".
>
> I'm not sure if it's possible to recast it as
> HRef => URI, but it's certainly worth a try.

Repeating what I suggested on www-tag a few days ago (<http://lists.w3.org/Archives/Public/www-tag/2009May/0083.html>)...:

This has been under discussion for something like nine months. I think the issues, as documented by Ian, Henri and now by Dan, are well understood (and thanks for posting examples and test cases).

I think when we discussed this last October, Larry and several others (including myself...) pointed out that the additional complexity as compared to IRIs (RFC3987) can easily be layered *above* IRI, mapping HTML5-references to IRIs just by stating:

1) non-IRI characters found in the query part are encoded using the document's character encoding, then percent-escaped (*)

2) all other non-IRI characters (such as space) are encoded using UTF-8, then percent-escaped

Or, if we use LEIRIs as foundation instead (<http://tools.ietf.org/html/draft-duerst-iri-bis-04#section-7>), we end up with a *single* rule:

1') non-IRI characters found in the query part are encoded using the document's character set, then percent-escaped (*)

Why does it need to be *more* complex than that?

BR, Julian

Received on Monday, 1 June 2009 14:14:47 GMT
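For readers who want to see the two layered rules from the message above in concrete form, here is a minimal Python sketch. It is not taken from any draft: the function names `escape` and `href_to_uri_sketch` and the `SAFE` character set are hypothetical, chosen for illustration only. As a simplifying assumption, the sketch escapes every character outside a small ASCII set, so it over-escapes compared to RFC 3987 (which permits most non-ASCII characters in an IRI) and effectively goes straight to a URI-like string. It also does not handle characters that the document encoding cannot represent; `str.encode` simply raises in that case.

```python
# Hypothetical sketch of the two rules; names and SAFE set are assumptions.
# Characters in SAFE are copied through unchanged (including "%", so that
# existing percent-escapes are preserved). Everything else is escaped.
SAFE = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "-._~:/?#[]@!$&'()*+,;=%"
)


def escape(text, encoding):
    """Percent-escape every character not in SAFE, using `encoding`."""
    out = []
    for ch in text:
        if ch in SAFE:
            out.append(ch)
        else:
            # Encode the character to bytes, then emit one %XX per byte.
            out.extend("%{:02X}".format(b) for b in ch.encode(encoding))
    return "".join(out)


def href_to_uri_sketch(ref, doc_encoding):
    """Apply rule 1 to the query part and rule 2 to everything else."""
    # Split off the fragment first, then the query.
    rest, hash_sep, fragment = ref.partition("#")
    before_query, q_sep, query = rest.partition("?")

    # Rule 2: everything outside the query is escaped via UTF-8.
    result = escape(before_query, "utf-8")
    if q_sep:
        # Rule 1: the query is escaped via the document's encoding.
        result += "?" + escape(query, doc_encoding)
    if hash_sep:
        result += "#" + escape(fragment, "utf-8")
    return result


print(href_to_uri_sketch("/a b/\u00e9?q=\u00e9#\u00e9", "iso-8859-1"))
# prints: /a%20b/%C3%A9?q=%E9#%C3%A9
```

With a UTF-8 document the query is escaped the same way as the rest of the reference, which is why the rules only differ visibly for documents in legacy encodings.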