- From: Larry Masinter <masinter@adobe.com>
- Date: Mon, 28 Dec 2009 12:04:42 -0800
- To: "julian.reschke@gmx.de" <julian.reschke@gmx.de>
- CC: "public-iri@w3.org" <public-iri@w3.org>
(ref http://www.w3.org/Bugs/Public/show_bug.cgi?id=8207 )

Note: http://lists.w3.org/Archives/Public/public-html/2009Nov/att-0670/iri-rewrite-draft.html contains a proposed rewrite of the HTML 5 specification's section 2.5.1, which would remove all references to the previous "WEBADDRESSES" specification (http://www.w3.org/html/wg/href/draft) and use draft-duerst-iri-bis-07 as the normative reference instead. The draft text includes several definitions which may belong in the IRI document itself (including how to resolve an arbitrary string against an absolute base), which may be necessary if there are other specifications which were planning to use [WEBADDRESSES] as a normative reference.

Larry:
>> I'd appreciate it if some other mailing list subscribers had some
>> ideas for how to fix the document better to accomplish (1) while retaining
>> the goal for (2). To make progress on (2), I think we'd want to take
>> some of the things in section 7.2 "HREF preprocessing" and move them
>> into the main body of what all normative URI processors should do, and

Julian:
> URI processors or IRI processors?

To be careful, "new IRI processors". One of the problems we have discussing this topic is that if we are considering changing the meaning of "IRI processor", then of course some previously conforming processors will become non-conforming.

Larry:
>> not just the ones in browsers. Things like chopping off initial & final
>> whitespace, handling single "%", deleting or encoding otherwise illegal
>> characters, etc.
>> ...

Julian:
> Would that affect the definition of a syntactically legal IRI?

Yes; that's the point, isn't it? Bring "syntactically legal IRI" into alignment with widespread, popular implementations. The implementations in Windows, OS X, Firefox, WebKit, etc. are widespread and popular. Doing so will certainly involve, for the most part, changing the definition of "syntactically legal IRIs".
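
To make the kind of preprocessing discussed above concrete, here is a rough sketch in Python. It is an illustration only, not the algorithm from section 7.2 of the draft: the function name `preprocess_reference`, the set of whitespace characters stripped, and the set of characters treated as "otherwise illegal" are all assumptions chosen for the example. It covers the three steps mentioned: trimming initial and final whitespace, escaping a single "%" that does not start a valid percent-escape, and percent-encoding characters that cannot appear literally.

```python
import re

# Hypothetical helper for illustration; the name and exact rules are
# assumptions, not taken from draft-duerst-iri-bis-07.

# A "%" that is NOT followed by two hex digits (a stray percent sign).
_STRAY_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")

# Assumed set of ASCII characters that must not appear literally.
_ILLEGAL_ASCII = ' <>"{}|\\^`'


def preprocess_reference(raw: str) -> str:
    # Step 1: chop off initial and final whitespace.
    s = raw.strip(" \t\r\n\f")

    # Step 2: turn a stray "%" into "%25"; valid escapes are left alone.
    s = _STRAY_PERCENT.sub("%25", s)

    # Step 3: percent-encode illegal ASCII and control characters.
    # Non-ASCII characters are left untouched, since IRIs permit them.
    out = []
    for ch in s:
        if ch in _ILLEGAL_ASCII or ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(preprocess_reference("  http://example.com/a b%zz?q=100%  "))
```

Whether steps like these belong in a preprocessing stage or in the definition of an IRI processor itself is exactly the question at issue; the sketch only shows what the steps do to an input string.
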
There may be a few issues where some "widespread, popular implementations" also need to change (where implementations exhibit different behavior, to bring them into alignment).

> The reason why I'm asking is that there are specifications that rely on
> the fact that the space character can't be part of a legal IRI (or URI),
> and thus can be used as delimiter (the same probably applies to other
> kinds of whitespace).

There are a number of legacy specifications which refer to other forms (e.g., LEIRI). The path I think we should follow is to leave "URI" completely alone (long-standing, stable, STANDARD) but let the term "IRI" expand to encompass as much as feasible, and then handle other variations as syntactic restrictions. I think that's preferable to continuing to define "IRI" narrowly and then treating variations as syntactic expansions.

draft-duerst-iri-bis-07 doesn't go this far -- it mainly leaves IRI alone and treats "popular browser implementations" as a syntactic expansion which is handled by pre-processing. To fix this, we would "take some of the things in section 7.2 "HREF preprocessing" and move them into the main body of what all normative URI[IRI] processors should do".

Larry
--
http://larry.masinter.net
Received on Monday, 28 December 2009 20:05:07 UTC