- From: John C Klensin <john-ietf@jck.com>
- Date: Sun, 01 Jul 2012 21:03:48 -0400
- To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, Peter Saint-Andre <stpeter@stpeter.im>
- cc: public-iri@w3.org
(sorry - sent from wrong address)

--On Monday, June 25, 2012 18:22 +0900 "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:

> Hello Peter,
>
> I think Björn already gave very good answers to your
> questions.

Martin, Björn, Peter,

> On 2012/06/22 3:28, Peter Saint-Andre wrote:
>> <hat type='individual'/>
>>
>> I've been thinking about IRIs, and I'm wondering: why would a
>> protocol "upgrade" from URIs to IRIs?
>
> As Björn said, it's really more about new protocols than
> about upgrades. Also, different protocols (and formats) can
> upgrade in different ways. Sometimes, this can be done
> formally with extensions, at other times it's done gradually
> and sooner or later gets accepted in a spec. For other cases,
> of course, it may never happen.
>...

For whatever it is worth, I don't find that answer particularly
helpful. My problem with it is one that we have discussed pieces
of before. If the requirement were to make something that was
coupled closely enough to URIs to be a UI overlay, then we would
have one set of issues. The WG has moved beyond that into
precisely what you are commenting on above, and what the key
draft seems to reflect -- a new protocol element to be used
primarily in new, or radically updated/upgraded, protocols.

But, if we are going to define a new protocol element for new
uses, then why stick with the basic URI syntax framework? We
already know that causes problems. It is hard to localize
because it contains a lot of ASCII characters that are special
sometimes and not others and that may have non-Latin-script
lookalikes, and because parsing is method-dependent. That
method-dependency makes it very hard to create variations that
are appropriate to the local writing system, because one has to
be method-sensitive at too many different points.

If some protocols are to permit only IRIs, some only URIs, and
some both, it would also be beneficial to be able to determine
which is which, rather than wondering whether an IRI that
contains only ASCII characters (and no escapes) is actually an
IRI or is just the URI it looks like. Again, as long as IRIs
were just a UI overlay, it made no difference. But, as a
protocol element, it does.

I continue to believe that makes a strong case for doing
something that gets us internationalization by moving away from
the URI syntax model, probably to something that explicitly
identifies the data elements that make up a particular URI. If,
for example, one insisted that domain names be identified as
such wherever they appear, the mess about whether something can
or should be given IDNA treatment (even if only to verify
U-label syntax) and the associated RFC 6055 considerations
become much easier to handle than if one has to guess whether
something might be a domain name or something else with periods
in it.

Stated a little differently, if IRIs are protocol elements that
are intended to support new protocols, then it seems to me that
it is not obvious that the URI syntax is a constraint.
Certainly the WG has not had a serious discussion about what the
advantages of that constraint are and whether they outweigh the
disadvantages.

best,
   john

Received on Monday, 2 July 2012 01:04:24 UTC