- From: Peter Saint-Andre <stpeter@stpeter.im>
- Date: Fri, 29 Jun 2012 12:27:45 -0600
- To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
- CC: "public-iri@w3.org" <public-iri@w3.org>
Hi Martin, thanks for the clarification. I have a few comments inline.

On 6/25/12 3:22 AM, "Martin J. Dürst" wrote:
> Hello Peter,
>
> I think Björn already gave very good answers to your questions.
>
> On 2012/06/22 3:28, Peter Saint-Andre wrote:
>> <hat type='individual'/>
>>
>> I've been thinking about IRIs, and I'm wondering: why would a protocol
>> "upgrade" from URIs to IRIs?
>
> As Björn said, it's really more about new protocols than about upgrades.
> Also, different protocols (and formats) can upgrade in different ways.
> Sometimes, this can be done formally with extensions, at other times
> it's done gradually and sooner or later gets accepted in a spec. For
> other cases, of course, it may never happen.
>
>> (If it really is an "upgrade" -- a topic
>> for another time.)
>>
>> Consider HTTP. It has always used URIs for retrieving documents and
>> linking and such.
>
> [There are some reports of clients just sending UTF-8, which I think
> would mean using IRIs. But that has never reached the spec.]

Do you think it should reach the spec?

>> Why would it change to use IRIs? Section 1.2 of
>> 3987bis describes some necessary conditions for such a change, but
>> doesn't really motivate why the HTTP community would want to do so. Yes,
>> there is text in Section 1.1 about representing the words of natural
>> languages, but URIs can be used to represent those words right now. I
>> grant that the current mechanism for such representation isn't pretty,
>> but do the addressing elements of a protocol like HTTP need to be
>> pretty, or can we simply depend on the presentation software (e.g., web
>> browsers) to make things look nice for the user?
>
> I think the real motivation would be people looking at HTTP traces and
> preferring to see Unicode rather than lots of %HH strings. Of course the
> number of people looking at HTTP traces is low, and they are not end users.
>
> In general, the motivation to use IRIs is highest closer to end users
> and content-oriented people such as document authors, and gets lower the
> lower one gets in the protocol stack.

It seems to me that end users can be shielded from what you call "this
weird %HH stuff" (after all, we don't show them "this weird
angle-bracket stuff" either), but what you say about document authors
and operations people makes sense. Perhaps it would be good to capture
that in the spec.

> Another motivation may be compression.
> http://ja.wikipedia.org/wiki/青山学院大学 is quite a bit shorter than
> http://ja.wikipedia.org/wiki/%E9%9D%92%E5%B1%B1%E5%AD%A6%E9%99%A2%E5%A4%A7%E5%AD%A6.
> So maybe we can sell that to HTTP 2.0. But I'm somewhat skeptical. Only
> a tiny bit of creative thinking would have been needed to transition
> various header fields in HTTP from the hopelessly outdated iso-8859-1
> (Latin-1) to UTF-8, but it didn't happen :-(.
>
> The best motivation would be streamlining. EAI does a lot of
> streamlining for e-mail; if it weren't for all the legacy baggage, it
> would be a joy to implement. For HTTP, if browsers use Unicode
> internally, and servers use it internally, what's the need for this
> weird %HH stuff anyway? (It's still needed to escape reserved
> characters, though.)
>
>
>> (Certainly we do that
>> with structural elements like the HTML document format, why not also
>> with addressing elements like URIs?) I realize that these questions get
>> back to the matter of "protocol element" vs. "presentation", but I guess
>> what I'm saying is that I don't yet think we've really explained why we
>> need to make IRIs a first-class protocol element (or why a given
>> protocol would want to make the switch from URI-only to IRI).
>>
>> Furthermore, 3987bis doesn't really explain what would be involved in
>> the change from URI-only to IRI in any given protocol. I suppose spec
>> writers in a technology community like HTTP would need to figure it out,
>> but IMHO some guidelines would be helpful.
>
> As I said at the start of this mail, I think it depends a lot on the
> specific protocol. The conditions we give in Section 1.2 are general
> considerations that apply to any protocol/format. Protocol-specific
> considerations should do the rest, and I'm not sure it makes sense to
> write much about this.
>
> But when looking at Section 1.2, I realized that the first sentence
> might have been the motivation for your mail. This sentence says:
>    IRIs are designed to allow protocols and software that deal with URIs
>    to be updated to handle IRIs.
> I think that this puts too much emphasis on "update", but I'm not yet
> sure how to fix that.

Well, "update" is not "upgrade", so perhaps I have read too much into
the text. However, I think we could change it to read:

   IRIs are designed to allow protocols and software that deal with URIs
   to also handle IRIs if desired.

Peter

--
Peter Saint-Andre
https://stpeter.im/
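
[Editorial illustration, not part of the original message.] The length comparison Martin makes for the Wikipedia address can be checked with a rough sketch like the one below. It assumes the IRI ends in 青山学院大学, which is what the percent-encoded form decodes to. The function name `iri_to_uri_sketch` is hypothetical, and the mapping is deliberately simplified: it only percent-encodes non-ASCII characters as UTF-8 and does not convert internationalized host names, which a complete IRI-to-URI mapping would also need to consider.

```python
from urllib.parse import quote

# Characters left untouched: URI reserved characters plus "%" so that
# escapes already present in the input are not encoded a second time.
# Unreserved characters (letters, digits, "-._~") are never encoded by quote().
KEEP = ":/?#[]@!$&'()*+,;=%"


def iri_to_uri_sketch(iri: str) -> str:
    """Percent-encode the non-ASCII characters of an IRI as UTF-8 octets.

    Simplified sketch: the host part is not converted to an ASCII form.
    """
    return quote(iri, safe=KEEP, encoding="utf-8")


iri = "http://ja.wikipedia.org/wiki/青山学院大学"
uri = iri_to_uri_sketch(iri)

print(uri)
# Size of each form as it would travel in a protocol message.
print("IRI as UTF-8 bytes:", len(iri.encode("utf-8")))
print("URI as ASCII bytes:", len(uri.encode("ascii")))
```

Each of the six ideographs takes three UTF-8 octets, and each octet becomes a three-character %HH escape, so the escaped segment is three times as long as the raw UTF-8 one.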
Received on Friday, 29 June 2012 18:28:13 UTC