- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Tue, 08 Nov 2005 19:22:44 +0900
- To: JINMEI Tatuya / 神明達哉 <jinmei@isl.rdc.toshiba.co.jp>
- Cc: Bill Fenner <fenner@research.att.com>, ipv6@ietf.org, uri@w3.org, "Roy T. Fielding" <fielding@gbiv.com>
Hello Tatsuya,

I think Roy Fielding has expressed the URI side of this story far more succinctly than I ever could. I fully agree with him. Below are a few additional points.

At 11:17 05/11/08, JINMEI Tatuya / 神明達哉 wrote:
>>>>>> On Mon, 07 Nov 2005 19:04:13 +0900,
>>>>>> Martin Duerst <duerst@it.aoyama.ac.jp> said:
>
>>> It would be very confusing for the user to see they can simply reuse
>>> the output of the diagnostic tool in some cases and they need to
>>> convert the output in some other cases.
>
>> An additional idea would be to change some of the tools such as
>> ping6 to accept and use '+' rather than '%'. Given the software
>> counts for URI-processing software and IPv6 software, that's
>> probably much easier than trying to force the non-escaping
>> '%' into URI syntax (already a full standard).
>
>IMO (admitting YMMV), URI-processing software and IPv6 software are
>both so deployed that we cannot simply make "this one is not fully
>deployed so fixing this side should be easier". I indeed made a
>similar argument about a year ago:
>http://www1.ietf.org/mail-archive/web/ipv6/current/msg03987.html
>
>In addition, while I might buy this argument if the proposed syntax in
>draft-fenner-... could avoid forcing special processing in
>URI-processing software, it actually doesn't. The fact is that
>"URI-processing software" will need modification anyway, whether we
>adopt the draft-fenner-... syntax or just allow the RFC4007 format.

Yes. But as Roy has explained, the main concern is the effect of this syntax on URI-processing software that isn't updated. We can't expect a user to know which software is updated and which is not.

>Meanwhile, requiring the existing tools that understand the RFC4007
>'%' format to support '+' effectively means deprecating the current
>description of RFC4007 and updating the RFC itself, since this is
>exactly the case when the proposed format defined in RFC4007 is
>expected to be used.
Well, it wouldn't be the first RFC to be updated. The URI spec was updated several times. And if zone ids in URIs are not an interoperability issue, then zone ids in other places shouldn't be an interoperability issue either.

>On the other hand, I'm not sure whether the 'special processing'
>required for the URI-processing software means requiring of the URI
>standard itself. If we regard this as a user interface issue for
>applications (see below), can't we regard the conversion from
>"http://[fe80::abcd%fxp0]/" to "http://[fe80::abcd]/ within the
>application as a "pre-processing before URI-processing", without
>breaking the URI standard? (I'm afraid this 'wording trick' is
>actually not acceptable by the URI community, but I'll see
>anyway...)

Well, there are indeed some processing steps that happen in that way. The best example I know is that it's possible to put a space in, e.g., a src attribute of an <img> tag, and browsers will just convert it to %20. The same happens in the address/location bar of a browser. But that is something that can happen on a uniform basis, with any URI. What you are asking for would be much more special, and would require careful parsing. And it would have to be added to *every* URI processor, because otherwise the '%' will confuse the further processing of the URI. But adding it to every processor isn't really possible, of course. Please note that '%' is the only character that has a special function in every part of a URI.

If it's only about changing "http://[fe80::abcd%fxp0]/" to "http://[fe80::abcd]/", I don't see why the user can't do that. And how many users are there actually who will use raw IPv6 addresses with zone identifiers in URIs? I'm all for making it easy for users, if there really are lots of them.
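To make the concern about non-updated processors concrete, here is a minimal sketch in Python. It assumes a generic URI processor that simply percent-decodes the host without knowing about zone identifiers; the function name `naive_decode_host` and the interface name `de0` are made up for illustration, and other decoders may treat malformed escapes differently from Python's.

```python
from urllib.parse import unquote_to_bytes


def naive_decode_host(host: str) -> bytes:
    """Percent-decode a host the way a zone-unaware URI processor might.

    Hypothetical helper: it stands in for any generic decoding step that
    has not been updated to treat '%' inside an IP literal specially.
    """
    return unquote_to_bytes(host)


# "%de" is a syntactically valid escape, so the (hypothetical) zone id
# "de0" silently turns into the single byte 0xDE followed by "0".
print(naive_decode_host("fe80::abcd%de0"))

# "%fx" is not a valid escape. Python's decoder leaves it untouched;
# a less careful decoder could produce something else entirely.
print(naive_decode_host("fe80::abcd%fxp0"))
```

Whether the zone survives therefore depends on the interface name and on the decoder in use, which is exactly what a user cannot be expected to predict.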
What we probably could do as a compromise would be to keep the 'gethostbyname' interface using '%', if that is already strongly established (new URI implementations can easily convert from a '+' in a URI to a '%' in a 'gethostbyname' call, in particular if our draft tells them to do that). On the other hand, visible notation would then move to '+' for overall user convenience.

I haven't seen any inherent arguments against the '+' from your side, and I don't think I have overlooked anything. If this is true, then the situation is actually quite asymmetric: RFC 4007 uses '%', but any other character would work as well (and libraries could easily accept more than one character, if that was necessary for backwards compatibility). On the other hand, RFC 3986 already uses '%' for something else, so that character is no longer available. Many others would work; '+' is only one example. If you want another one, that might work, too.

>> Also, this is not a matter of formality, it is a matter of
>> deployment. What if something like "http://[v1.fe80::abcd%fxp0]/"
>> suddenly gets converted into "http://[v1.fe80::abcd<0x0F>xp0]/"
>> (<0x0F> standing for a 0F byte, which is Shift In).
>
>This would be bad, of course. But I don't think that matter much
>because "http://[v1.fe80::abcd+fxp0]/" doesn't work either with
>today's URI parsers.

If "doesn't work" means "maybe doesn't resolve", then yes. This is true even for something like http://www.ietf.org. The network is never perfect. But as Roy described, for '%', we are looking at a much more varied, and troubling, pattern of failure. The same argument applies to the various schemes: no URI resolver is required to understand all schemes (how could it?).

Regards,    Martin.
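The compromise mentioned above (URIs carry '+', while the resolver interface keeps the RFC 4007 '%') could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `uri_zone_to_resolver` and its `uri_delim` parameter are hypothetical, '+' is just the example delimiter discussed in this thread, and no draft is being quoted here.

```python
def uri_zone_to_resolver(host_literal: str, uri_delim: str = "+") -> str:
    """Convert the URI form of a scoped address to the RFC 4007 form.

    Hypothetical helper: 'fe80::abcd+fxp0' (as it would appear inside the
    brackets of a URI) becomes 'fe80::abcd%fxp0' (as handed to the
    resolver). Addresses without a zone are returned unchanged.
    """
    address, sep, zone = host_literal.partition(uri_delim)
    if not sep:
        return host_literal  # no zone identifier present
    if not zone:
        raise ValueError("empty zone identifier")
    return f"{address}%{zone}"


print(uri_zone_to_resolver("fe80::abcd+fxp0"))
print(uri_zone_to_resolver("fe80::abcd"))
```

The conversion happens only at the boundary to the resolver, so the URI itself never contains a bare '%' that an older URI processor could misread as the start of an escape.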
Received on Tuesday, 8 November 2005 10:26:45 UTC