- From: Martin Duerst <duerst@w3.org>
- Date: Sun, 05 Sep 2004 10:15:51 +0900
- To: Larry Masinter <LMM@acm.org>, uri@w3.org
- Cc: Tim Kindberg <timothy@hplb.hpl.hp.com>
Hello Larry,

Many thanks for your comments, which I think are quite to the point.

At 10:52 04/09/04 -0700, Larry Masinter wrote:

> > What is it I'm missing in thinking that (URI) tags containing
> > pct-encoded characters are:
> > (a) self-defeating -- tags are supposed to be tractable for humans
> > (b) redundant -- it's never necessary to turn a tag containing, say,
> > Chinese characters into URI form; we need be sure only that it's in
> > canonical form and thus comparable with other tags.
>
>This makes it look like 'tag' doesn't have general applicability
>as a URI, as it can not generally be used in contexts that accept
>URIs but do not accept IRIs.
>
>Perhaps, in the transition from URIs to IRIs, it would
>make sense to allow new URI schemes to be registered that have
>this property, but it seems unnecessarily limiting.

Yes. I have tried to mitigate this problem a bit by suggesting
to Tim that the following rules be added/clarified in the draft:

- When generating TAGs (either US-ASCII only or beyond), never
  use %-encoding.
- When converting back from a place where only URIs are allowed
  (e.g. your example below), try to convert from URIs back to
  IRIs (i.e. get rid of the %-encoding).

This doesn't solve the problem completely, but hopefully does
alleviate it. It's still a good idea to say that TAGs are designed
to work best in contexts where IRIs are allowed.

>It might be useful, as an informational adjunct to the IRI
>draft, to do a survey of URI contexts and the state of
>application for use of IRIs in those contexts, e.g., within
>HTML, inside email headers, in SIP, etc.
>
>For example, RFC 3106, ECML, has a "URI indicating version
>of this set of fields." Can this actually be an IRI? I don't
>think RFC 3106 allows that. Can you use a Chinese 'tag' here?
>Not unless you allow pct-encoded characters.
>
>I just picked ECML at random;

Yes, it's probably not a very important example, as the number
of versions for ECML won't be very big, and the newly defined
version URIs will most probably be in all-ASCII for world-wide
usage.

>I think there are lots of other
>protocols that use 'URI' and would need their definition
>to be upgraded to allow 'IRI'.

Yes.

Regards, Martin.
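
The round trip behind the two rules above can be sketched roughly as follows. This is a minimal illustration rather than the conversion defined in the IRI draft: the tag value is a made-up example, the helper names `iri_to_uri` and `uri_to_iri` are hypothetical, and the sketch ignores details such as Unicode normalization and characters that should stay escaped.

```python
import re
from urllib.parse import quote

# Runs of percent-encoded bytes in the range 0x80-0xFF (non-ASCII UTF-8 bytes).
_PCT_NON_ASCII = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def iri_to_uri(iri: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8; leave ASCII untouched."""
    return "".join(c if ord(c) < 128 else quote(c, safe="") for c in iri)


def uri_to_iri(uri: str) -> str:
    """Undo %-encoding for non-ASCII byte runs that form valid UTF-8.

    ASCII escapes such as %2F are left alone, since decoding them could
    change the meaning of the identifier.
    """
    def decode(match: re.Match) -> str:
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            return match.group(0)  # not valid UTF-8: keep the escapes

    return _PCT_NON_ASCII.sub(decode, uri)


# Hypothetical tag used only for illustration.
tag_iri = "tag:example.org,2004:中文"
tag_uri = iri_to_uri(tag_iri)   # form needed in a URI-only context
restored = uri_to_iri(tag_uri)  # form to compare against other tags
```

Comparing tags after `uri_to_iri` rather than in their escaped form keeps a tag that passed through a URI-only context comparable with the tag as originally minted.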
Received on Sunday, 5 September 2004 01:16:04 UTC