- From: Martin Duerst <duerst@w3.org>
- Date: Sun, 05 Sep 2004 09:56:57 +0900
- To: Tim Kindberg <timothy@hplb.hpl.hp.com>(by way of Martin Duerst <duerst@w3.org>), uri@w3.org
Hello Tim,

At 10:58 04/09/04 +0900, Tim Kindberg wrote:
>Hi Martin,
>
>Thanks for the detailed comments. I'll look over all the suggested
>clarifications to the text in general but it's the URI/IRI issue that most
>concerns me. Every time I think I've understood the right way to treat
>this issue, you say something that suggests otherwise. It's happened again.
>
>In a previous exchange with you I wrote:
>
> > I was looking for a way for tag to be internationalisation-compatible
> > while adding as little as possible to the spec except in the way of
> > external references to internationalisation specs. I had a closer look
> > at draft-duerst-iri-09 with that in mind.
> >
> > In Section 3 intro and 3.1, you distinguish identifiers that are used
> > for resource retrieval from those that aren't. Then in 5.1 you make
> > the distinction again, in the context of string comparison.
> >
> > Well, tags are identifiers that are *not* used for resource retrieval.
> > So it seems to me that we fall squarely into the class of identifiers
> > for which it is "not necessary to map the IRI to a URI".
> >
> > This matches my conception of how to treat tags from an
> > internationalisation perspective: they always and only appear in their
> > IRI forms. So a Chinese tag would look like the example I sent
> > previously -- as a string of Chinese characters (with our separators in
> > between). There is no need to map that into a (2396bis) URI.
> >
> > So I propose to add a little text on Internationalisation as an
> > addendum to our syntax, referring to your IRI draft and saying that our
> > domain name component may be replaced by an IDN (refer to RFC3490);
> > that, when the left-hand side of email addresses gets an international
> > standard, that could be used instead; and that the "specific" part of
> > the tag may be any string of "ipchar" (your draft).
> >
> > I don't think I need to mention percent-encoded UTF-8 (or such) at all.
> > I know the emphasis in the syntactic detail is then rather one-sided,
> > but I'm trying to be pragmatic.
>
>In response to that, you seemed to agree with me. But in your comments on
>draft 06 below you have put pct-encoded syntax in!

Going back to our previous (private) mail, it looks like I read the
above paragraph about UTF-8 in the context of IRIs, as discussed in the
paragraph before it. For IRIs as such, and in particular in cases such
as TAG, there is no need to talk about percent-encoding, although
percent-encoding is allowed in IRIs in general.

>What is it I'm missing in thinking that (URI) tags containing pct-encoded
>characters are:
>(a) self-defeating -- tags are supposed to be tractable for humans

Yes. The pct-encoded production in the tag URI syntax is there only to
define the tag IRI syntax (see below). I have provided additional text
in my comments to make this clear.

>(b) redundant -- it's never necessary to turn a tag containing, say,
>Chinese characters into URI form; we need be sure only that it's in
>canonical form and thus comparable with other tags.

I more or less agree with you. The main reason for defining URIs with
pct-encoded octets is that, in order to use tags with IRIs, we have
to define a tag URI scheme (there is no process to define IRI schemes;
IRI schemes don't exist independently of URI schemes). And to define
what is and is not allowed in a tag IRI, we have to define the syntax
of tag URIs and rely on the conversion at
http://www.w3.org/International/iri-edit/draft-duerst-iri.html#mapping
to derive what the acceptable tag IRIs are.

The section cited above says that the conversion from IRIs to URIs has
two purposes:

a) Syntactic
b) Interpretational (for resolution)

You are correct that b) doesn't apply to tags, but a) still applies.
See also my comments to Larry's mail.

Regards, Martin.
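
To illustrate the syntactic purpose a) above, here is a minimal sketch of the core step of the IRI-to-URI mapping: each non-ASCII character is encoded as UTF-8 and each resulting octet is percent-encoded. It is a simplification, not the full procedure of the IRI draft (normalization and the special handling of internationalized domain names are left out). The function name iri_to_uri and the sample tag are hypothetical and chosen only for illustration.

```python
def iri_to_uri(iri: str) -> str:
    """Simplified sketch: percent-encode the UTF-8 octets of every
    non-ASCII character; ASCII characters are copied unchanged.
    (Assumption: no normalization or IDN handling is performed.)"""
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)


# Hypothetical tag IRI, used only as an example.
tag_iri = "tag:example.org,2004:caf\u00e9"
tag_uri = iri_to_uri(tag_iri)
print(tag_uri)
```

A tag IRI would then be acceptable exactly when the URI it maps to matches the tag URI syntax, which is why that syntax needs a pct-encoded production even if tags are never written in that form.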
Received on Sunday, 5 September 2004 00:57:02 UTC