Re: draft-kindberg-tag-uri

Hello Tim,

At 10:58 04/09/04 +0900, Tim Kindberg wrote:

>Hi Martin,
>
>Thanks for the detailed comments. I'll look over all the suggested 
>clarifications to the text in general but it's the URI/IRI issue that most 
>concerns me. Every time I think I've understood the right way to treat 
>this issue, you say something that suggests otherwise. It's happened again.
>
>In a previous exchange with you I wrote:
> > I was looking for a way for tag to be internationalisation-compatible
> > while adding as little as possible to the spec except in the way of
> > external references to internationalisation specs. I had a closer look
> > at draft-duerst-iri-09 with that in mind.
> >
> > In Section 3 intro and 3.1, you distinguish identifiers that are used
> > for resource retrieval from those that aren't.  Then in 5.1 you make
> > the distinction again, in the context of string comparison.
> >
> > Well, tags are identifiers that are *not* used for resource retrieval.
> > So it seems to me that we fall squarely into the class of identifiers
> > for which it is "not necessary to map the IRI to a URI".
> >
> > This matches my conception of how to treat tags from an
> > internationalisation perspective:  they always and only appear in their
> > IRI forms. So a Chinese tag would look like the example I sent
> > previously -- as a string of Chinese characters (with our separators in
> > between). There is no need to map that into a (2396bis) URI.
> >
> > So I propose to add a little text on Internationalisation as an
> > addendum to our syntax, referring to your IRI draft and saying that our
> > domain name component may be replaced by an IDN (refer to RFC3490);
> > that, when the left-hand side of email addresses gets an international
> > standard, that could be used instead; and that the "specific" part of
> > the tag may be any string of "ipchar" (your draft).
> >
> > I don't think I need to mention percent-encoded UTF-8 (or such) at all.
> > I know the emphasis in the syntactic detail is then rather one-sided,
> > but I'm trying to be pragmatic.
>
>In response to that, you seemed to agree with me. But in your comments on 
>draft 06 below you have put pct-encoded syntax in!

Going back to our previous (private) mail, it looks like I read the
above paragraph about UTF-8 in the context of the paragraph on IRIs
just before it. For IRIs as such, and in particular in cases such as
tag, there is no need to talk about percent-encoding, although
percent-encoding is allowed in IRIs in general.


>What is it I'm missing in thinking that (URI) tags containing pct-encoded 
>characters are:
>(a) self-defeating -- tags are supposed to be tractable for humans

Yes. The pct-encoded production in the tag URI syntax is there only to
define tag IRI syntax (see below). I have provided additional text in my
comments to make this clear.


>(b) redundant -- it's never necessary to turn a tag containing, say, 
>Chinese characters into URI form; we need be sure only that it's in 
>canonical form and thus comparable with other tags.

I more or less agree with you. The main reason for defining URIs
with pct-encoded octets is that, in order to use tags with IRIs,
we have to define a tag URI scheme (there is no process to define
IRI schemes; IRI schemes don't exist independently of URI schemes).

And to define what is allowed and what is not in a tag IRI, we have
to do that by defining the syntax of tag URIs and relying on the
conversion at
http://www.w3.org/International/iri-edit/draft-duerst-iri.html#mapping
to derive what the acceptable tag IRIs are. The above-cited section says
that the conversion from IRIs to URIs has two purposes:
a) Syntactic
b) Interpretational (for resolution)
You are correct that b) doesn't apply to tags, but a) still applies.
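
To illustrate the syntactic purpose a), here is a minimal Python sketch
of the core of the IRI-to-URI mapping. It is not the normative algorithm
from the IRI draft: it assumes the input is already a Unicode string,
simply percent-encodes every non-ASCII character as its UTF-8 octets,
and ignores IDN handling of the domain name. The function name
iri_to_uri and the example tag are hypothetical.

```python
def iri_to_uri(iri: str) -> str:
    """Percent-encode the non-ASCII characters of an IRI as UTF-8 octets.

    Simplifying assumption: every character outside US-ASCII is encoded,
    and ASCII characters (including existing '%' escapes) pass through
    unchanged. The IRI draft restricts which characters are allowed at
    all; that check is omitted here.
    """
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            # ASCII characters are already legal URI characters or
            # existing escapes, so keep them as they are.
            out.append(ch)
        else:
            # One %XX triplet per UTF-8 octet of the character.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


# Hypothetical tag IRI with one non-ASCII character in the specific part.
print(iri_to_uri("tag:example.com,2004:caf\u00e9"))
```

A string would then be an acceptable tag IRI exactly when the result of
this mapping matches the tag URI syntax, which is why that syntax needs
pct-encoded even though nobody is expected to write tags in that form.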

See also my comments to Larry's mail.

Regards,    Martin.

Received on Sunday, 5 September 2004 00:57:02 UTC