- From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
- Date: Thu, 16 Sep 2004 18:22:37 +0900
- To: public-iri@w3.org
Hello Martin,

> I have tentatively closed this issue. Please see
> http://www.w3.org/International/iri-edit/diff-duerst-iri-last-draft.html
> for the overall changes, and tell me whether you are okay, as
> soon as possible.

Thank you for making the changes; they do address my concern. A few short comments below on the URI upgrading para, but no requests for further change. By all means close the issue.

Best regards

Stuart
--

> -----Original Message-----
> From: Martin Duerst [mailto:duerst@w3.org]
> Sent: 16 September 2004 08:01
>
> Hello Stuart,
>
> Sorry for the delay in responding to your mail.

Not a problem...

> At 10:49 04/08/19 +0100, Williams, Stuart wrote:
>
> >Hello Martin,
> >
> > > -----Original Message-----
> > > From: public-iri-request@w3.org
> > > [mailto:public-iri-request@w3.org] On Behalf Of Martin Duerst
> > > Sent: 18 August 2004 08:03
> > > To: Williams, Stuart
> > > Cc: public-iri@w3.org; Ted Hardie
> > > Subject: RE: URI schemes and IRI deployment (issue schemes-iri-38)
> > >
> > >
> ><snip/>
> > >
> > > Okay, it looks like I wasn't precise enough. Let me try a proposal
> > > for rewording, for the middle sentence in the paragraph above:
> > >
> > > "The main case where upgrading a scheme definition makes sense is
> > > when a scheme definition is strictly limited to the use of US-ASCII
> > > characters with no provision to include non-ASCII characters/octets
> > > via percent-encoding, or if a scheme definition currently uses
> > > highly scheme-specific provisions for the encoding of non-ASCII
> > > characters."
> > >
> > > Would that be better? Would the changes below still be necessary?
> > > I wouldn't want to replace the above with your text below, because
> > > your text below says nothing about schemes that may or may not have
> > > to be upgraded.
> >
> >Hmmm... if you were to include the reworded para below (I've agreed to
> >the rewording - fewer 'generally's) I think you could simply delete
> >this paragraph.
>
> I have thought about that. I think the current paragraph,
> talking about upgrades, is valuable in its own right,
> although it is not the issue you have raised. So I'll leave that in.

Ok... some comments below... but I can live with the para.

> >On the surface it is ok, but if I were to, say, think of upgrading an
> >existing scheme and saying that going forward, %-encoded characters
> >should be interpreted as UTF-8 I find myself wondering about backward
> >compatibility issues, where %-encoding may have been used in
> >identifiers without that intended interpretation. I'm not at all sure
> >how possible it is to 'upgrade' any URI scheme.
>
> If it was the case that %-encoding was used with a fixed
> character semantics that is different from UTF-8 (let's take
> iso-8859-1 as an example), then you are right [I don't know
> of such a scheme, but that doesn't mean that it might not
> exist.]. In practice, it may still be possible to add such
> semantics for newly created URIs because there are very good
> heuristics for UTF-8.

I think it's more common that the character set/encoding is just not known.

> Also, if %-encoding was used without any defined character
> semantics (typical example: HTTP), then it would be
> impossible to force UTF-8 character semantics on %-encoding.
> Again, in practice, a scheme definition may be updated to say
> something like 'if it looks like UTF-8, assume it's UTF-8'.

Personally I'd be conservative on both these counts... if the scheme doesn't give you a mechanism to know the character set/encoding, don't guess.

> Anyway, that's why the text in the draft is very careful to
> limit this to the case where a scheme (or a part thereof)
> does not allow %-encoding, or uses other conventions for encoding
> non-ASCII characters.
> In these cases, %-encoding is essentially added as new syntax
> to the scheme. The benefits of extending the syntax of a
> scheme have to be judged carefully, but it's not something
> that is a priori impossible.

Ok, I can now see the intent behind the "...strictly limited to the use of US-ASCII characters with no provision to include non-ASCII characters/octets via percent-encoding,.." That seems to me to be a small set of schemes too. Do any actually prohibit the use of %-encoding?

> > > "URI schemes can impose restrictions on the syntax of
> > > scheme-specific URIs, i.e. URIs that are admissible under the generic
> > > URI syntax [RFCYYYY] may not be admissible due to narrower syntactic
> > > constraints imposed by a URI scheme specification. URI scheme
> > > definitions cannot broaden the syntactic restrictions of the generic
> > > URI syntax, otherwise it would be possible to generate URIs that
> > > satisfied the scheme specific syntactic constraints without
> > > satisfying the syntactic constraints of the generic URI syntax.
> > > However, additional syntactic constraints imposed by URI scheme
> > > specifications are *indirectly* applicable to IRI since the
> > > corresponding URI resulting from the mapping defined in Section 3.1
> > > MUST be a valid URI under the syntactic restrictions of generic URI
> > > syntax and any narrower restrictions imposed by the corresponding
> > > URI scheme specification."
> >
> >Inclusion of this paragraph, as reworded above, would address my concern.
>
> I have included this paragraph. I think this is material that
> should end up in the 'guidelines for new URI schemes'
> or whatever it will be called, and once it ends up there, we
> may be able to remove it from here, but for the moment, it
> doesn't hurt.
>
> >
> >Well... I think it needs to be clear to readers of the IRI spec that no
> >magic happens that automatically enables them to create schemes that
> >allow the *direct* inclusion of a wider range of characters in scheme
> >definitions.
> >I made my initial comment after a discussion with Tim Kindberg wrt the
> >tag: URI scheme in draft. He was confused about what he could/could not
> >do wrt internationalisation on defining that scheme. For his
> >purposes he would (I believe) like to be able to allow the direct use
> >of internationalized characters, and the %encoding. Passed around as
> >IRI Tim would get what he wants (provided he makes appropriate
> >statements/references about %encoding and UTF-8).
>
> I agree. Your pointer to Tim's draft helped me a lot
> understanding what you were looking for.
>
> I have tentatively closed this issue. Please see
> http://www.w3.org/International/iri-edit/diff-duerst-iri-last-draft.html
> for the overall changes, and tell me whether you are okay, as
> soon as possible.
>
> Regards, Martin.
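
For readers who want something concrete, the two mechanisms discussed in this thread (turning non-ASCII characters into %-encoded UTF-8 octets, and the 'if it looks like UTF-8, assume it's UTF-8' heuristic) might be sketched roughly as below. This is only an illustration: the function names are hypothetical, and the Section 3.1 mapping in the draft involves more than the single percent-encoding step shown here.

```python
from urllib.parse import unquote_to_bytes


def percent_encode_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with its %-encoded UTF-8 octets.

    Hypothetical helper: covers only the percent-encoding step, not the
    full IRI-to-URI mapping.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # ASCII characters pass through unchanged
        else:
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


def looks_like_utf8(component: str) -> bool:
    """Heuristic: do the %-decoded octets form a valid UTF-8 sequence?"""
    octets = unquote_to_bytes(component)
    try:
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(percent_encode_non_ascii("/caf\u00e9"))  # non-ASCII e-acute becomes two octets
print(looks_like_utf8("/caf%C3%A9"))           # valid UTF-8 octets
print(looks_like_utf8("/caf%E9"))              # iso-8859-1 style octet, not valid UTF-8
```

Note that a pure-ASCII component passes the check trivially, so a positive result says nothing about which character encoding the creator of the identifier intended; the check can only rule UTF-8 out, which is one reason for the caution expressed above about guessing.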
Received on Thursday, 16 September 2004 09:22:56 UTC