Re: draft-duerst-iri-07.txt: 2 week mailing list last call from Graham Klyne on 2004-05-19 (uri@w3.org from May 2004)

From: Graham Klyne <gk@ninebynine.org>
Date: Wed, 19 May 2004 10:10:29 +0100
To: Martin Duerst <duerst@w3.org>, public-iri@w3.org
Cc: uri@w3.org
Message-Id: <5.1.0.14.2.20040519101014.02c55810@127.0.0.1>
Martin,

For the process, this looks fine to me.

#g
--

At 16:52 19/05/04 +0900, Martin Duerst wrote:
>Hello Graham,
>
>At 14:06 04/05/12 +0100, Graham Klyne wrote:
>
>>At 17:59 12/05/04 +0900, Martin Duerst wrote:
>>>Hello Graham,
>>>
>>>I have labeled this issue as convertASCII-30.
>>>
>>>
>>>At 12:02 04/05/10 +0100, Graham Klyne wrote:
>>>
>>>>Section 3.2:
>>>>
>>>>Is this really true (about always mapping back to the same URI)?:
>>>>[[
>>>>3.2  Converting URIs to IRIs
>>>>
>>>>    In some situations, it may be desirable to try to convert a URI into
>>>>    an equivalent IRI. This section gives a procedure to do such a
>>>>    conversion. The conversion described in this section will always
>>>>    result in an IRI which maps back to the URI that was used as an input
>>>>    for the conversion (except for potential case differences in
>>>>    percent-encoding). However, the IRI resulting from this conversion
>>>>    may not be exactly the same as the original IRI (if there ever was
>>>>    one).
>>>>]]
>>>>
>>>>In light of:
>>>>[[
>>>>    2) Convert all percent-encodings (% followed by two hexadecimal
>>>>       digits) except those corresponding to '%', characters in
>>>>       'reserved', and characters in US-ASCII not allowed in URIs, to the
>>>>       corresponding octets.
>>>>]]
>>>>
>>>>It seems to me that removing percent encodings for non-reserved and 
>>>>other characters is a non-reversible transformation.  I think that 
>>>>mapping back to the original URI is only true under escape 
>>>>normalization, per rfc2396bis.
>>>
>>>Yes, good catch. I looked at the actual text that needs to be fixed.
>>>I can either add non-reserved ASCII characters to the 'except'
>>>clause in parentheses in the original text, or can change the
>>>procedure. Overall, in terms of edits, both need about the same
>>>work. Which one would you prefer?
>>
>>I'm not sure.  I think it's most important to remove the inconsistency.
>
>I have decided that it is better to also remove spurious percent-encodings
>of non-reserved US-ASCII characters, because probably the main use of
>the conversion from URIs to IRIs is for presentation purposes.
>
>I have changed
>
>    (except for potential case differences in percent-encoding)
>
>to
>
>    (except for potential case differences in percent-encoding
>     and for potential percent-encoded unreserved characters)
>
>I have also changed
>
>    This procedure will convert as many percent-encoded non-ASCII
>    characters as possible to characters in an IRI.
>to
>    This procedure will convert as many percent-encoded characters
>    as possible to characters in an IRI.
>
>I hope this addresses your concern.
>
>
>Regards,   Martin.
>
>>I think that, in practice, this is an area which developers and users 
>>would be well-advised to avoid.
>>
>>#g
>>--
>>
>>>It is clear that with or without removing percent-encodings for
>>>non-reserved ASCII characters, this can be done, and different
>>>usages may choose different variants, according to their needs.
>>>
>>>
>>>>Also, not knowing anything about bidi encodings, it's difficult for me 
>>>>to tell if there's any possible interaction between this and the 
>>>>section 4 material on bidi sequences.
>>>
>>>There is some interaction as some characters and character
>>>combinations are excluded by the bidi section. I think the
>>>various cross-references within the text take care of this.
>>>There is also some interaction that with the conversion
>>>from URI to IRI, the display sequence of the components
>>>may change. But this will just happen automatically, this
>>>is not something the algorithm has to worry about.
>>>
>>>
>>>Regards,    Martin.
>>
>>------------
>>Graham Klyne
>>For email:
>>http://www.ninebynine.org/#Contact
>
>------------
>Graham Klyne
>For email:
>http://www.ninebynine.org/#Contact
Received on Wednesday, 19 May 2004 05:52:59 UTC