- From: Graham Klyne <gk@ninebynine.org>
- Date: Wed, 12 May 2004 14:06:21 +0100
- To: Martin Duerst <duerst@w3.org>, public-iri@w3.org
- Cc: uri@w3.org
At 17:59 12/05/04 +0900, Martin Duerst wrote:
>Hello Graham,
>
>I have labeled this issue as convertASCII-30.
>
>
>At 12:02 04/05/10 +0100, Graham Klyne wrote:
>
>>Section 3.2:
>>
>>Is this really true (about always mapping back to the same URI)?:
>>[[
>>3.2 Converting URIs to IRIs
>>
>>   In some situations, it may be desirable to try to convert a URI into
>>   an equivalent IRI. This section gives a procedure to do such a
>>   conversion. The conversion described in this section will always
>>   result in an IRI which maps back to the URI that was used as an input
>>   for the conversion (except for potential case differences in
>>   percent-encoding). However, the IRI resulting from this conversion
>>   may not be exactly the same as the original IRI (if there ever was
>>   one).
>>]]
>>
>>In light of:
>>[[
>>   2) Convert all percent-encodings (% followed by two hexadecimal
>>      digits) except those corresponding to '%', characters in
>>      'reserved', and characters in US-ASCII not allowed in URIs, to the
>>      corresponding octets.
>>]]
>>
>>It seems to me that removing percent encodings for non-reserved and other
>>characters is a non-reversible transformation. I think that mapping back
>>to the original URI is only true under escape normalization, per rfc2396bis.
>
>Yes, good catch. I looked at the actual text that needs to be fixed.
>I can either add non-reserved ASCII characters to the 'except'
>clause in parentheses in the original text, or can change the
>procedure. Overall, in terms of edits, both need about the same
>work. Which one would you prefer?

I'm not sure. I think it's most important to remove the inconsistency. I
think that, in practice, this is an area which developers and users would
be well-advised to avoid.

#g
--

>It is clear that with or without removing percent-encodings for
>non-reserved ASCII characters, this can be done, and different
>usages may choose different variants, according to their needs.
>
>
>>Also, not knowing anything about bidi encodings, it's difficult for me to
>>tell if there's any possible interaction between this and the section 4
>>material on bidi sequences.
>
>There is some interaction as some characters and character
>combinations are excluded by the bidi section. I think the
>various cross-references within the text take care of this.
>There is also some interaction that with the conversion
>from URI to IRI, the display sequence of the components
>may change. But this will just happen automatically, this
>is not something the algorithm has to worry about.
>
>
>Regards, Martin.
>
>

------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact
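
The round-trip concern raised above can be illustrated with a small sketch. This is not the draft's procedure: decode_unreserved is a hypothetical, ASCII-only stand-in for step 2 that decodes a percent-encoding only when it encodes an unreserved ASCII character, and the unreserved set is assumed to be the one in RFC 3986 (the drafts under discussion may have differed). The example.org URI is made up for illustration.

```python
import re

# Assumption: the unreserved set as defined in RFC 3986.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(uri: str) -> str:
    """Hypothetical ASCII-only stand-in for step 2 of section 3.2.

    Decodes %XX only when the octet is an unreserved ASCII character;
    every other percent-encoding (reserved, '%', non-ASCII) is kept.
    """
    def repl(match: re.Match) -> str:
        ch = chr(int(match.group(1), 16))
        return ch if ch in UNRESERVED else match.group(0)

    return PCT.sub(repl, uri)


def iri_to_uri_ascii(iri: str) -> str:
    """IRI-to-URI mapping for ASCII-only input.

    The mapping only percent-encodes non-ASCII characters, so an
    all-ASCII IRI is returned unchanged. Non-ASCII input is rejected
    to keep the sketch small.
    """
    iri.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
    return iri


original = "http://example.org/%7Euser/a%2Fb"
iri = decode_unreserved(original)    # %7E is decoded, %2F (reserved) is kept
round_trip = iri_to_uri_ascii(iri)   # map the IRI back to a URI

# The round trip does not reproduce the input string character for character.
print(round_trip == original)

# The two URIs only compare equal after escape normalization.
print(decode_unreserved(round_trip) == decode_unreserved(original))
```

Under these assumptions the first comparison fails and the second succeeds, which is the sense in which mapping back to the original URI holds only under escape normalization.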
Received on Wednesday, 12 May 2004 09:25:46 UTC