- From: Adam M. Costello BOGUS address, see signature <BOGUS@BOGUS.nicemice.net>
- Date: Wed, 18 Feb 2004 11:35:40 +0000
- To: uri@w3.org, public-iri@w3.org
I wrote:

> Another (more general) way out is to introduce an explicit
> half-way-house between IRIs and URIs.

After some further thought, I would make a few tweaks to that idea.

First, percent-encoding would always be allowed in all components of all IRIs; individual schemes would be unable to prohibit percent-encoding anywhere.

Second, if an individual scheme restricts a component to contain only a certain subset of Unicode characters (for example, the ASCII subset), scheme-specific IRI consumers would be required to check the component before using it, and fail gracefully if any characters are found outside the subset. (That would prevent IRIs from suffering some of the problems we are now seeing with URIs. In URIs, percent-encoding was prohibited in the host component, and non-ASCII was prohibited in the host component, and there was no requirement telling URI consumers what to do if they should find either of those things in the host component, so now we have different implementations behaving differently when they encounter such things.)

The rule for converting an IRI to a URI would be:

1) If you recognize the scheme, then verify that no component contains characters that it's not supposed to contain. If the verification succeeds, then apply whatever conversions are appropriate for each component.

2) If the verification failed, or if you didn't recognize the scheme, then perform the generic conversion to percent-encoded UTF-8 as described in the IRI draft, and prepend the prefix i- to the scheme.

(The prefix i- is a better choice than my previous suggestion of i: because it is less prone to interact strangely with relative references. The prefix could be registered as an "alternate tree" as described in RFC-2717.)

To resolve an i-* URI, you conceptually convert it back to an IRI, then redo the IRI-to-URI conversion using the scheme-specific knowledge that was lacking in the earlier IRI-to-URI conversion.
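As an illustration only, the conversion and resolution rules above might be sketched in Python roughly as follows. The function names, the KNOWN_SCHEMES table, and the host rule used for http (letters, digits, "-" and "." only, converted with Python's built-in IDNA codec) are assumptions made for this example, not part of the proposal or of the IRI draft. For brevity the sketch handles only "//host/path" forms without port or userinfo, and it does not treat percent-encoded input in the host specially.

```python
import re


def percent_encode_non_ascii(text):
    """Generic conversion: percent-encode the UTF-8 bytes of non-ASCII characters."""
    return "".join(
        ch if ord(ch) < 128
        else "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))
        for ch in text
    )


def convert_http(rest):
    """Hypothetical scheme-specific rule for the part after "http:".

    Raises ValueError when verification of the host fails.
    """
    if not rest.startswith("//"):
        raise ValueError("expected //host")
    host, slash, path = rest[2:].partition("/")
    # Verification: assumed subset of allowed host characters.
    if not all(ch.isalnum() or ch in "-." for ch in host):
        raise ValueError("host contains a disallowed character")
    try:
        ascii_host = host.encode("idna").decode("ascii")
    except UnicodeError as exc:
        raise ValueError("host is not a valid domain name") from exc
    # Component-specific conversions: IDNA for the host, UTF-8 escapes elsewhere.
    return "//" + ascii_host + slash + percent_encode_non_ascii(path)


# Hypothetical table of schemes this particular converter understands.
KNOWN_SCHEMES = {"http": convert_http}


def iri_to_uri(iri, handlers=KNOWN_SCHEMES):
    scheme, _, rest = iri.partition(":")
    handler = handlers.get(scheme.lower())
    if handler is not None:
        try:
            return scheme + ":" + handler(rest)      # rule 1
        except ValueError:
            pass                                     # verification failed
    return "i-" + percent_encode_non_ascii(iri)      # rule 2


# Runs of escapes for bytes >= 0x80, i.e. candidates for non-ASCII UTF-8.
_NON_ASCII_ESCAPES = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")


def _decode_run(match):
    raw = bytes.fromhex(match.group(0).replace("%", ""))
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return match.group(0)  # leave undecodable escapes untouched


def resolve_i_uri(uri, handlers=KNOWN_SCHEMES):
    """Convert an i-* URI back to an IRI, then redo the conversion."""
    if not uri.lower().startswith("i-"):
        return uri
    iri = _NON_ASCII_ESCAPES.sub(_decode_run, uri[2:])
    return iri_to_uri(iri, handlers)
```

Under these assumptions, a converter that does not know http (handlers={}) would turn http://bücher.example/ into i-http://b%C3%BCcher.example/, and a later consumer that does know http could pass that through resolve_i_uri to obtain the scheme-specific form.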
Of course an implementation might use a more direct route.

AMC
http://www.nicemice.net/amc/
Received on Wednesday, 18 February 2004 06:35:41 UTC