
RE: Are IDNs allowed in http IRIs?

From: Michel Suignard <michelsu@windows.microsoft.com>
Date: Sun, 21 Mar 2004 18:37:55 -0800
Message-ID: <84DD35E3DD87D5489AC42A59926DABE90667D93C@WIN-MSG-10.wingroup.windeploy.ntdev.microsoft.com>
To: <public-iri@w3.org>, <uri@w3.org>

Larry, it is either that (scheme-specific knowledge for the schemes that use 'host' when mapping) or the IRI spec needs an appendix that updates all the RFCs that currently use 'host' in their authority component when specifying scheme syntax (such as RFC 2616 for http). I really don't mind either way, although the second alternative could be quite a lot of work. BTW, I am not sure I understand the following part of your message:
<<
For example, a 'data' IRI might need a separate specification for the default MIME type (since text/plain defaults to US-ASCII). And an IRI URN might also have a different mapping, etc.
>>
Could you elaborate?

Michel
> -----Original Message-----
> From: Larry Masinter [mailto:LMM@acm.org] 
> 
> It seems like a pretty big change to the IRI concept to have 
> IRI -> URI transformations use scheme-specific knowledge.
> Formerly, IRI -> URI transformation was specified as scheme 
> independent.
> 
> I understand that this seems necessary because of IDN, but 
> it's a big concern.
> 
> And to use "SHOULD" would leave us subject to the 
> indeterminate knowledge of which method is going to be used.
> 
> If you believe that IRI->URI needs to be scheme specific, 
> then is it really tractable to define IRI as a generic concept?
> Do we need a separate spec for "http:", "mailto:", "ftp:" 
> IRIs, where each specifies the punycode vs. hex-encoding of 
> the various parts? For example, a 'data' IRI might need a 
> separate specification for the default MIME type (since 
> text/plain defaults to US-ASCII). And an IRI URN might also have 
> a different mapping, etc.
> 
> Larry
> 
> 
> > 
> > 
> > Adam, I think you have a valid point. I would however make a simpler
> > suggestion, which is twofold:
> > 
> > - introduce the concept of an IRI used as the presentation element of a
> > URI protocol element. In that sense http://josé.example.net/ is a
> > presentation element for the following protocol element
> > http://xn--jos-dma.example.net/ and, as you noted,
> > http://jos%C3%A9.example.net/ is not a correct URI (per RFC 2616,
> > which refers to host, itself defined in RFC 2396). I have suggested text
> > in that sense to the IRI main editor (Martin). Having the concept of
> > presentation element validates http IRIs, which exist de facto
> > whether we like it or not.
> > 
> > - add text in the IRI spec saying the following:
> > <<
> > When an IRI is converted to a URI, the conversion SHOULD use
> > scheme-specific knowledge to convert appropriate components where the
> > scheme syntax prevents the usage of percent-encoded text into such
> > components. Lack of scheme-specific knowledge (or failure to use it)
> > can cause valid IRIs to be converted to invalid URIs that contain
> > percent-encoded non-ASCII text where they are not permitted.
> > >>
> > It is my opinion that anybody in their right mind would implement IRI to
> > URI mapping considering all the schemes where 'host' is used and map
> > accordingly (i.e. use Punycode).
> > My text avoids direct reference to ACE, which is in my opinion
> > unnecessary in the IRI spec, and also makes the suggestion to use
> > scheme-aware mapping much stronger (SHOULD instead of MAY). It is a
> > SHOULD instead of a MUST simply because a scheme may be updated in the
> > future, eventually making the scheme awareness unnecessary.
> > 
> > Michel
> 
> 
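
To make the two mappings discussed in this thread concrete, here is a minimal Python sketch, not a conforming implementation. The function name iri_to_uri, the scheme_aware flag, and the HOST_BASED_SCHEMES set are hypothetical names chosen for illustration. It assumes Python's built-in "idna" codec (IDNA 2003 ToASCII) is an acceptable stand-in for the Punycode step, and it ignores userinfo and IPv6 literals.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Assumption: an illustrative, incomplete list of schemes whose
# authority component uses the 'host' production.
HOST_BASED_SCHEMES = {"http", "https", "ftp"}


def iri_to_uri(iri, scheme_aware=True):
    """Map an IRI to a URI, optionally using scheme-specific knowledge."""
    parts = urlsplit(iri)
    host = parts.hostname or ""

    if scheme_aware and parts.scheme in HOST_BASED_SCHEMES:
        # Scheme-specific step: convert the host with IDNA ToASCII
        # (Punycode labels prefixed with "xn--").
        host = host.encode("idna").decode("ascii")
    else:
        # Scheme-independent step: percent-encode the UTF-8 bytes.
        host = quote(host, safe="")

    netloc = host if parts.port is None else f"{host}:{parts.port}"

    # Every other component gets generic UTF-8 percent-encoding;
    # "%" is kept safe so existing escapes are not double-encoded.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="/?&=%")
    fragment = quote(parts.fragment, safe="/?%")

    return urlunsplit((parts.scheme, netloc, path, query, fragment))
```

Called on http://josé.example.net/, the scheme-aware path yields http://xn--jos-dma.example.net/, while scheme_aware=False yields http://jos%C3%A9.example.net/, the form the thread identifies as not a correct http URI.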
Received on Sunday, 21 March 2004 21:38:34 GMT
