- From: Thomas Roessler <tlr@w3.org>
- Date: Thu, 19 Mar 2009 16:27:38 +0100
- To: fielding@gbiv.com, John Klensin <klensin@jck.com>, michelsu@microsoft.com, Martin Duerst <duerst@it.aoyama.ac.jp>
- Cc: Dan Connolly <connolly@w3.org>, "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, Richard Ishida <ishida@w3.org>, Henry Thompson <ht@inf.ed.ac.uk>, public-iri@w3.org
- Message-Id: <17B2D6B6-8CAD-489F-AB72-7847BC20C990@w3.org>
Gentlemen,

excuse the direct spam, please; for archival purposes, I'm also copying the public-iri list on this note.

http://lists.w3.org/Archives/Public/public-iri/

It seems like RFC 3987 is mandating the use of the UseSTD3ASCIIRules flag when using ToASCII to map an IRI into a URI; see:

http://tools.ietf.org/html/rfc3987#section-3.1

> Replace the ireg-name part of the IRI by the part converted using
> the ToASCII operation specified in section 4.1 of [RFC3490] on
> each dot-separated label, and by using U+002E (FULL STOP) as a
> label separator, with the flag UseSTD3ASCIIRules set to TRUE, and
> with the flag AllowUnassigned set to FALSE for creating IRIs and set
> to TRUE otherwise.

A quick search in various archives suggests that the genesis of the UseSTD3ASCIIRules flag in 3987 relates to what is now section 3.2.2 of RFC 3986 (at the time, section 3.2.2 of RFC 2396):

http://www.imc.org/idn/mail-archive/msg07277.html
http://tools.ietf.org/html/rfc3986#section-3.2.2

Interestingly, following through on the references from there effectively brings us to the name production in appendix B of RFC 952, http://tools.ietf.org/html/rfc952 -- which in turn forbids the double hyphen in a name. That's, of course, a really fine restriction on registrations, but it could (absurdly) be read to prohibit use of an A-label in URI references. That's certainly not a useful conclusion; the formal syntax in RFC 3986 is actually vague enough to permit them.

All this just proves that strict spec lawyering on permissible strings doesn't get us to a useful place here, and that there's probably a need to distinguish registration guidelines from what's permissible in a URI (or IRI) reference.

With that, I'm left to wonder what the UseSTD3ASCIIRules restriction in RFC 3987 is meant to achieve?
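To make the effect of the flag concrete, here is a minimal Python sketch of the UseSTD3ASCIIRules checks from step 3 of section 4.1 of RFC 3490, layered on top of the ToASCII function in Python's standard `encodings.idna` module (which, as far as I can tell, does not apply those checks itself). The helper names `violates_std3` and `to_ascii` and the sample label are assumptions made up for illustration; this is not a complete or authoritative IRI-to-URI mapping.

```python
from encodings.idna import ToASCII, nameprep


def violates_std3(label: str) -> bool:
    """Hypothetical helper: the two checks of RFC 3490, section 4.1, step 3."""
    # (a) No ASCII code points other than letters, digits and hyphen-minus.
    for ch in label:
        if ord(ch) < 0x80 and not (ch.isalnum() or ch == "-"):
            return True
    # (b) No leading or trailing hyphen-minus.
    return label.startswith("-") or label.endswith("-")


def to_ascii(label: str, use_std3_ascii_rules: bool) -> bytes:
    """Hypothetical wrapper: ToASCII on one label, with the flag made explicit."""
    # Nameprep is only applied when the label contains non-ASCII code points.
    prepared = label if label.isascii() else nameprep(label)
    if use_std3_ascii_rules and violates_std3(prepared):
        raise ValueError(f"label {label!r} fails the STD3 ASCII rules")
    return ToASCII(label)


if __name__ == "__main__":
    label = "_test0_\u03b1"  # illustrative label: underscores plus GREEK SMALL LETTER ALPHA
    print(to_ascii(label, use_std3_ascii_rules=False))
    try:
        to_ascii(label, use_std3_ascii_rules=True)
    except ValueError as err:
        print("rejected:", err)
```

With the flag off, the underscores pass through into the A-label unchanged; with the flag on, the same label is rejected before any Punycode encoding happens.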
My suspicion would be that we'd actually *not* want to set UseSTD3ASCIIRules when converting from an IRI reference to a URI reference, mostly in order to be conservative about unnecessary restrictions that might bite us later.

Finally, empirics: I've done some quick tests with the browsers that I run here (Firefox 3.1 beta3, Opera 9.64, Safari 4 beta [5528.16]). All of these were able to dereference a hyperlink to http://_test0_α.does-not-exist.org/, i.e., they do not actually set UseSTD3ASCIIRules.

Thoughts?

Thanks,
--
Thomas Roessler, W3C <tlr@w3.org>
Received on Thursday, 19 March 2009 15:27:52 UTC