- From: Joseph Anthony Pasquale Holsten <joseph@josephholsten.com>
- Date: Mon, 4 Jan 2010 15:12:01 -0600
- To: uri@w3.org
"Charles Lindsey" <chl@clerew.man.ac.uk> said: > Draft-ellermann-news-nntp-uri-11.txt is currently going through AUTH48 > and, since Frank Ellermann seems not to have been heard from for more > than a year, and cannot be contacted, I am getting the job of seeing > what needs to be done (most notably changes necessitated by the AUTH48 > changes in RFC 5536). Sorry to hear Ellerman hasn't turned up. I'm glad you're pushing forward. > I find the question of just what needs to be percent-emcoded is hard to > deduce from RFC 3986. Clearly, anything in <gen-delims> MUST be > percent-encoded except when used as delimiters, so that agents can > divide a URI into scheme, authority, path, query, and fragment > components even before they recognise that it is a news or nntp URI. > But is it REQUIRED for the <sub-delims> if the particular scheme does > not use any of them as delimiters? RFC 3986 seems to imply not, so I > would expect that in > news:foo@bar.!#$%&'*+/=?^`{|}.example > (yes, "bar.!#$%&'*+/=?^`{|}.example" is a valid <dot-atom-text> and > hence can occur in a Message-ID) I would have to percent-encode the > '#'. '/' and '?', but not the others. Frank seems to have taken the > view that all <sub-delims> need to be encoded, though he does at one > point permit '*' to appear unencoded (and it was indeed explicitly > allowed in RFC 1738), which appears to be inconsistent wuth his stance > elsewhere > > And he also includes an example > news://news.gmane.org/p0624081dc30b8699bf9b@%5B10.20.30.108%5D > where I would have thought he could have shown > news://news.gmane.org/p0624081dc30b8699bf9b@[10.20.30.108] > > So exactly what latitude does RFC 3986 permit in these situations? If you do not require expressing any of reserved in your segments, you have no need to allow percent encoding in the definitions of those segments. Sub-delims don't need to be percent encoded unless you are using them as delimiters. 
But practically, you need to write the definitions to allow percent-encoding in all your segments. Looking at your section 4, your news: syntax is quite busted. At the moment, it does not allow percent-encoding of characters that don't have to be encoded. I can appreciate not wanting to allow "." to be percent-encoded in mid-left, but disallowing it in mid-atext is just asking for naive implementors to build invalid news URIs. RFC 3986 §2.4 explicitly mentions that, 'For example, the octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation.'

IMHO, it's often not worth defining these things quite so formally at the URI level. I'd rather you just say that an article must be a (not necessarily completely) percent-encoded representation of a usefor msg-id-core than jump through the hoops you're jumping through now. Few people actually write their parsers to the grammars in these specs, so they'll usually be catching this error later in processing. I figure you're dealing with existing implementations, so it's a better use of time to point them at usefor and mention any caveats.

If you're going to be rigorous, then you'll need to define segments like article and newsgroups with the exact same syntax as usefor, being liberal in which delimiters are allowed. They should also include the entire range of allowable percent-encoded triplets. Then list all the things that they SHOULD NOT put into URIs, like percent-encoding something in ALPHA.

Which brings us to a higher-level critique of the operational semantics defined by this spec. Are these URIs just for identifying articles, like a urn:uuid: or urn:sha1:? Should a user agent be able to retrieve arbitrary articles? What happens when I try to access the mythical <news:foo@bar.!#$%&'*+/=?^`{|}.example>? Does this refer to NNTP messages being sent? What are the error conditions that may be caused by the impedance mismatch between URIs and NNTP?
I see some mention of failure in the security considerations, so that's good. Thinking about how user agents actually handle URIs is the best guidance for writing these specs.

As for your two examples, both open fine in my newsreader. I'd hate for the spec to disagree without just cause.

-- 
Joseph Holsten
http://josephholsten.com
mailto:joseph@josephholsten.com
tel:+1-918-948-6747
Received on Monday, 4 January 2010 21:25:36 UTC