- From: David Hopwood <david.hopwood@zetnet.co.uk>
- Date: Fri, 19 Sep 2003 13:52:35 +0000
- To: web-calculus@waterken.com, uri@w3.org
-----BEGIN PGP SIGNED MESSAGE-----

[Context for the URI WG list: we are talking about whether two URIs such as
the following are equivalent, in the sense that an URI processing application
is permitted to convert the former to the latter:

    https://+abc@example.com/
    https://%2Babc@example.com/

while claiming not to have changed which resource the URI points to.
Note that '+' is in the <reserved> production, but is not specifically
reserved in the <authority> or <userinfo> component.

If this were possible, then there would be a security problem in a proposed
application, so we need to distinguish between the answers "definitely no" vs.
"yes or maybe". It would also be useful to know whether the answer is different
for RFC 2396bis as compared to 2396.]

Tyler Close wrote:
> On Friday 19 September 2003 00:23, David Hopwood wrote:
> > "+" is not reserved in the <authority> component:
>
> Yes, and section 2.2 of RFC 2396bis says:
>
> "Allowed reserved characters that are not assigned a sub-component
> delimiter role by this specification should be considered reserved
> for special use by whatever software generates the URI (i.e., they
> may be used to delimit or indicate information that is significant
> to interpretation of the identifier, but that significance is
> outside the scope of this specification)."

RFC 2396 did not include this paragraph, and 2396 is what existing proxies,
firewalls, etc. may be implementing.

> That's exactly what we want to do, so we should be using one of
> the reserved characters that is not already assigned a meaning
> within the <authority> component. The '+' character fits the bill.
>
> > Also not the issue. There's no way to guarantee that it isn't escaped
> > by any URI processing applications (including proxies, firewalls, etc.)
>
> RFC 2396bis specifically forbids the software from escaping a
> reserved character.

I interpret that as meaning a character that is reserved within each
particular field:

   2.4.2 When to Escape and Unescape

   Under normal circumstances, the only time that characters within a
   URI string are escaped is during the process of generating the URI
   from its component parts. Each component may have its own set of
   characters that are reserved, so only the mechanism responsible for
   generating or interpreting that component can determine whether or
   not escaping a character will change its semantics. The exception is
   when a URI is being used within a context where the unreserved "mark"
   characters might need to be escaped, such as when used for a
   command-line argument or within a single-quoted attribute.

   Once generated, a URI is always in an escaped form. When a URI is
   resolved, the components significant to that scheme-specific
   resolution process (if any) must be parsed and separated before the
   escaped characters within those components can be safely unescaped.

   In some cases, data that could be represented by an unreserved
   character may appear escaped; for example, some of the unreserved
   "mark" characters are automatically escaped by some systems. A URI
   normalizer may unescape escaped octets that are represented by
   characters in the unreserved set. For example, "%7E" is sometimes
   used instead of tilde ("~") in an "http" URI path and can be
   converted to "~" without changing the interpretation of the URI.

My reading of both 2396 and 2396bis is that an URI processor is allowed to
parse an URI and then reconstruct it, and that in the process of
reconstruction it may escape any character that is not reserved within each
field. Note that we're not talking about what an URI processor should do;
only what it could possibly do without being nonconformant.

> Do you know of any important software that violates all of the URL
> specifications on this topic?

I don't think this behaviour violates the spec (unfortunately). And for a
security issue, all software is important.
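
To make the parse-and-reconstruct concern concrete, here is a minimal Python
sketch of a hypothetical intermediary. It is not taken from any real proxy or
firewall. The function name `rebuild_uri` is made up for illustration, and the
choice to escape everything in <userinfo> except unreserved characters and ':'
is an assumption about what such a processor might do. It uses only the
standard `urllib.parse` module.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def rebuild_uri(uri: str) -> str:
    """Parse a URI and reassemble it, re-escaping the userinfo part.

    Hypothetical behaviour: any userinfo character other than letters,
    digits, "_", ".", "-", "~" and ":" is percent-escaped on output.
    """
    parts = urlsplit(uri)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if not at:
        # No userinfo present, so there is nothing to re-escape.
        return uri
    # Unescape first so already-escaped input is not escaped twice,
    # then escape again; ":" is kept as the user/password separator.
    escaped = quote(unquote(userinfo), safe=":")
    return urlunsplit(
        (parts.scheme, escaped + "@" + hostport,
         parts.path, parts.query, parts.fragment)
    )


original = "https://+abc@example.com/"
rewritten = rebuild_uri(original)
print(rewritten)              # https://%2Babc@example.com/
print(original == rewritten)  # False: a string comparison sees two URIs
```

An application that attaches meaning to the literal '+' would no longer find
it after such a rewrite, which is the security problem under discussion.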

- --
David Hopwood <david.hopwood@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBP2sJ2DkCAxeYt5gVAQGyFwgAti9eM6LoLTmv9Quxz5jbJ7f/Yc/CYx4d
QH6+JUaBNAQI0KYiyHYHDuZZdRR/B0uOQrG4PqGaEbs4UywoJcHNV1vb1BI/ucbT
HHKMraPwN9r0nAgM6UJnaZh8035cpEjFKboj8O1qsY7EtxgCtlry6SJAKVBWdvLC
Z4hj44ZM09VpUPKmNT3BMyxCYRYXI8yEhk5/ZugI6eyIPCtXPMKIbyzmMBINHNh5
JNTK4F/OZhfuFn/oqkKZW72MUXyR0zP7BbKVfq2/VNzPF6FDq193qBqgXi79fHg6
Y5ULPJrS02ldwT9mVdwgyz51xGKaXfU5Zm/1zU8crrQEMIYJaG7hfA==
=X4Os
-----END PGP SIGNATURE-----

Received on Sunday, 21 September 2003 15:48:40 UTC