- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sat, 25 May 2002 00:44:27 +0200
- To: Karl Dubost <karl@w3.org>
- Cc: uri@w3.org
* Karl Dubost wrote:
>The question is that BBedit has a mechanism to automatically
>translate the URIs in a document when it's inside an href.
>
>+ For example when you have typed
>  <a href="http://www.example.org/foo?toto=3&tata=4">A request</a>
>
>BBedit will convert it to
>  <a href="http://www.example.org/foo?toto=3&amp;tata=4">A request</a>

BBedit corrects the HTML representation of the URI, but does not
translate the URI itself.

>+ But if you have typed
>  <a
>href="http://www.example.org/foo?http://www.example.net/path/index.html">A
>request</a>
>
>BBedit is complaining with the message:
>Value of attribute "href" for element "<a>" is invalid; URL path
>needs encoding ("/foo?http:
>%2F%2Fwww.example.net%2Fpath%2Findex.html").

I tend to disagree, see section 2.2 of RFC 2396:

  [...]
  If the data for a URI component would conflict with the reserved
  purpose, then the conflicting data must be escaped before forming the
  URI.
  [...]

Let's take some example URIs:

  [1] http://www.example.org/?foo=bar&baz=&
  [2] http://www.example.org/?foo=bar&baz=%26
  [3] http://www.example.org/?foo=bar;baz=&
  [4] http://www.example.org/?foo=bar;baz=%26

The query consists of key/value pairs.

  [1] foo = <bar> | baz = <>  | <> = <>
  [2] foo = <bar> | baz = <&> |
  [3] foo = <bar> | baz = <&> |
  [4] foo = <bar> | baz = <&> |

In [1] the ampersand separates pairs, so the query has three pairs. In
[2] there are only two pairs; the ampersand is now recognized as data,
not as a separator. In [3] and [4] the semicolon separates pairs, so it
does not matter whether the ampersand is escaped or not.

Your example

  http://www.example.org/foo?http://www.example.net/path/index.html

is a syntactically valid URI, since it matches the production rules of
RFC 2396 (and RFC 2616 defining the http: URI scheme).
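
The pair splitting in examples [1] to [4] can be sketched in Python as
follows. This is only an illustration: the helper name split_pairs and
the choice to pass the separator explicitly are assumptions of mine, not
something RFC 2396 or BBedit prescribes. The point it shows is that
splitting on the separator happens before unescaping, so an escaped
ampersand can never be mistaken for a separator.

```python
from urllib.parse import urlsplit, unquote


def split_pairs(uri, sep):
    """Split the query of `uri` into (key, value) pairs.

    Hypothetical helper: the query is split on the raw separator first,
    and only afterwards is each key and value percent-decoded.
    """
    query = urlsplit(uri).query
    pairs = []
    for part in query.split(sep):
        key, _, value = part.partition("=")
        pairs.append((unquote(key), unquote(value)))
    return pairs


# [1] three pairs, the last one empty
print(split_pairs("http://www.example.org/?foo=bar&baz=&", "&"))
# [2] two pairs, the ampersand is data
print(split_pairs("http://www.example.org/?foo=bar&baz=%26", "&"))
# [3] and [4] give the same two pairs when ";" is the separator
print(split_pairs("http://www.example.org/?foo=bar;baz=&", ";"))
print(split_pairs("http://www.example.org/?foo=bar;baz=%26", ";"))
```
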
The point of RFC 2396 is that

  http://www.example.org/%66%6F%6F

is equivalent to

  http://www.example.org/foo

since [fo] is not in the set of unsafe and reserved characters, but

  http://www.example.org/%3Ffoo

is not equivalent to

  http://www.example.org/?foo

since '?' is in one of the mentioned sets.

The issue is IMO ambiguity. Using

  http://www.example.org/foo?http%3A%2F%2Fwww.example.net%2F

clearly indicates that [:/] are data and have no special meaning, while

  http://www.example.org/foo?http://www.example.net/

does not; perhaps ':' is a separator here. The latter needs additional
interpretation, not outlined in RFC 2396, in order to claim equivalence
to the former. Both URIs are syntactically valid, however.
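
The equivalence argument can be checked with a small Python sketch. It
is a simplification under stated assumptions: the function name
normalize is hypothetical, and the UNRESERVED set is my reading of the
"unreserved" production in RFC 2396 (alphanumerics plus the mark
characters). Escapes of unreserved characters are decoded; every other
escape is left in place, because decoding it could change the meaning
of the URI.

```python
import re
import string

# Assumed from RFC 2396: unreserved = alphanum | mark
UNRESERVED = set(string.ascii_letters + string.digits + "-_.!~*'()")


def normalize(uri):
    """Decode %XX escapes only when they stand for unreserved characters."""
    def decode(match):
        char = chr(int(match.group(1), 16))
        # Keep the escape (upper-cased) for anything reserved or unsafe.
        return char if char in UNRESERVED else match.group(0).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", decode, uri)


# 'f' and 'o' are unreserved, so the two forms compare equal
print(normalize("http://www.example.org/%66%6F%6F")
      == normalize("http://www.example.org/foo"))    # True
# '?' is reserved, so the escaped form stays distinct
print(normalize("http://www.example.org/%3Ffoo")
      == normalize("http://www.example.org/?foo"))   # False
```
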
Received on Friday, 24 May 2002 18:45:12 UTC