- From: Manger, James H <James.H.Manger@team.telstra.com>
- Date: Mon, 29 Oct 2007 14:59:58 +1100
- To: <uri@w3.org>
How should variable values of “.” and “..” be handled?

My suggestion: It is an error for a variable value to be “.” or “..” when the URI up to the position where the value is to be inserted matches the regular expression “|[^?]*/\.?”. That is, the URI:
* is empty; or
* ends with “/” or “/.” and does not contain a “?”.

Note: it is each variable value that is checked, not the entire replacement for a {…} segment.

The potential problem is that “.” and “..” have special meaning in URIs, which the template designer (web server) probably does not want.

Consider /car/{make}/{model}/prices.html. If make=“..” and model=“truck”, the URI is /car/../truck/prices.html, which is normalized to /truck/prices.html. This is an unexpected result given the template. If make=“ford” and model=“.”, the URI is /car/ford/./prices.html, which is normalized to /car/ford/prices.html. Again, not quite an obvious result given the template.

A “.” may not be much of an issue in practice, as // is generally treated as /, as is /./ of course. Consequently, “.” would be similar in practice to “” (an empty string).

I did some quick tests of URLs with consecutive ////’s. Neither Firefox nor IE normalized the /’s before sending the request, but both Apache and Tomcat servers treated them just like a single / (e.g. they return the same resource). The java.net.URI.normalize() method actually replaces //// with / (but that is a bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4723726). http://bitworking.org//news////258/////The-end-of-the-AtomPub-WG works, for example. I did find one web server (Pebble blog software, a Java web app) that treated //// differently from /. It returned 404 Not Found for //// (it has its own application-specific URL mapping layer). A server COULD treat /car/ford//prices.html and /car/ford/prices.html differently, but some (most?) will not by default.

“.” and “..” are NOT special everywhere in a URI.
For instance, /car/prices.html?make=..&model=truck is unchanged by normalization. Similarly, /car/{make}_{model}/prices.html -> /car/.._truck/prices.html is unchanged by normalization. Hence, a blanket prohibition on “.” and “..” as variable values would be unnecessarily restrictive.

%-encoding the dots does not help. Enter http://www.w3.org/%2E/ into the Firefox address bar, for instance, and it normalizes away the ./ before sending the request.

An implementation complication… A URI will invariably be built from a template from left to right. To know if a “..” substitution will have a special meaning you have to know what comes next. For instance, with make=“..”, /car/{make}_{model}/ is not a problem but you have to notice the underscore to realise this. With /car/{make}{model}/ you cannot tell if make=“..” is a problem until {model} has been substituted. Whatever rules we come up with, they should NOT require implementations to look ahead (certainly not at future substitutions) -- even if this means rejecting a URI such as /car/.._truck/.

My template proposal {^ prefix^ var []sep |default} supported 2 encoding modes: %-encode all chars not in <unreserved> (when there is no leading ^); or %-encode only chars not in <unreserved> and not in <reserved> (when leading ^ is present).

Not %-encoding /’s makes it harder to detect (and, hence, treat as an error) “..” paths that affect the current URI. Eg, {^foo[]} when foo=“a/../../../” or foo=[“a/.”, “./.”, “./.”]. Perhaps /’s in variable values should always be encoded.
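
To make the suggested rule concrete, here is a minimal Python sketch of the check applied during left-to-right expansion. It is only an illustration under stated assumptions: the names `check_value`, `expand` and `DotSegmentError` are hypothetical, only plain `{name}` placeholders are handled (none of the prefix, separator or default syntax of the proposal), and no %-encoding is performed.

```python
import re
from typing import Mapping

# The regular expression from the message, applied to the whole prefix:
# the prefix is empty, or contains no "?" and ends with "/" or "/.".
HAZARD_PREFIX = re.compile(r"|[^?]*/\.?")


class DotSegmentError(ValueError):
    """Hypothetical error type for a rejected "." or ".." value."""


def check_value(prefix: str, value: str) -> None:
    """Reject "." or ".." when the URI built so far matches the rule."""
    if value in (".", "..") and HAZARD_PREFIX.fullmatch(prefix):
        raise DotSegmentError(
            "value %r not allowed after prefix %r" % (value, prefix)
        )


def expand(template: str, variables: Mapping[str, str]) -> str:
    """Expand simple {name} placeholders left to right, with no lookahead."""
    out = []
    pos = 0
    for match in re.finditer(r"\{(\w+)\}", template):
        out.append(template[pos:match.start()])
        value = variables[match.group(1)]
        # Only the text already emitted is inspected, never what follows.
        check_value("".join(out), value)
        out.append(value)
        pos = match.end()
    out.append(template[pos:])
    return "".join(out)
```

With this sketch, /car/{make}/{model}/prices.html is rejected for make=“..” and for model=“.”, while /car/prices.html?make={make}&model={model} accepts make=“..” because the prefix contains a “?”. Since there is no lookahead, /car/{make}_{model}/ with make=“..” is rejected as well. A value such as “a/../../../” passes the check untouched, which is the concern about leaving /’s unencoded.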
Received on Monday, 29 October 2007 04:00:30 UTC