- From: Gisle Aas <gisle@ActiveState.com>
- Date: 24 Jul 2003 00:02:01 -0700
- To: uri@w3.org
Some comments on the remove_dot_segments routine prescribed in rfc2396bis:

| The pseudocode also refers to a remove_dot_segments routine for
| interpreting and removing the special "." and ".." complete path
| segments from a referenced path. This is done after the path is
| extracted from a reference, whether or not the path was relative, in
| order to remove any invalid or extraneous dot-segments prior to
| forming the target URI. Although there are many ways to accomplish
| this removal process, we describe a simple method using a separate
| string buffer:
|
| 1. The buffer is initialized with the unprocessed path component.
|
| 2. If the buffer begins with "./" or "../", the "." or ".." segment
|    is removed.

Drop the word "segment" here or add it to step 4 as well.

| 3. All occurrences of "/./" in the buffer are replaced with "/".

I think it should borrow some phrasing from step 5, which makes it:

    All occurrences of "/./" in the buffer are iteratively replaced
    until no matching pattern remains.

Otherwise it is not clear how "/././" is replaced.

| 4. If the buffer ends with "/.", the "." is removed.
|
| 5. All occurrences of "/<segment>/../" in the buffer, where ".." and
|    <segment> are complete path segments, are iteratively replaced
|    with "/" in order from left to right until no matching pattern
|    remains. If the buffer ends with "/<segment>/..", that is also
|    replaced with "/". Note that <segment> may be empty.
|
| 6. All prefixes of "<segment>/../" in the buffer, where ".." and
|    <segment> are complete path segments, are iteratively replaced
|    with "/" in order from left to right until no matching pattern
|    remains. If the buffer ends with "<segment>/..", that is also
|    replaced with "/". Note that <segment> may be empty.

Can there actually be more than one prefix like this? Once it is
replaced it cannot match again, as the buffer now starts with "/".

| 7. The remaining buffer is returned as the result of
|    remove_dot_segments.
If the buffer starts out as "a/../../c" then this algorithm ends up
with "a/c" (step 5 matches "/../../", treating the first ".." as
<segment>, and replaces it with "/"). I don't think that is the
intention. Shouldn't steps 5 and 6 be swapped?

Regards,
Gisle Aas
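
For illustration only, here is a minimal Python sketch of one literal reading of steps 3, 5 and 6 as quoted above. The function names and the regular expressions are my own assumptions about how the wording might be interpreted (in particular, "<segment>" is read as any run of non-"/" characters, including ".." and the empty string); none of this comes from the draft itself. It reproduces the two observations made in this message: a single replacement pass leaves a "/./" behind in "/././", and running step 5 before step 6 turns "a/../../c" into "a/c".

```python
import re


def step3_single_pass(buf: str) -> str:
    # Step 3 as written: one pass over non-overlapping "/./" occurrences.
    return buf.replace("/./", "/")


def step3_iterative(buf: str) -> str:
    # Step 3 with the suggested wording: repeat until no "/./" remains.
    while "/./" in buf:
        buf = buf.replace("/./", "/")
    return buf


def _replace_until_stable(pattern: str, buf: str) -> str:
    # Replace the leftmost match with "/" repeatedly until nothing changes.
    # Every match is longer than "/", so the buffer shrinks and this ends.
    while True:
        new = re.sub(pattern, "/", buf, count=1)
        if new == buf:
            return buf
        buf = new


def step5(buf: str) -> str:
    # "/<segment>/../" anywhere in the buffer, then "/<segment>/.." at the end.
    buf = _replace_until_stable(r"/[^/]*/\.\./", buf)
    return re.sub(r"/[^/]*/\.\.\Z", "/", buf)


def step6(buf: str) -> str:
    # "<segment>/../" as a prefix, then "<segment>/.." as the whole buffer.
    buf = _replace_until_stable(r"^[^/]*/\.\./", buf)
    return re.sub(r"^[^/]*/\.\.\Z", "/", buf)


if __name__ == "__main__":
    print(step3_single_pass("a/././b"))   # a/./b
    print(step3_iterative("a/././b"))     # a/b
    print(step6(step5("a/../../c")))      # a/c
```

Under this reading, step 5 matches "/../../" in "a/../../c" with the first ".." playing the role of <segment>, which is why the leading "a" survives.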
Received on Thursday, 24 July 2003 03:03:49 UTC