- From: Gisle Aas <gisle@ActiveState.com>
- Date: 24 Jul 2003 00:02:01 -0700
- To: uri@w3.org
Some comments on the remove_dot_segments routine prescribed in rfc2396bis:
| The pseudocode also refers to a remove_dot_segments routine for
| interpreting and removing the special "." and ".." complete path
| segments from a referenced path. This is done after the path is
| extracted from a reference, whether or not the path was relative, in
| order to remove any invalid or extraneous dot-segments prior to
| forming the target URI. Although there are many ways to accomplish
| this removal process, we describe a simple method using a separate
| string buffer:
|
| 1. The buffer is initialized with the unprocessed path component.
|
| 2. If the buffer begins with "./" or "../", the "." or ".." segment
| is removed.
Either drop the word "segment" here or add it to step 4 as well, so that
the two steps are worded consistently.
| 3. All occurrences of "/./" in the buffer are replaced with "/".
I think this step should borrow some phrasing from step 5, which would
give:
All occurrences of "/./" in the buffer are iteratively replaced
with "/" until no matching pattern remains.
Otherwise it is not clear how "/././" is handled, since the two
occurrences of "/./" overlap.
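To illustrate the ambiguity, here is a small Python sketch. The helper
names are hypothetical and not from the draft. It compares a single
non-overlapping pass with an iterated replacement:

```python
def replace_once(buf: str) -> str:
    # One left-to-right pass; str.replace only substitutes
    # non-overlapping occurrences of "/./".
    return buf.replace("/./", "/")


def replace_until_stable(buf: str) -> str:
    # Repeat the pass until no "/./" remains in the buffer.
    while "/./" in buf:
        buf = buf.replace("/./", "/")
    return buf


print(replace_once("/././"))          # "/./"
print(replace_until_stable("/././"))  # "/"
```

A single pass leaves a "/./" behind, which is why the "iteratively"
wording seems to be needed in step 3.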
| 4. If the buffer ends with "/.", the "." is removed.
|
| 5. All occurrences of "/<segment>/../" in the buffer, where ".." and
| <segment> are complete path segments, are iteratively replaced
| with "/" in order from left to right until no matching pattern
| remains. If the buffer ends with "/<segment>/..", that is also
| replaced with "/". Note that <segment> may be empty.
|
| 6. All prefixes of "<segment>/../" in the buffer, where ".." and
| <segment> are complete path segments, are iteratively replaced
| with "/" in order from left to right until no matching pattern
| remains. If the buffer ends with "<segment>/..", that is also
| replaced with "/". Note that <segment> may be empty.
Can there actually be more than one prefix like this? Once it is
replaced it cannot match again, as the buffer now starts with "/".
| 7. The remaining buffer is returned as the result of
| remove_dot_segments.
If the buffer starts out as "a/../../c" then this algorithm ends up
with "a/c" (step 5 treats the first ".." as <segment> and collapses
"/../../" to "/"). I don't think that is the intention. Shouldn't
steps 5 and 6 be swapped?
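The behaviour can be checked with a rough Python sketch of steps 5 and 6
only (steps 2 to 4 do not fire for this input). The function names, the
regular expressions, and the `swapped` flag are my own reading of the
quoted text, not something the draft defines. In particular, I assume
the trailing clause of step 6 applies when the whole buffer is
"<segment>/..".

```python
import re

# "/<segment>/../" anywhere in the buffer; <segment> may be empty.
INNER = re.compile(r"/[^/]*/\.\./")
# "/<segment>/.." at the end of the buffer.
INNER_END = re.compile(r"/[^/]*/\.\.\Z")
# "<segment>/../" at the start of the buffer.
PREFIX = re.compile(r"\A[^/]*/\.\./")
# Assumption: step 6's trailing clause means the buffer is exactly "<segment>/..".
PREFIX_END = re.compile(r"\A[^/]*/\.\.\Z")


def _until_stable(pattern, buf):
    # Replace the leftmost match with "/" until nothing matches.
    while pattern.search(buf):
        buf = pattern.sub("/", buf, count=1)
    return buf


def step5(buf):
    return INNER_END.sub("/", _until_stable(INNER, buf))


def step6(buf):
    return PREFIX_END.sub("/", _until_stable(PREFIX, buf))


def run_steps(buf, swapped=False):
    # swapped=True applies step 6 before step 5.
    order = (step6, step5) if swapped else (step5, step6)
    for step in order:
        buf = step(buf)
    return buf


print(run_steps("a/../../c"))                # "a/c"
print(run_steps("a/../../c", swapped=True))  # "/c"
```

With the order as written, the sketch reproduces "a/c". With the two
steps swapped it yields "/c" under this literal reading, because an
empty <segment> lets "/../" match as a prefix a second time.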
Regards,
Gisle Aas
Received on Thursday, 24 July 2003 03:03:49 UTC