Re: remove dot segment

From: Graham Klyne <GK@ninebynine.org>
Date: Fri, 19 Nov 2004 10:52:22 +0000
To: "Martin Balaz" <balaz@ii.fmph.uniba.sk>, <uri@w3.org>

At 17:17 18/11/04 +0000, Martin Balaz wrote:

>I would like to discuss an old problem with the remove_dot_segments function
>which, as far as I know, is not yet solved.
>The following URIs are valid with respect to the latest rfc2396bis:
>         scheme = "file"
>         authority = not defined
>         path = "/x/..//y"
>         query = not defined
>         fragment = not defined
>
>         scheme = "file"
>         authority = not defined
>         path = "x/..//y/"
>         query = not defined
>         fragment = not defined
>Applying the remove_dot_segments function in the first case yields
>"file://y". The segment "y" is treated as an authority instead of as a
>segment of the path "//y".
>The result in the second case is "file:/y/", where the path "/y/" is an
>absolute path. It is not a relative path beginning with an empty segment
>(which, in addition, is not allowed).

My implementation, which I believe to be closely based on the current spec, 
also yields these results, which do seem to be counterintuitive:

tn06str = "file:/x/..//y"
tn06nrm = "file://y"

tn07str = "file:x/..//y/"
tn07nrm = "file:/y/"

I tried experimenting with the fix you suggest... it achieved the suggested 
result for the above test cases, but it also broke two other test cases:

testRelative84 = testRelJoin "testRelative84"
                     "f:/a" ".//g"
yields "f:/g".
(hmmm... the original is also arguably incorrect.)

testRelative85 = testRelJoin "testRelative85"
                     "f://example.org/base/a" "b/c//d/e"
yields "f://example.org/base/b/c/d/e"


I'm not yet sure what would be the best way to resolve this.

Two other suggestions for consideration:
(a) precede any path starting with '//' with './', giving './//'
(b) apply (a) only when the authority component is absent.
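
Suggestions (a) and (b) could be sketched as a guard applied just before
recomposition. The names guard_path and recompose, the only_without_authority
flag, and the simplified recomposition (scheme, authority and path only) are
hypothetical, introduced purely for illustration.

```python
from typing import Optional


def guard_path(path: str, authority: Optional[str],
               only_without_authority: bool = True) -> str:
    """Prefix "./" to a path that starts with "//".

    only_without_authority=False corresponds to suggestion (a),
    only_without_authority=True to suggestion (b).
    """
    if not path.startswith("//"):
        return path
    if only_without_authority and authority is not None:
        return path
    return "./" + path


def recompose(scheme: str, authority: Optional[str], path: str) -> str:
    """Simplified recomposition: query and fragment are left out."""
    uri = scheme + ":"
    if authority is not None:
        uri += "//" + authority
    return uri + path


# Without the guard, the path "//y" reads as an authority "y".
print(recompose("file", None, "//y"))                     # file://y
# With the guard, the path stays distinct from an authority.
print(recompose("file", None, guard_path("//y", None)))   # file:.///y
```

When an authority is present, a path such as "//y" is already unambiguous
(for example "f://example.org//y"), and prefixing "./" there would run into
the authority text, which seems to be an argument for (b).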


>I suggest treating empty segments of the form "//" in the same way as the
>segments "/./". If we replace every occurrence of "//" in the input buffer
>with "/./" before applying the remove_dot_segments function, we get the
>intuitive result "file:/y" for the first case and "file:y/" for the second.
>Other empty segments not appearing at the beginning, which in general don't
>cause any trouble, are of course removed too. An empty segment at the end of
>the path is the only exception.
>This approach solves the problem with an empty segment at the beginning of
>the path and introduces a normalization form for URIs, which don't contain
>empty segments.
>A problem can occur only if, for other schemes, empty segments do not have
>the same meaning as dot segments.

Graham Klyne
Received on Friday, 19 November 2004 11:20:00 UTC
