RE: remove dot segment

From: Graham Klyne <gk@ninebynine.org>
Date: Fri, 19 Nov 2004 12:37:26 +0000
Message-Id: <5.1.0.14.2.20041119121643.030762b0@127.0.0.1>
To: "Martin Balaz" <balaz@ii.fmph.uniba.sk>, <uri@w3.org>

At 11:54 19/11/04 +0000, Martin Balaz wrote:
>Why do you consider the results ("f:/g" and "f://example.org/base/b/c/d/e")
>to be not intuitive?

I didn't mean to claim that *those* results were unintuitive, though I also
don't think they're obviously correct.  The results I quoted were what I got
by implementing the specification as given (or so I believed).  What I
meant to call unintuitive were the same issues that you were raising as
problems;  it was my way of signalling cautious agreement with you :-)

I personally tend to the view that changes to the path should be minimized,
and generally replacing // with / seems to go against that principle.  I
also think anything that amounts to a substantive change to the spec's
treatment of URIs should be avoided as far as is consistent with
fixing any "obvious" breakages.

I believe there are some URIs that use '//' in the path where that is 
treated as significant by software;  e.g. file: URIs with Windows UNC names?
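
As a rough illustration of why a leading '//' matters, here is a minimal
sketch using Python's standard urllib.parse module (chosen here only for
illustration; the UNC-style URI is a hypothetical example).  Once a path
such as "//y" is written directly after the scheme, a generic parser reads
"y" as the authority rather than as a path segment:

```python
from urllib.parse import urlsplit

# The URI as originally written: no authority, the path keeps its empty segment.
raw = urlsplit("file:/x/..//y")

# The same URI after the path has been reduced to "//y" and re-serialized:
# a generic parser now takes "y" to be the authority.
normalized = urlsplit("file://y")

# Hypothetical UNC-style file URI, where the doubled slash inside the path
# is what carries the server name.
unc = urlsplit("file:////server/share/f")

for parts in (raw, normalized, unc):
    print(repr(parts.netloc), repr(parts.path))
```

For the first two URIs the parser reports an empty authority with path
"/x/..//y", and then authority "y" with an empty path, so the two strings
do not identify the same thing.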

>[[
>testRelative84 = testRelJoin "testRelative84"
>                      "f:/a" ".//g"
>                      "f://g"
>]]
>
>If we take another example closer to the real world ;), e.g.
>"file:/index.php" and ".//images":
>On UNIX systems and also on Windows systems the path ".//images" is
>equivalent to the path "./images". The dot segment "." means the current
>directory and therefore "file:/images" is exactly the result I expect.
>Similarly for the second example.
>
>I agree that prefixes of the form "./" or "/./" can solve the segment
>problem, but:

Er, I think that idea was broken...

>[[
>file:/x/..//y
>result you suggest: file:/./y
>]]
>
>If you apply remove_dot_segments a second time, you get the result
>"file:/y". It would be nice to have the property
>remove_dot_segments(remove_dot_segments(Path)) = remove_dot_segments(Path).
>And if an empty segment at the beginning is equivalent to the dot segment
>"." (which can be removed), why not in the rest of the path?

I agree that URI normalization should be idempotent.
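
The idempotence property can be checked mechanically.  The following is a
minimal Python sketch that follows the remove_dot_segments steps as they
appear in RFC 3986, section 5.2.4; the rfc2396bis draft under discussion
may differ in detail, so treat this as an assumption rather than as the
implementation behind the test cases quoted in this thread.

```python
def remove_dot_segments(path: str) -> str:
    """Sketch of remove_dot_segments following RFC 3986, section 5.2.4."""
    inp, out = path, []
    while inp:
        if inp.startswith("../"):                     # step A
            inp = inp[3:]
        elif inp.startswith("./"):                    # step A
            inp = inp[2:]
        elif inp.startswith("/./"):                   # step B
            inp = inp[2:]
        elif inp == "/.":                             # step B
            inp = "/"
        elif inp.startswith("/../") or inp == "/..":  # step C
            inp = "/" + inp[4:]
            if out:
                out.pop()                             # drop last output segment
        elif inp in (".", ".."):                      # step D
            inp = ""
        else:                                         # step E
            # Move the first segment (with its leading "/", if any) to output.
            cut = inp.find("/", 1)
            cut = len(inp) if cut == -1 else cut
            out.append(inp[:cut])
            inp = inp[cut:]
    return "".join(out)


def is_fixed_point(path: str) -> bool:
    """True if applying remove_dot_segments again changes nothing."""
    return remove_dot_segments(path) == path


print(remove_dot_segments("/x/..//y"))  # "//y", i.e. "file://y" once serialized
print(is_fixed_point("//y"))            # True
print(remove_dot_segments("/./y"))      # "/y"
print(is_fixed_point("/./y"))           # False
```

Under these assumptions the path "/./y" (from the "./" prefix idea) is not
a fixed point, since a second pass reduces it to "/y", whereas "//y" is
stable as a path string but changes meaning when written after "file:".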

Lacking any further inspiration right now.

#g
--

>-----Original Message-----
>From: Graham Klyne [mailto:GK@ninebynine.org]
>Sent: Friday, November 19, 2004 10:52 AM
>To: Martin Balaz; uri@w3.org
>Subject: Re: remove dot segment
>
>At 17:17 18/11/04 +0000, Martin Balaz wrote:
>
> >I would like to discuss one old problem of the remove_dot_segments
> >function, which is not yet solved as far as I know.
> >
> >The following URIs are valid with respect to the latest rfc2396bis:
> >
> >file:/x/..//y
> >         scheme = "file"
> >         authority = not defined
> >         path = "/x/..//y"
> >         query = not defined
> >         fragment = not defined
> >
> >file:x/..//y/
> >         scheme = "file"
> >         authority = not defined
> >         path = "x/..//y/"
> >         query = not defined
> >         fragment = not defined
> >
> >The result of applying the remove_dot_segments function in the first
> >case is "file://y". The segment "y" is considered to be an authority
> >instead of a segment of the path "//y".
> >The result in the second case is "file:/y/", where the path "/y/" is an
> >absolute path. It is not a relative path beginning with an empty segment
> >(in addition, this is not allowed).
>
>My implementation, which I believe to be closely based on the current spec,
>also yields these results, which do seem to be counterintuitive:
>
>[[
>tn06str = "file:/x/..//y"
>tn06nrm = "file://y"
>
>tn07str = "file:x/..//y/"
>tn07nrm = "file:/y/"
>]]
>
>I tried experimenting with the fix you suggest... it achieved the suggested
>result for the above test cases, but it also broke two other test cases:
>
>[[
>testRelative84 = testRelJoin "testRelative84"
>                      "f:/a" ".//g"
>                      "f://g"
>]]
>yields "f:/g".
>(hmmm... the original is also arguably incorrect.)
>
>[[
>testRelative85 = testRelJoin "testRelative85"
>                      "f://example.org/base/a" "b/c//d/e"
>                      "f://example.org/base/b/c//d/e"
>]]
>yields "f://example.org/base/b/c/d/e"
>
>...
>
>I'm not yet sure what would be the best way to resolve this.
>
>Two other suggestions for consideration:
>(a) any path starting with '//' be preceded by ./, giving .///
>(b) apply (a) only when the authority component is absent.
>
>#g
>--
>
> >I suggest treating empty segments of the form "//" in the same way
> >as the segments "/./". If we replace every occurrence of "//" in the
> >input buffer by "/./" before applying the remove_dot_segments function,
> >we get the intuitive result "file:/y" for the first case and "file:y/" for
> >the second. Other empty segments not appearing at the beginning, which
> >don't in general cause any trouble, are of course removed too. An empty
> >segment at the end of the path is the only exception.
> >
> >This approach solves the problem with an empty segment at the beginning
> >of the path and introduces a normalization form for URIs in which paths
> >contain no empty segments.
> >
> >A problem can occur only if, for other schemes, it does not hold that
> >empty segments have the same meaning as dot segments.
> >
> >Martin
>
>------------
>Graham Klyne
>For email:
>http://www.ninebynine.org/#Contact

------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact
Received on Friday, 19 November 2004 19:37:35 GMT
