Re: remove dot segment

On Nov 19, 2004, at 3:12 AM, Martin Balaz wrote:
> OK, let's see it from the other view:
>
>> file:/x/..//y
>> file:x/..//y/
>
> URI Reference: ..//y
> Base URI:      file:/x/
> Target URI:    file://y
>
> URI Reference: ..//y/
> Base URI:      file:x/
> Target URI:    file:/y/
>
> Both URI References and both Base URIs are valid. Although they are
> valid, in both cases the remove_dot_segments function gives unintuitive
> results.
> I would also expect "x/y//../z" to be equivalent to "x/z" instead of
> "x/y/z" (Base URI = "x/y//", URI Ref. = "../z").
>
> What to do in those cases?

I don't care, and neither does anyone else.  The purpose of the algorithm
is to create a safe and consistent result regardless of the string given
as a reference.  It is not supposed to create good URIs out of obviously
bogus references.  These examples were considered and rejected a year ago
because they have no applicability in practice and would needlessly
complicate implementations.

Your intuition is based on "." and ".." having the meaning that they
do for Unix path names, which is simply not the case here.  It is quite
common for "//" to appear in the path portion of other URIs and
it would be an error to remove them.
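
To make the Unix reading concrete, here is a minimal Python sketch. It is
an illustration only: the choice of language and the helper name unix_view
are assumptions, not anything taken from the draft. It relies on the
standard library's posixpath.normpath, which applies filesystem rules: it
collapses "//" and resolves ".." against the previous non-empty name. That
is the behavior the quoted examples expected, and it is not what
dot-segment removal does.

```python
import posixpath


def unix_view(path: str) -> str:
    """Normalize a path the way Unix path-name rules would (hypothetical helper)."""
    # normpath drops empty segments ("//") and resolves "." and ".."
    return posixpath.normpath(path)


print(unix_view("/x/..//y"))   # /y
print(unix_view("x/y//../z"))  # x/z
```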

> I think that we can't assume that the input for
> the remove_dot_segments function (e.g. /x/..//y, x/..//y/ or "x/y//../z")
> is in general a valid path.

We don't have to -- the algorithm will result in a single answer and
whether or not it corresponds to a filesystem is irrelevant -- the
dot-segments are operations on the "/" separators in the URI reference,
not on the filesystem.  Under no circumstances will "/x/..//y" ever
be a reference subject to intuition -- it is nonsensical regardless
of the result, and any result is okay so long as it is the same
result for all parsers.
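
As a rough illustration of "a single answer", here is a minimal Python
sketch of the loop. It assumes the steps as they were later published in
RFC 3986, section 5.2.4 (labeled A to E there), which may differ in detail
from the draft text under discussion; the Python rendering is an
assumption for illustration, not a reference implementation. Note that it
only looks at "/" separators and dot-segments and never collapses "//".

```python
def remove_dot_segments(path: str) -> str:
    """Sketch of dot-segment removal, following RFC 3986 section 5.2.4."""
    output = []  # output buffer: a list of segments, each with its leading "/"
    while path:
        # A: drop a leading "../" or "./"
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        # B: replace a leading "/./" (or a complete "/.") with "/"
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        # C: replace a leading "/../" (or a complete "/..") with "/"
        #    and remove the last segment from the output buffer, if any
        elif path.startswith("/../"):
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        # D: input is only "." or ".."
        elif path in (".", ".."):
            path = ""
        # E: move the first segment (with its initial "/", if any) to the output
        else:
            end = path.find("/", 1)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


print(remove_dot_segments("/x/..//y"))   # //y
print(remove_dot_segments("x/y//../z"))  # x/y/z
```

The empty segment between the two slashes is treated like any other
segment, so "//" survives and ".." removes only the segment immediately
before it.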

....Roy

Received on Friday, 19 November 2004 12:17:09 UTC