
Percent encoded dots in . and .. path elements

From: Mike Samuel <msamuel@google.com>
Date: Fri, 27 Jun 2014 11:51:45 -0400
Message-ID: <CAHBJ-bk2jAt3sJ89ZkfVyC8BSchO8aju7UujPaBq-puE4wVf9w@mail.gmail.com>
To: uri@w3.org
Apologies if this is not the right forum for RFC 3986 related questions.


Dot ('.') is in the unreserved set, and RFC 3986 (Section 2.3) says

"""
    URIs that differ in the replacement of an unreserved character with
    its corresponding percent-encoded US-ASCII octet are equivalent: they
    identify the same resource.
"""

which leads me to believe that %2E, which encodes dot, should be
normalized (decoded to ".") before interpreting "." and ".." path
elements when doing path resolution.


If so, then resolving
  Relative URI: .%2E
against
  Base URI:  /x/y/z/
should yield
  /x/y/
and not
  /x/y/z/.%2E
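
A minimal sketch of that reading, not anything RFC 3986 prescribes
verbatim: decode %2E only in path segments that become "." or ".." once
decoded, then resolve as usual. The helper names decode_dot_segments and
resolve_treating_encoded_dots are hypothetical, and the sketch assumes
path-only inputs (no scheme, query, or fragment).

```python
import re
from urllib.parse import urljoin

# Matches a percent-encoded dot in either hex case (%2E or %2e).
_ENCODED_DOT = re.compile(r"%2e", re.IGNORECASE)


def decode_dot_segments(path):
    """Decode %2E only in segments that are '.' or '..' once decoded.

    Hypothetical helper. Assumes `path` is a bare path with no query
    or fragment; all other segments are returned unchanged.
    """
    segments = []
    for segment in path.split("/"):
        decoded = _ENCODED_DOT.sub(".", segment)
        segments.append(decoded if decoded in (".", "..") else segment)
    return "/".join(segments)


def resolve_treating_encoded_dots(base, reference):
    """Resolve `reference` against `base`, treating '.%2E' etc. as dot segments."""
    return urljoin(decode_dot_segments(base), decode_dot_segments(reference))


print(resolve_treating_encoded_dots("/x/y/z/", ".%2E"))    # /x/y/
print(resolve_treating_encoded_dots("/x/y/z/", "a%2Eb"))   # /x/y/z/a%2Eb
```

Segments such as "a%2Eb" are deliberately left alone, so only the
dot-segment interpretation changes.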

The existing libraries that I tested (Java's java.net.URI, Python's
urlparse.urljoin) yield /x/y/z/.%2E and Java's normalize() method does
not recognize the last path element as special.
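
A rough way to reproduce the Python observation, assuming Python 3, where
urljoin lives in urllib.parse rather than the Python 2 urlparse module
named above:

```python
from urllib.parse import urljoin

base = "/x/y/z/"

# A literal ".." is treated as a dot segment and removed during resolution.
literal = urljoin(base, "..")

# ".%2E" is not recognized as a dot segment and is kept as an ordinary
# path element.
encoded = urljoin(base, ".%2E")

print(literal)   # /x/y/
print(encoded)   # /x/y/z/.%2E
```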


Browsers seem to differ.  Chrome and Safari seem to normalize ".%2E" early.
Firefox seems to be leaving it up to the protocol handler.
"https://www.google.com/webhp/.%2E" beGETs "www.google.com/."
"file:///Users/msamuel/work/.%2E" fetches the right resource but ".."
shows up as a path element in the URL bar.
"http://urlecho.appspot.com/echo/z/.%2E" beGETs "urlecho.appspot.com/echo/z/.."


Should resolution/normalization treat the path element ".%2E" as special?

cheers,
mike
Received on Friday, 27 June 2014 15:52:57 UTC
