
Path Normalization Causing Issues

From: Geoffrey Sneddon <foolistbar@googlemail.com>
Date: Tue, 9 Sep 2008 18:03:02 +0100
Message-Id: <BEB49C7C-B262-4346-A5B9-277283A1B78E@googlemail.com>
To: public-iri@w3.org

If you try and normalize the following two IRIs:


You end up with:


Then resolve the former as relative to the latter:


This is per the following section of RFC 3987:
> The complete path segments "." and ".." are intended only for use  
> within relative references (section 4.1 of [RFC3986]) and are  
> removed as part of the reference resolution process (section 5.2 of  
> [RFC3986]). However, some implementations may incorrectly assume  
> that reference resolution is not necessary when the reference is  
> already an IRI, and thus fail to remove dot-segments when they occur  
> in non-relative paths. IRI normalizers should remove dot-segments by  
> applying the remove_dot_segments algorithm to the path, as described  
> in section 5.2.4 of [RFC3986].

As ".." is an IRI, it can be normalized, which results in "". This is  
obviously problematic. Should path segment normalization only be done  
when there is a scheme and/or authority?
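
To make the ".." case concrete, here is a minimal Python sketch of the remove_dot_segments algorithm from section 5.2.4 of RFC 3986, applied to a bare path. The function name mirrors the RFC's; the variable names and the sample inputs other than ".." are my own choices for illustration, and this is not taken from any particular implementation.

```python
def remove_dot_segments(path: str) -> str:
    """Sketch of RFC 3986 section 5.2.4, steps A-E, over a path string."""
    output = []      # stack of completed segments, each with its leading "/"
    inp = path       # the RFC's "input buffer"
    while inp:
        # A: strip a leading "../" or "./"
        if inp.startswith("../"):
            inp = inp[3:]
        elif inp.startswith("./"):
            inp = inp[2:]
        # B: replace a leading "/./" or a lone "/." with "/"
        elif inp.startswith("/./"):
            inp = "/" + inp[3:]
        elif inp == "/.":
            inp = "/"
        # C: replace a leading "/../" or a lone "/.." with "/" and
        #    drop the last segment already written to the output
        elif inp.startswith("/../"):
            inp = "/" + inp[4:]
            if output:
                output.pop()
        elif inp == "/..":
            inp = "/"
            if output:
                output.pop()
        # D: input that is exactly "." or ".." is discarded entirely
        elif inp in (".", ".."):
            inp = ""
        # E: move the first segment (with any leading "/") to the output
        else:
            start = 1 if inp.startswith("/") else 0
            idx = inp.find("/", start)
            if idx == -1:
                segment, inp = inp, ""
            else:
                segment, inp = inp[:idx], inp[idx:]
            output.append(segment)
    return "".join(output)


# Step D is what turns the relative reference ".." into the empty string.
print(repr(remove_dot_segments("..")))
```

Step D is the branch responsible for the behaviour in question: with no scheme or authority, ".." is consumed whole and nothing is written to the output, so the reference no longer points at the parent.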

Geoffrey Sneddon
Received on Tuesday, 9 September 2008 17:03:41 UTC
