Path Normalization Causing Issues

From: Geoffrey Sneddon <foolistbar@googlemail.com>
Date: Tue, 9 Sep 2008 18:03:02 +0100
Message-Id: <BEB49C7C-B262-4346-A5B9-277283A1B78E@googlemail.com>
To: public-iri@w3.org

If you try to normalize the following two IRIs:

".."
"http://example.com/foobar/"

You end up with:

""
"http://example.com/foobar/"

Then resolving the former relative to the latter gives:

"http://example.com/foobar/"
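
For comparison, here is a minimal Python sketch of the two orderings. It assumes `urllib.parse.urljoin` is an acceptable stand-in for the reference resolution of RFC 3986 section 5.2; the variable names are illustrative only, and the empty string is simply the normalized form of ".." shown above.

```python
from urllib.parse import urljoin

base = "http://example.com/foobar/"

# Resolve ".." directly, without normalizing the reference first.
direct = urljoin(base, "..")

# Resolve the already-normalized reference (".." became "").
normalized_ref = ""
after_normalizing = urljoin(base, normalized_ref)

print(direct)             # http://example.com/
print(after_normalizing)  # http://example.com/foobar/
```

The two results differ, so normalizing the relative reference before resolution changes what it identifies.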

This is per section 5.3.2.4 of RFC3987:
> The complete path segments "." and ".." are intended only for use  
> within relative references (section 4.1 of [RFC3986]) and are  
> removed as part of the reference resolution process (section 5.2 of  
> [RFC3986]). However, some implementations may incorrectly assume  
> that reference resolution is not necessary when the reference is  
> already an IRI, and thus fail to remove dot-segments when they occur  
> in non-relative paths. IRI normalizers should remove dot-segments by  
> applying the remove_dot_segments algorithm to the path, as described  
> in section 5.2.4 of [RFC3986].

As ".." is an IRI, it can be normalized, which results in "". This is  
obviously problematic. Should path segment normalization only be done  
when there is a scheme and/or authority?
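
To make the question concrete, here is a rough Python sketch of the remove_dot_segments algorithm from section 5.2.4 of RFC 3986, together with one possible guard. The guard (`normalize_path_if_absolute`, a hypothetical name) only touches the path when a scheme or authority is present; it is an illustration of the idea being asked about, not something either RFC specifies.

```python
from urllib.parse import urlsplit


def remove_dot_segments(path: str) -> str:
    """Sketch of RFC 3986 section 5.2.4, working on an input buffer."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]            # "/./x" becomes "/x"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]            # "/../x" becomes "/x"
            if output:
                output.pop()           # drop the previous segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""                  # this is the step that erases ".."
        else:
            # Move the first segment (with its leading "/", if any) to output.
            cut = path.find("/", 1)
            if cut == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:cut])
                path = path[cut:]
    return "".join(output)


def normalize_path_if_absolute(iri: str) -> str:
    """Hypothetical guard: skip dot-segment removal for relative references."""
    parts = urlsplit(iri)
    if not (parts.scheme or parts.netloc):
        return iri                     # leave "..", "../x", etc. untouched
    return parts._replace(path=remove_dot_segments(parts.path)).geturl()


print(repr(remove_dot_segments("..")))                             # ''
print(normalize_path_if_absolute(".."))                            # ..
print(normalize_path_if_absolute("http://example.com/a/../b"))     # http://example.com/b
```

With the guard in place, ".." survives normalization and still resolves to the parent of the base path, while dot-segments in IRIs that carry a scheme or authority are removed as before.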


--
Geoffrey Sneddon
<http://gsnedders.com/>
Received on Tuesday, 9 September 2008 17:03:41 GMT