Re: URI Reference creation

From: Sebastian Pipping <webmaster@hartwork.org>
Date: Mon, 30 Jul 2007 20:58:33 +0200
Message-ID: <46AE34D9.6040009@hartwork.org>
To: Mike Brown <mike@skew.org>, uri@w3.org

Mike Brown wrote:
> For an idea of how it can be done in Python, see the Relativize function in 
> http://cvs.4suite.org/viewcvs/4Suite/Ft/Lib/Uri.py?view=markup
> 
> Test cases are in 
> http://cvs.4suite.org/viewcvs/4Suite/test/Lib/test_uri.py?view=markup. Scroll 
> down to where it says "Test cases for Relativize()". The first 2 values in 
> each tuple are the first 2 arguments to Relativize(), and the last 2 values 
> are the expected results when the 3rd argument is False or True, respectively.

---------------------------------------------------------------------------
I had a first closer look at the test suite and stumbled upon four
test cases that surprised me. They belong to the normalization tests.
Why are these four URIs expected to be left unchanged by normalization?

   'example://A/b/c/%7bfoo%7d' --> 'example://A/b/c/%7bfoo%7d'

   'a/b/../../c' --> 'a/b/../../c'

   'a/b/././c' --> 'a/b/././c'

   'a/b/../c/././d' --> 'a/b/../c/././d'
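
For comparison, here is a minimal Python sketch of what a reader of RFC 3986
might expect: remove_dot_segments from section 5.2.4 and uppercasing of
percent-encoding hex digits from section 6.2.2.1. The function names are
illustrative and not taken from 4Suite. The RFC defines remove_dot_segments
for the paths that arise during reference resolution, so the sketch is only
applied to absolute variants of the paths above; applied literally to
'a/b/../../c' this sketch would return '/c', turning a relative path into an
absolute one.

```python
import re

def remove_dot_segments(path):
    """Sketch of remove_dot_segments (RFC 3986, section 5.2.4)."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]            # "/./x" becomes "/x"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]            # "/../x" becomes "/x"
            if output:
                output.pop()           # drop the previous segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to output.
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)

def uppercase_percent_hex(uri):
    """Uppercase the hex digits of percent-encoded octets (section 6.2.2.1)."""
    return re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), uri)

print(remove_dot_segments("/a/b/../../c"))     # /c
print(remove_dot_segments("/a/b/././c"))       # /a/b/c
print(remove_dot_segments("/a/b/../c/././d"))  # /a/c/d
print(uppercase_percent_hex("example://A/b/c/%7bfoo%7d"))
```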
---------------------------------------------------------------------------



> Pretty much, but you need to think in terms of path segments, not just the 
> whole URI string; otherwise you'll be tripped up by query and fragment 
> components.

---------------------------------------------------------------------------
Right. I guess I would have forgotten about that.
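
A minimal sketch of that point, using urllib.parse.urlsplit from the Python
standard library to isolate the path before splitting it into segments. The
helper name path_segments and the example URI are made up for illustration.

```python
from urllib.parse import urlsplit

def path_segments(uri):
    """Return the path segments of uri, ignoring query and fragment."""
    return urlsplit(uri).path.split("/")

uri = "http://example.org/a/b/c?x=../y#../z"
print(uri.split("/"))      # naive split also cuts into the query and fragment
print(path_segments(uri))  # only the path: ['', 'a', 'b', 'c']
```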
---------------------------------------------------------------------------



> We also do some special-casing to make sure the algorithm isn't tripped up by
> empty path segments.

---------------------------------------------------------------------------
Doesn't that change the "content" of the URI?
It might be used in a context where empty path segments must not be
stripped. Can that happen?
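
To illustrate the concern: the generic syntax in RFC 3986 allows empty path
segments (segment = *pchar), and its normalization rules do not, as far as I
can tell, remove them, so "/a//b" and "/a/b" are distinct paths at that level.
A small sketch with made-up helper names, unrelated to how 4Suite handles
this:

```python
def segments(path):
    """Split a path on "/" without dropping empty segments."""
    return path.split("/")

def segments_collapsed(path):
    """Lossy variant that drops empty segments (shown only for contrast)."""
    return [s for s in path.split("/") if s]

print(segments("/a//b"))            # ['', 'a', '', 'b']
print(segments_collapsed("/a//b"))  # ['a', 'b']
```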
---------------------------------------------------------------------------



> Due credit: This functionality was added to 4Suite by John L. Clark. It's one 
> of the few parts of our URI library that wasn't written by me or Uche Ogbuji.
> If you base your code on it, just mention in comments that it's based on code
> from 4Suite XML 1.0.

---------------------------------------------------------------------------
Unfortunately I cannot currently base my code on it. Your code is licensed
under the Apache license, which is incompatible with the GPL and, I guess,
the LGPL as well. The unit testing framework I am using is licensed under
the LGPL.

I was wondering if it is legal to use your "test data" but not your code?

More generally, it seems that your Python code does RFC 3986 URI validation,
which is still missing from the Online XSPF validator [1], which is written
in Python and licensed under the LGPL. Given these two cases of conflict with
the Apache license: would you be willing to re-license the code to be
LGPL-compatible?
---------------------------------------------------------------------------



>> Isn't ".COM" case-insensitive since it is part of the
>> host not path?
> 
> No, the entire email address is the 'path' component of the URI. The 
> 'authority' component (of which 'host' is a subcomponent) does not exist in 
> mailto URIs.

---------------------------------------------------------------------------
I see, thanks for pointing that out.
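
For what it's worth, Python's urllib.parse.urlsplit shows the same split. This
is only a quick check with a hypothetical address, not a statement about how
any mailto handler compares addresses:

```python
from urllib.parse import urlsplit

parts = urlsplit("mailto:John.Doe@EXAMPLE.COM")  # hypothetical address
print(repr(parts.netloc))  # '' (no authority component)
print(parts.path)          # John.Doe@EXAMPLE.COM
```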



Sebastian
Received on Monday, 30 July 2007 18:59:47 GMT
