Re: URI Test Suite

You remind me - I logged a bug with Python a while back regarding
their urlparse module; it represents a URI as a six-tuple

  ( scheme, authority, path, parameters, query, fragment )

As I read 2396, parameters can occur as a suffix to *each* path
segment, not only on the final one. Guido's response was "great! fix
it!", but I haven't had time to make a go of it.

URI list, is this reading correct? Any opinions as to what data
structure is best to represent a parsed URI with? (I'm thinking just
make it a 5-tuple, taking out 'parameters', and optionally making
'path' a list of 2-tuples, but that's probably not useful in a lot of
scenarios).




On Wed, Aug 08, 2001 at 04:15:20PM -0500, Aaron Swartz wrote:
> Below is a test suite for Python based on Appendix C of RFC2396. 
> Python's built-in urlparse module fails the test suite in a 
> number of instances. I used a similar test suite to validate my 
> Tcl URI parser.
> 
> Enjoy,
> --
> [ "Aaron Swartz" ; <mailto:me@aaronsw.com> ; <http://www.aaronsw.com/> ]
> 
> import urlparse
> 
> base = 'http://a/b/c/d;p?q'
> 
> assert urlparse.urljoin(base, 'g:h') == 'g:h'
> assert urlparse.urljoin(base, 'g') ==   'http://a/b/c/g'
> assert urlparse.urljoin(base, './g') == 'http://a/b/c/g'
> assert urlparse.urljoin(base, 'g/') ==  'http://a/b/c/g/'
> assert urlparse.urljoin(base, '/g') ==  'http://a/g'
> assert urlparse.urljoin(base, '//g') == 'http://g'
> assert urlparse.urljoin(base, '?y') ==  'http://a/b/c/?y'
> assert urlparse.urljoin(base, 'g?y') == 'http://a/b/c/g?y'
> assert urlparse.urljoin(base, '#s') ==  'http://a/b/c/d;p?q#s'
> assert urlparse.urljoin(base, 'g#s') == 'http://a/b/c/g#s'
> assert urlparse.urljoin(base, 'g?y#s') == 'http://a/b/c/g?y#s'
> assert urlparse.urljoin(base, ';x') == 'http://a/b/c/;x'
> assert urlparse.urljoin(base, 'g;x') ==  'http://a/b/c/g;x'
> assert urlparse.urljoin(base, 'g;x?y#s') == 'http://a/b/c/g;x?y#s'
> assert urlparse.urljoin(base, '.') ==  'http://a/b/c/'
> assert urlparse.urljoin(base, './') ==  'http://a/b/c/'
> assert urlparse.urljoin(base, '..') ==  'http://a/b/'
> assert urlparse.urljoin(base, '../') ==  'http://a/b/'
> assert urlparse.urljoin(base, '../g') ==  'http://a/b/g'
> assert urlparse.urljoin(base, '../..') ==  'http://a/'
> assert urlparse.urljoin(base, '../../') ==  'http://a/'
> assert urlparse.urljoin(base, '../../g') ==  'http://a/g'
> 
> assert urlparse.urljoin(base, '') == base
> 
> assert urlparse.urljoin(base, '../../../g')    ==  'http://a/../g'
> assert urlparse.urljoin(base, '../../../../g') ==  'http://a/../../g'
> 
> assert urlparse.urljoin(base, '/./g') ==  'http://a/./g'
> assert urlparse.urljoin(base, '/../g')         ==  'http://a/../g'
> assert urlparse.urljoin(base, 'g.')            ==  'http://a/b/c/g.'
> assert urlparse.urljoin(base, '.g')            ==  'http://a/b/c/.g'
> assert urlparse.urljoin(base, 'g..')           == 'http://a/b/c/g..'
> assert urlparse.urljoin(base, '..g')           == 'http://a/b/c/..g'
> 
> assert urlparse.urljoin(base, './../g')        ==  'http://a/b/g'
> assert urlparse.urljoin(base, './g/.')         ==  'http://a/b/c/g/'
> assert urlparse.urljoin(base, 'g/./h')         ==  'http://a/b/c/g/h'
> assert urlparse.urljoin(base, 'g/../h')        ==  'http://a/b/c/h'
> assert urlparse.urljoin(base, 'g;x=1/./y')     ==  
> 'http://a/b/c/g;x=1/y'
> assert urlparse.urljoin(base, 'g;x=1/../y')    ==  'http://a/b/c/y'
> 
> assert urlparse.urljoin(base, 'g?y/./x')       ==  
> 'http://a/b/c/g?y/./x'
> assert urlparse.urljoin(base, 'g?y/../x')      == 
> 'http://a/b/c/g?y/../x'
> assert urlparse.urljoin(base, 'g#s/./x')       ==  
> 'http://a/b/c/g#s/./x'
> assert urlparse.urljoin(base, 'g#s/../x')      ==  
> 'http://a/b/c/g#s/../x'
> 

-- 
Mark Nottingham, Research Scientist
Akamai Technologies (San Mateo, CA USA)

Received on Friday, 10 August 2001 23:43:48 UTC