testing URL decomposition (related to ISSUE-56)


I just wrote a few test cases testing the behavior of the URL 
decomposition attributes (see 
in order to observe how UAs implement the algorithms specified in the 
"WEBADDRESSES" spec, which is in transition (currently referenced from 
HTML5: <http://www.w3.org/html/wg/href/draft>).

The short answer is: they apparently don't, and I haven't even started 
writing nasty test cases. So the behavior described in WEBADDRESSES might 
be required for interoperability in some other areas, but certainly 
*not* for the behavior of the decomposition attributes.

The test cases are here: 
<http://greenbytes.de/tech/webdav/urldecomp.html>, generated from 
<http://greenbytes.de/tech/webdav/urldecomp.xml> using 
<http://greenbytes.de/tech/webdav/urldecomp.xslt>. As you can see, I 
prefer XSLT over JS.

I realize that some of these tests may be incorrect; corrections and 
additions are welcome.

The main differences that I see:

- some UAs unescape percent-encoded characters, some don't

- port defaulting varies

- prefixing of path with "/" varies

- some UAs choke (throw exceptions) on certain malformed URIs

- fragment unescaping varies

- treatment of non-ASCII in the authority component varies
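
To make these categories concrete, here is a minimal sketch in Python. It 
is not one of the test cases and does not reproduce any UA's behavior; it 
only shows one possible set of choices for the points listed above. The 
helper name decompose is hypothetical, and the dictionary keys are 
borrowed from the names of the decomposition attributes.

```python
from urllib.parse import urlsplit, unquote

def decompose(url):
    """Split a URL into pieces named after the decomposition attributes."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme + ":",
        "hostname": parts.hostname,
        "port": parts.port,        # None when the URL gives no port (no defaulting)
        "pathname": parts.path,    # "" when the URL has no path (no "/" prefixed)
        "search": "?" + parts.query if parts.query else "",
        "hash": "#" + parts.fragment if parts.fragment else "",  # kept percent-encoded
    }

bare = decompose("http://example.com")
full = decompose("http://example.com:80/a%20b?q=1#x%41")

print(bare["port"], repr(bare["pathname"]))
print(full["port"], full["pathname"], full["hash"])
print(unquote(full["pathname"]))  # unescaping is a separate, explicit step here
```

In this sketch nothing is unescaped or defaulted unless the caller asks 
for it. urlsplit also raises ValueError for some malformed input, such as 
an unclosed "[" in the authority, which is comparable to the UAs that 
throw exceptions.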

I don't see any kind of interoperability here, not even for the simplest 
test cases.

Maybe it's time to deprecate this mess (does anybody use this?), and 
define a sane URI/IRI library instead?

Best regards, Julian

Received on Wednesday, 10 February 2010 13:50:13 UTC