testing URL decomposition (related to ISSUE-56)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Wed, 10 Feb 2010 14:49:33 +0100
Message-ID: <4B72B96D.7020402@gmx.de>
To: "public-html@w3.org" <public-html@w3.org>
Hi,

I just wrote a few test cases testing the behavior of the URL 
decomposition attributes (see 
<http://dev.w3.org/html5/spec/infrastructure.html#interfaces-for-url-manipulation>), 
in order to observe how UAs implement the algorithms specified in the 
in-transition "WEBADDRESSES" spec (currently referenced from 
HTML5: <http://www.w3.org/html/wg/href/draft>).

The short answer is: they apparently don't, and I didn't even start to 
write nasty test cases. So the behavior described in WEBADDRESSES might 
be required for interoperability in some other areas, but certainly 
*not* for the behavior of the decomposition attributes.

The test cases are here: 
<http://greenbytes.de/tech/webdav/urldecomp.html>, generated from 
<http://greenbytes.de/tech/webdav/urldecomp.xml> using 
<http://greenbytes.de/tech/webdav/urldecomp.xslt>. As you can see, I 
prefer XSLT over JS.

I realize that some of these tests may be incorrect; corrections and 
additions are welcome.

The main differences that I see:

- some UAs unescape percent-encoded characters, some don't

- port defaulting varies

- prefixing of path with "/" varies

- some UAs choke (throw exceptions) on certain malformed URIs

- fragment unescaping varies

- treatment of non-ASCII in the authority component varies
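
To make a few of these differences concrete, here is a minimal sketch in Python. It uses the standard urllib.parse module, not any UA's actual implementation, and the function name, the flags, and the default-port table are all hypothetical, chosen only for illustration. Each flag switches on one of the behaviours listed above (port defaulting, "/" prefixing, unescaping), so the same input can produce different decompositions.

```python
from urllib.parse import urlsplit, unquote

# Hypothetical default-port table, for illustration only.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def decompose(url, default_port=False, unescape=False, root_path=False):
    """Split a URL into parts named after the decomposition attributes.

    Each flag enables one behaviour that varies between implementations:
      default_port: report the scheme's default port when none is given
      unescape:     percent-decode the path and the fragment
      root_path:    report "/" when the URL has an empty path
    """
    parts = urlsplit(url)

    port = parts.port  # None when the URL carries no explicit port
    if port is None and default_port:
        port = DEFAULT_PORTS.get(parts.scheme)

    path = parts.path
    if not path and root_path:
        path = "/"

    fragment = parts.fragment
    if unescape:
        path = unquote(path)
        fragment = unquote(fragment)

    return {
        "protocol": parts.scheme + ":",
        "hostname": parts.hostname,
        "port": "" if port is None else str(port),
        "pathname": path,
        "search": "?" + parts.query if parts.query else "",
        "hash": "#" + fragment if fragment else "",
    }


if __name__ == "__main__":
    url = "http://example.com/a%20b#c%20d"
    print(decompose(url))
    print(decompose(url, default_port=True, unescape=True))
```

Running the two calls side by side shows "pathname", "port" and "hash" disagreeing for the same URL, which is the kind of divergence the tests expose.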

I don't see any kind of interoperability here, not even for the simplest 
test cases.

Maybe it's time to deprecate this mess (does anybody use this?), and 
define a sane URI/IRI library instead?

Best regards, Julian
Received on Wednesday, 10 February 2010 13:50:13 UTC