Re: update on URL testing and the W3C's test framework

On Wed, 07 Nov 2012 04:27:32 +0100, Chris Weber <chris@lookout.net> wrote:

> Hello, I wanted to share some of the progress that's being made in URL
> testing.  About 600 test cases are available in JSON format here:
>
> https://raw.github.com/cweb/url-testing/master/urls.json
>
> Most of these have been collected from Webkit as referenced in the JSON
> file. While usable, there's more work required before this could be
> called stable:
>
> - The "expected" results are currently based mostly on what Webkit
> expects, as a majority of these are taken from Webkit's test framework.
> - I plan to add a descriptive comment to each test case.
> - I plan to curate each test case more closely to catch any errors in
> the data.
>
> My main question to the list:  How should the "expected results" be
> determined for each test case?

I think we should either keep the WebKit expected results for now and see
where other implementations disagree with WebKit, or change the expected
results to match Anne's URL spec and see where implementations disagree
with the spec. In the end, I think the expected results should match
Anne's URL spec.

Note that the existing RFCs don't define an expected result for invalid
URLs, and invalid URLs are among the cases being tested.
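
As a rough illustration of "see where implementations disagree", here is a
minimal Python sketch. The field names "input", "base" and "expected" are
assumptions for illustration, not necessarily what urls.json uses, and
Python's urljoin only stands in for whichever implementation is under test.

```python
from urllib.parse import urljoin


def find_disagreements(cases, resolve):
    """Return (input, expected, actual) for each case where resolve() differs.

    `cases` is a list of dicts with hypothetical keys "input", "base" and
    "expected"; `resolve(base, input)` is the implementation under test.
    """
    disagreements = []
    for case in cases:
        try:
            actual = resolve(case["base"], case["input"])
        except ValueError as exc:
            # Record a parser failure as a result instead of aborting the run.
            actual = "error: {}".format(exc)
        if actual != case["expected"]:
            disagreements.append((case["input"], case["expected"], actual))
    return disagreements


# Two made-up cases; the second has a deliberately wrong expectation.
sample_cases = [
    {"input": "../c", "base": "http://example.org/a/b",
     "expected": "http://example.org/c"},
    {"input": "d", "base": "http://example.org/a/b",
     "expected": "http://example.org/a/WRONG"},
]

report = find_disagreements(sample_cases, urljoin)
for url_input, expected, actual in report:
    print("input={!r} expected={!r} actual={!r}".format(url_input, expected, actual))
```

Running the same cases against several implementations and comparing the
reports would show which expectations are contested, whichever baseline
(WebKit or the spec) is chosen.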

> I've also started building a test page, which is in a very rough
> work-in-progress stage, located at http://www.lookout.net/test/url/.
> It's based from Simon Pieters and the W3C test framework.  See
> http://simon.html5.org/test/url/relative-resolution.html and
> http://darobin.github.com/test-harness-tutorial/docs/using-testharness.html
>
> More test cases and test groupings are always welcome.  For Web browser
> testing, further plans include using a server-side component to test
> equivalence between the DOM representation and what's sent on the wire
> in a GET request.

Very nice.

One way to generate many tests is to loop through a set of code points or
escape sequences you are interested in (e.g. %00-%FF or all of the BMP as
raw characters) and insert each one in different places in a URL (scheme,
user, password, host, port, path, query, fragment). It's also interesting
to test characters in the host that have different rules between IDNA2003
and IDNA2008.
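
The loop described above could be sketched roughly like this in Python. The
template URLs, the example.org host and the output keys are assumptions made
up for illustration. The sketch only produces inputs; the expected results
would still have to come from WebKit or from the spec.

```python
# One hypothetical template per URL component; "{}" marks the insertion point.
TEMPLATES = {
    "scheme":   "ht{}tp://user:pass@example.org:8080/path?query#frag",
    "user":     "http://us{}er:pass@example.org:8080/path?query#frag",
    "password": "http://user:pa{}ss@example.org:8080/path?query#frag",
    "host":     "http://user:pass@exam{}ple.org:8080/path?query#frag",
    "port":     "http://user:pass@example.org:80{}80/path?query#frag",
    "path":     "http://user:pass@example.org:8080/pa{}th?query#frag",
    "query":    "http://user:pass@example.org:8080/path?que{}ry#frag",
    "fragment": "http://user:pass@example.org:8080/path?query#fr{}ag",
}


def percent_escapes():
    """Yield the escape sequences %00 through %FF."""
    for byte in range(0x100):
        yield "%{:02X}".format(byte)


def bmp_characters():
    """Yield every BMP code point as a raw character, skipping surrogates."""
    for code_point in range(0x10000):
        if 0xD800 <= code_point <= 0xDFFF:
            continue  # lone surrogates are not characters
        yield chr(code_point)


def generate_cases(insertions, templates=TEMPLATES):
    """Yield one test input per (component, inserted string) pair."""
    insertions = list(insertions)  # allow reuse across all components
    for component, template in templates.items():
        for inserted in insertions:
            yield {
                "component": component,
                "inserted": inserted,
                "input": template.format(inserted),
            }


escape_cases = list(generate_cases(percent_escapes()))
print(len(escape_cases), "cases, first input:", escape_cases[0]["input"])
```

Passing bmp_characters() instead of percent_escapes() gives the raw-character
variant, which is much larger (63488 characters per component).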

> Best regards,
> Chris

cheers
-- 
Simon Pieters
Opera Software

Received on Wednesday, 7 November 2012 10:50:27 UTC