link test suite

In the past couple of weeks I've been working on a “link test suite”, a
test suite for our link checker, which I think should be easy to extend
to other link checkers and a number of other UAs (spiders, etc) meant to
parse markup, follow links and interact with HTTP servers.

I had a few long term ideas when I started, which I will detail at some
later point. For now, all I wanted was to give the link checker a simple
test suite to see what worked and what still needed work, and teach
myself unit testing in the process.

Sadly, in years of coding in Perl, I never managed to wrap my head
around the concepts of Test::Builder and Test::More. It could be me, it
could be the documentation, but I never quite managed to use the Perl
testing model efficiently. With pyunit and with Karl (thank you!) to
help me at the beginning, it took me a couple of hours to "get" it and
start organizing my code to be unit-testable. There's actually a simple
recipe...

For each class:
* build a skeleton of each method
* "think up" each method as simple and short as possible
* come up with a few inputs and expected outputs for the method, 
  ... put these as assertions in some def test_foo(): code
* write actual code
… pyunit takes care of the rest. cool.
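
To make the recipe concrete, here is a minimal sketch using the standard unittest module (pyunit), written for a current Python 3. The UriJoiner class and its join method are hypothetical, made up for illustration only; they are not part of the link test suite.

```python
import unittest


class UriJoiner:
    """Hypothetical class under test; not part of the link test suite."""

    def join(self, base, relative):
        # Last step of the recipe: the actual code, kept short and simple.
        if relative.startswith("/"):
            return base.rstrip("/") + relative
        return base.rstrip("/") + "/" + relative


class UriJoinerTest(unittest.TestCase):
    def test_join(self):
        """Join a base URI and a relative path"""
        joiner = UriJoiner()
        # A few inputs and their expected outputs, written as assertions.
        self.assertEqual(joiner.join("http://example.org/", "a.html"),
                         "http://example.org/a.html")
        self.assertEqual(joiner.join("http://example.org", "/a.html"),
                         "http://example.org/a.html")
```

Saved in a file, this can be run with "python -m unittest -v"; in verbose mode the first line of each test's docstring is used as its description in the report.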

In this case it was a little complicated at times, because writing unit
tests for a test harness to test a test tool (the link checker) has a
few too many levels of abstraction, but I think I got out of it not much
more insane than before.

I've started putting all this in CVS:
http://dev.w3.org/cvsweb/2008/link-testsuite/
(with instructions for download)

At the root are a number of HTML docs and PHP scripts used in the test cases.
There will be an index as soon as I find a python templating language I like.

The interesting stuff is in the harness directory, however. That's where
the actual test case metadata is, and the python harness to run the test
suite against the link checker.

public/2008/link-testsuite/harness% ./linktest.py 
linktest.py: 

Run or Generate test suite for link checkers

Usage: linktest.py [options] [run|sanity|doc]
    Options: 
        -h, --help: this manual you are reading
        -v, --verbose: verbose output
        -q, --quiet: suppress all output except errors
    
    Modes:
        run: run the link checker test suite 
        sanity: check that this code is still working 
            useful after using test cases or modifying code
        doc: generate an HTML index of the test cases

  for help use --help


As I said earlier, the code holds its own sanity check:

public/2008/link-testsuite/harness% ./linktest.py -v sanity
Test initialization of a default W3CLinkCheckerClient Object ... ok
Check whether our link checker can be contacted ... ok
Check whether our link checker can be asked to check a simple page ... ok
Check whether ElementTree parses basic XML and XPath support is high enough ... ok
Parse a Controlled link checker response ... ok
Test initialization of a default LinkTestCase Object ... ok
Opening and parsing a Test Case file (basic) ... ok
Opening and parsing a Test Case file (values) ... ok
Ill-formed test case files should throw an exception ... ok
Opening and parsing a Test Collection metadata file (basic) ... ok
Opening and parsing a Test Collection metadata file (values) ... ok
Test building a Test Collection ... ok
Test building the whole Link Checker Test Suite ... ok
Ran 13 tests in 3.470s
OK


I've started adding test cases, but this is definitely the tedious part. 
For now I've finished tests for all HTTP status codes and some base
href/content-location tests. I've also started adding tests for all
element/attribute pairs of the HTML 4.01 spec for which the attribute
type is %URI. Awfully tedious as it is to do by hand, it has already
turned up a few bugs in our link checker.
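
To illustrate what "element/attribute pairs typed %URI" covers, here is a rough sketch that collects such attributes from a piece of markup with Python 3's standard html.parser. The URI_ATTRIBUTES set is only a partial, hand-picked subset of the HTML 4.01 pairs, and the collector class and collect_uris function are made-up names for this example, not code from the harness.

```python
from html.parser import HTMLParser

# Partial, illustrative subset of HTML 4.01 element/attribute pairs whose
# attribute type is %URI. The full list in the spec is longer.
URI_ATTRIBUTES = {
    ("a", "href"), ("link", "href"), ("base", "href"),
    ("img", "src"), ("img", "longdesc"),
    ("frame", "src"), ("frame", "longdesc"),
    ("iframe", "src"), ("iframe", "longdesc"),
    ("script", "src"), ("form", "action"),
    ("q", "cite"), ("blockquote", "cite"),
}


class UriAttributeCollector(HTMLParser):
    """Hypothetical helper: records (element, attribute, value) triples."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands over tag and attribute names in lowercase.
        for name, value in attrs:
            if (tag, name) in URI_ATTRIBUTES and value is not None:
                self.found.append((tag, name, value))


def collect_uris(markup):
    """Return every URI-typed attribute found in the markup, in order."""
    collector = UriAttributeCollector()
    collector.feed(markup)
    collector.close()
    return collector.found
```

Each pair in such a table corresponds to one place where a link checker is expected to find, and follow, a link.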

ot@hae:~/Sites/cvs/public/2008/link-testsuite/harness% ./linktest.py -q run
======================================================================
FAIL: test base URI with relative BASE href (forbidden)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ot/Sites/cvs/public/2008/linkchecktests/harness/lib/LinkTestCase.py", line 52, in run_testcase
    self.assertEqual(self.checker.parse_checklink(self.checker.call_checklink(self.docURI)), self.expectResults)
AssertionError: {'400': '/trap/http.php?code=403'} != {'400': './trap/http.php?code=403'}

======================================================================
FAIL: test dereference BASE href (404)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ot/Sites/cvs/public/2008/linkchecktests/harness/lib/LinkTestCase.py", line 52, in run_testcase
    self.assertEqual(self.checker.parse_checklink(self.checker.call_checklink(self.docURI)), self.expectResults)
AssertionError: {} != {'404': 'http://qa-dev.w3.org/link-testsuite/thisURIdoesnotexist/'}

======================================================================
FAIL: test frame longdesc (404)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ot/Sites/cvs/public/2008/linkchecktests/harness/lib/LinkTestCase.py", line 52, in run_testcase
    self.assertEqual(self.checker.parse_checklink(self.checker.call_checklink(self.docURI)), self.expectResults)
AssertionError: {} != {'404': 'http://qa-dev.w3.org/link-testsuite/http.php?code=404'}

======================================================================
FAIL: test iframe longdesc (404)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ot/Sites/cvs/public/2008/linkchecktests/harness/lib/LinkTestCase.py", line 52, in run_testcase
    self.assertEqual(self.checker.parse_checklink(self.checker.call_checklink(self.docURI)), self.expectResults)
AssertionError: {} != {'404': 'http://qa-dev.w3.org/link-testsuite/http.php?code=404'}

======================================================================
FAIL: test reporting HTTP 404 (DNS error)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ot/Sites/cvs/public/2008/linkchecktests/harness/lib/LinkTestCase.py", line 52, in run_testcase
    self.assertEqual(self.checker.parse_checklink(self.checker.call_checklink(self.docURI)), self.expectResults)
AssertionError: {'-2': 'http://doesnotexist.w3.org/'} != {'404': 'http://doesnotexist.w3.org/'}

----------------------------------------------------------------------
Ran 35 tests in 88.036s


I'd like to find a way to fine-tune (or extend) pyunit to give nicer
info on why an assertion has failed. For example, in the case of the
relative BASE URI, the issue is that the link checker reports a bogus
URI, different from the one I am expecting. For the HTTP 404 (DNS
error) test, it's because the "code" for the error is -2 (DNS resolution
failed) while I was expecting a 404. Still, I found out that our
checker doesn't test everything it could test.
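
One possible way to get friendlier messages, sketched here under the assumption that results are dictionaries mapping a status code to a URI (as in the failures above), is a custom assertion that explains each difference before failing. The names describe_result_diff, ResultAssertions and assertResultsEqual are made up for this sketch; they are not part of pyunit or of the harness.

```python
import unittest


def describe_result_diff(actual, expected):
    """Return human-readable lines explaining how two result dicts differ."""
    lines = []
    for code in sorted(set(actual) | set(expected)):
        if code not in actual:
            lines.append("missing: expected code %s for %s"
                         % (code, expected[code]))
        elif code not in expected:
            lines.append("unexpected: got code %s for %s"
                         % (code, actual[code]))
        elif actual[code] != expected[code]:
            lines.append("code %s: got %s, expected %s"
                         % (code, actual[code], expected[code]))
    return lines


class ResultAssertions(unittest.TestCase):
    """Hypothetical base class for test cases that compare checker results."""

    def assertResultsEqual(self, actual, expected):
        problems = describe_result_diff(actual, expected)
        if problems:
            # fail() raises the usual AssertionError, with our own message.
            self.fail("\n".join(problems))
```

With the DNS error case above, the message would list the unexpected -2 entry and the missing 404 entry on separate lines instead of printing the two dictionaries side by side.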

That's already something.

I'll keep adding test cases. If anyone wants to join, the list of tests
that still need to be written is in:
http://dev.w3.org/cvsweb/2008/link-testsuite/README


-- 
olivier Thereaux - W3C - http://www.w3.org/People/olivier/
W3C Open Source Software: http://www.w3.org/Status

Received on Monday, 28 January 2008 07:00:23 UTC