Re: student project idea: RDF/RDFa parser QA via automatic test-suite generation

Manu Sporny wrote:
> Dan Brickley wrote:
>> I'd like to see an auto-generated repository of RDFa samples, most (but
>> not all) of which are decent wellformed XHTML with RDFa, but also with a
>> good number of poorly-marked up files. 
> 
> +1 - sounds like a worthwhile project.

Thanks for the sanity check.

> There are two permutations of this approach:
> 
> The first involves generating valid and invalid XHTML+RDFa to see if the
> parsers can make it through the file. Did the parser dump core or did it
> exit with a good status code?

Yup. Thankfully there aren't many parsers written in languages that can
dump core, but we ought to put the precious few we have through such an
exercise.
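
To make that first check concrete, here is a rough Python sketch of
running one parser over one file and classifying how the run ended. It
assumes each parser can be invoked as a command line; the function name
classify_run and the four outcome labels are made up for illustration,
and the "crashed" case relies on POSIX behaviour, where a negative return
code means the process was killed by a signal (e.g. a segfault).

```python
import subprocess
import sys


def classify_run(command, timeout=10):
    """Run one parser invocation and classify how it ended.

    `command` is an argument list such as ["some-rdfa-parser", "file.xhtml"]
    (a hypothetical command line; substitute the real parser invocation).
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"   # parser hung on the input
    except OSError:
        return "not-run"   # command missing or not executable
    if result.returncode == 0:
        return "ok"        # exited with a good status code
    if result.returncode < 0:
        return "crashed"   # killed by a signal (POSIX only)
    return "error"         # exited, but reported failure


# Stand-in "parser": the Python interpreter itself, exiting cleanly.
print(classify_run([sys.executable, "-c", "pass"]))  # prints "ok"
```

For deliberately broken input, "error" is arguably a fine outcome; it is
"crashed" and "timeout" that would be worth reporting to parser authors.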

> The second involves generating valid XHTML+RDFa as well as the
> corresponding SPARQL files such that they can be hooked up to the RDFa
> Test harness. Did the parser exit with a good status code AND did the
> SPARQL evaluate to TRUE?
> 
>> Generating such a test set and then wiring it up to a set of RDFa
>> parsers (via http://rdfa.digitalbazaar.com/rdfa-test-harness/ or
>> something like it) shouldn't be a huge job
> 
> It would be fairly straightforward to do this - the RDFa Test Harness
> is already set up for use cases like the one you are describing. We would need:
> 
> A manifest file[1], and a set of matching RDFa+XHTML files and their
> corresponding SPARQL files[2].

Great! I wonder how we could get from 1000 random RDFa files to the
appropriate SPARQL tests as well. Perhaps we could accept an output as
the expected one if 3+ parsers agreed on it?
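
A minimal sketch of that agreement idea, in Python. It assumes each
parser's output has already been reduced to a set of (subject, predicate,
object) tuples whose terms are written in N-Triples style, and that blank
nodes are either absent or already canonicalised (otherwise identical
graphs would not compare equal). The names consensus_triples and
ask_query are invented here, and the ASK form is only my guess at what
the harness would want, based on the "did the SPARQL evaluate to TRUE"
description.

```python
from collections import Counter


def consensus_triples(outputs, quorum=3):
    """Return the triple set that at least `quorum` parsers agree on.

    `outputs` maps a parser name to the set of triples it extracted.
    Returns None if no output, or more than one, reaches the quorum.
    """
    counts = Counter(frozenset(triples) for triples in outputs.values())
    agreed = [triples for triples, votes in counts.items() if votes >= quorum]
    return agreed[0] if len(agreed) == 1 else None


def ask_query(triples):
    """Build a SPARQL ASK query that is true when all triples are present."""
    patterns = "\n".join(f"  {s} {p} {o} ." for s, p, o in sorted(triples))
    return "ASK {\n" + patterns + "\n}"


title = (
    "<http://example.org/doc>",
    "<http://purl.org/dc/elements/1.1/title>",
    '"Test"',
)
outputs = {
    "parser-a": {title},
    "parser-b": {title},
    "parser-c": {title},
    "parser-d": set(),  # the odd one out
}
agreed = consensus_triples(outputs)
if agreed is not None:
    print(ask_query(agreed))
```

Note the explicit "is not None" check: parsers agreeing that a file
contains no triples at all is a legitimate (empty) consensus. Files where
no quorum is reached are exactly the ones worth a human look.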

>> (c) whether the spec gurus agree on what ought to be generated.
> 
> I don't suggest getting the spec gurus involved in most of the 1000 test
> cases. On the RDFa telecons, it takes us roughly 5-10 minutes to get
> through the simple, straight-forward test cases... and that's after
> we've reviewed them offline. I'd lean on the spec gurus only when there
> is a disagreement between the parser writers on what should happen.

Agreed. Ideally it should be obvious to everyone after a careful reading 
of the relevant specs. Playing Ask The Guru should be a matter of last 
resort :)

> This would be a great summer project for a student. I'd be willing to
> lend advice and help integrating with the RDFa Test Harness.

That would be fantastic. I hope we don't have to wait until next 
(Northern-hemisphere) summer. Hope we can track down some volunteers 
before then.

Hmm, is there a 'student projects for the Semantic Web' page anywhere?

Searching around, I find:

http://www.hpl.hp.com/semweb/student-work.htm (2007)
http://www.ilrt.bris.ac.uk/discovery/2003/02/student-projects/weblog-recommender.html
http://www.ilrt.bristol.ac.uk/discovery/2003/01/student-projects/index.html (2003 :)
http://www.cse.lehigh.edu/~heflin/courses/sw-fall01/ (2001!)

and, for MIT students and seemingly current,
http://www.w3.org/People/Berners-Lee/Research.html

There are also a few Google Summer of Code pages scattered around, e.g.
http://semanticweb.deit.univpm.it/tiki-index.php?page=ProjectProposalPage

If there's interest, maybe we can have an updated wiki index of student
project ideas?

cheers,

Dan


> -- manu
> 
> [1]http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/rdfa-xhtml1-test-manifest.rdf
> [2]http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/Test0001
> 

Received on Tuesday, 18 November 2008 15:52:14 UTC