Re: student project idea: RDF/RDFa parser QA via automatic test-suite generation

Manu Sporny wrote:
> Dan Brickley wrote:
>> I'd like to see an auto-generated repository of RDFa samples, most (but
>> not all) of which are decent wellformed XHTML with RDFa, but also with a
>> good number of poorly-marked up files. 
> +1 - sounds like a worthwhile project.

Thanks for the sanity check.
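
To make the idea a bit more concrete, here is a minimal sketch of such a
generator in Python. Everything in it is an assumption for illustration:
the template, the example.org URIs, the make_sample name and the 20%
default ratio of broken files are made up, and a real generator would
need many more kinds of markup and many more kinds of damage.

```python
import random

# Hypothetical template: one subject, one dc:title property.
TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <head><title>Sample {n}</title></head>
  <body>
    <p about="http://example.org/doc/{n}">
      <span property="dc:title">Title {n}</span>
    </p>
  </body>
</html>
"""


def make_sample(n, rng, broken_ratio=0.2):
    """Return (document, is_wellformed) for sample number n.

    Most samples are well-formed; roughly broken_ratio of them have
    the closing </span> removed, which makes the XML ill-formed.
    """
    doc = TEMPLATE.format(n=n)
    if rng.random() < broken_ratio:
        # Only one kind of damage here: drop the first closing tag.
        return doc.replace("</span>", "", 1), False
    return doc, True


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so the corpus is reproducible
    corpus = [make_sample(n, rng) for n in range(1000)]
    print(sum(1 for _, ok in corpus if not ok), "broken samples of", len(corpus))
```

Seeding the random generator matters more than it looks: if a parser
falls over on file 731, everyone needs to be able to regenerate exactly
that file.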

> There are two permutations of this approach:
> The first involves generating valid and invalid XHTML+RDFa to see if the
> parsers can make it through the file. Did the parser dump core or did it
> exit with a good status code?

Yup. Thankfully there aren't many parsers in languages that core dump, 
but we ought to put the precious few we have through such an exercise.
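
A rough sketch of that first check, assuming each parser can be driven
as a command-line program (the classify_run name, the four labels and
the 10 second timeout are my own invention, not part of any existing
harness):

```python
import subprocess
import sys


def classify_run(command, timeout=10.0):
    """Run one parser invocation and classify how it ended.

    Returns "ok", "error", "crashed" or "timeout".
    """
    try:
        proc = subprocess.run(command, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "timeout"
    if proc.returncode < 0:
        # On POSIX a negative return code means the process was killed
        # by a signal (e.g. SIGSEGV), i.e. the "dumped core" case.
        return "crashed"
    return "ok" if proc.returncode == 0 else "error"


if __name__ == "__main__":
    # Stand-in for a real parser command line such as
    # ["some-rdfa-parser", "sample-0001.xhtml"] (hypothetical).
    print(classify_run([sys.executable, "-c", "raise SystemExit(0)"]))
```

Note that the negative-return-code convention is POSIX only; on Windows
a crash shows up as a large positive exit code and would be reported as
"error" by this sketch.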

> The second involves generating valid XHTML+RDFa as well as the
> corresponding SPARQL files such that they can be hooked up to the RDFa
> Test harness. Did the parser exit with a good status code AND did the
> SPARQL evaluate to TRUE?
>> Generating such a test set and then wiring it up to a set of RDFa
>> parsers (via or
>> something like it) shouldn't be a huge job
> It would be fairly straight-forward to do this - the RDFa Test Harness
> is already setup for use-cases like what you are describing. We would need:
> A manifest file[1], and a set of matching RDFa+XHTML files and their
> corresponding SPARQL files[2].

Great! I wonder how we could go from 1000 random RDFa files to have the 
appropriate SPARQL tests too? Perhaps if 3+ parsers agreed on the output?
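
Something like the following is what I have in mind, as a sketch only.
It assumes each parser's output has already been reduced to a set of
(subject, predicate, object) strings in SPARQL term syntax, and that
there are no blank nodes (comparing graphs with blank nodes needs
proper graph isomorphism, which this ignores). The function names and
the quorum of 3 are hypothetical.

```python
from collections import Counter


def consensus_triples(outputs, quorum=3):
    """Return the triple set produced by at least `quorum` parsers.

    `outputs` maps a parser name to the set of triples it extracted.
    Returns None when no single result reaches the quorum.
    """
    counts = Counter(frozenset(triples) for triples in outputs.values())
    if not counts:
        return None
    graph, votes = counts.most_common(1)[0]
    return graph if votes >= quorum else None


def to_ask_query(triples):
    """Build a SPARQL ASK query that is true when every triple is present."""
    patterns = "\n".join(f"  {s} {p} {o} ." for s, p, o in sorted(triples))
    return "ASK WHERE {\n" + patterns + "\n}"


if __name__ == "__main__":
    title = (
        "<http://example.org/doc/1>",
        "<http://purl.org/dc/elements/1.1/title>",
        '"Title 1"',
    )
    outputs = {
        "parser_a": {title},
        "parser_b": {title},
        "parser_c": {title},
        "parser_d": set(),  # the odd one out
    }
    agreed = consensus_triples(outputs)
    if agreed is not None:
        print(to_ask_query(agreed))
```

Two caveats: an ASK query built this way only checks that the agreed
triples are present, not that a parser produced nothing extra, and
three parsers agreeing does not make them right. Files where the
parsers disagree are probably the interesting ones to look at by hand.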

>> (c) whether the spec gurus agree on what ought to be generated.
> I don't suggest getting the spec gurus involved in most of the 1000 test
> cases. On the RDFa telecons, it takes us roughly 5-10 minutes to get
> through the simple, straight-forward test cases... and that's after
> we've reviewed them offline. I'd lean on the spec gurus only when there
> is a disagreement between the parser writers on what should happen.

Agreed. Ideally it should be obvious to everyone after a careful reading 
of the relevant specs. Playing Ask The Guru should be a matter of last 
resort :)

> This would be a great summer project for a student. I'd be willing to
> lend advice and help integrating with the RDFa Test Harness.

That would be fantastic. I hope we don't have to wait until next 
(Northern-hemisphere) summer. Hope we can track down some volunteers 
before then.

Hmm, is there a 'student projects for the Semantic Web' page anywhere?

Searching around I find
(2003 :) (2001!)
and... for MIT students and seemingly current, ... there are also a
few Google Summer of Code pages scattered around.

If there's interest, maybe we can have an updated wiki index of student 
project ideas?



> -- manu
> [1]
> [2]

Received on Tuesday, 18 November 2008 15:52:14 UTC