GRDDL Test Cases

Abstract

This document describes and includes test cases for software agents that extract RDF from XML source documents by following the set of mechanisms outlined in the Gleaning Resource Description from Dialects of Language (GRDDL) specification. They demonstrate the expected behavior of a GRDDL-aware agent by specifying one (or more) RDF graph serializations which are the GRDDL results associated with a single source document.

Introduction
Deliverables
Test Manifest Format
Using the Test Driver
EARL Reporting
Protocol Tracing
Local Policies, Faithful Rendition, and Conformance
Normative Tests
Informative Tests
References
Acknowledgements

Introduction

The tests are meant to satisfy the requirement for test cases covering all GRDDL features and library transformations as outlined in the GRDDL Working Group charter. They should be used for testing the conformance of GRDDL-aware agents. The normative tests cover the required behavior expected of a GRDDL-aware agent. The informative tests demonstrate expected behavior with respect to the issues resolved by the Working Group as well as other tests of robustness for software agents which consume source documents within a Semantic Web and generate GRDDL results. This document itself has (as a GRDDL result) a manifest document describing the test cases in RDF.

Deliverables

The deliverables included as part of the test case collection are:

A recommendation track document which normatively includes the tests (@@ Still an open issue?)
A zip archive including:
- The input and output(s) for each test
- An RDF/XML serialization of the test manifest
- The test driver
A manifest RDF/XML document describing the collection of tests
A driver for use with testing a particular implementation

Test Manifest Format

This test collection uses an RDF vocabulary for manifests developed for the RDF Test Cases Recommendation. A GRDDL-aware agent can extract the test collection and automatically test compliance by attempting to reproduce the expected GRDDL result(s) associated with each test case.
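
As a rough illustration of how an agent might walk the manifest, the sketch below lists the input and expected output of each test. It assumes that the manifest uses the inputDocument and outputDocument properties of the RDF Test Cases vocabulary and that it is serialized as flat RDF/XML node elements. The sample manifest, its example.org URIs, and the helper name list_tests are hypothetical, and a real harness would use a full RDF parser (such as rdflib) rather than ElementTree.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Namespace of the RDF Test Cases manifest vocabulary (assumed for this sketch).
TEST_NS = "http://www.w3.org/2000/10/rdf-tests/rdfcore/testSchema#"

# Hypothetical manifest with a single test entry.
SAMPLE_MANIFEST = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:test="http://www.w3.org/2000/10/rdf-tests/rdfcore/testSchema#">
  <rdf:Description rdf:about="http://example.org/tests#t1">
    <test:inputDocument rdf:resource="http://example.org/tests/t1-input.html"/>
    <test:outputDocument rdf:resource="http://example.org/tests/t1-output.rdf"/>
  </rdf:Description>
</rdf:RDF>
"""


def list_tests(manifest_xml):
    """Return (test URI, input URI, expected output URI) for each test entry."""
    root = ET.fromstring(manifest_xml)
    tests = []
    for node in root:
        source = node.find("{%s}inputDocument" % TEST_NS)
        expected = node.find("{%s}outputDocument" % TEST_NS)
        if source is None or expected is None:
            continue  # not a test entry in the assumed shape
        tests.append((
            node.get("{%s}about" % RDF_NS),
            source.get("{%s}resource" % RDF_NS),
            expected.get("{%s}resource" % RDF_NS),
        ))
    return tests


for test_uri, source_uri, expected_uri in list_tests(SAMPLE_MANIFEST):
    print(test_uri, source_uri, expected_uri)
```

Each tuple gives the agent a source document to process and the graph it is expected to reproduce.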

Using the Test Driver

We provide testft.py, a test driver based on rdflib 2.3.3 and 4Suite (specifically 4Suite-XML-1.0.tar.gz). Run it as follows:

$ python testft.py --run your_grddl_impl testlist1.rdf >earl_out.rdf
All tests were passed!

@@ NOTE: testft.py currently only depends on rdflib 2.3.3

It has options for --debug and such; invoke it with no arguments (or with --help) for details:

Options:
  -r, --run              path to a GRDDL implementation to use to process the 
                         source document (checking results)
  -u, --update           path to a GRDDL Implementation to use to process the 
                         source document
      --tester           The URI of an agent associated with the EARL test assertions.
                         A BNode is used if none is given                          
      --project          The URI of the EARL 'subject' (the implementation being tested).
                         A BNode is used if none is given

EARL Reporting

In addition to writing various diagnostic messages to STDERR, the test harness writes additional RDF data to STDOUT: an EARL assertion about each test it runs.

To tell it about the person running the tests and the software project being tested, point it to a tester (a URI in a FOAF RDF graph) and a test subject (a URI in a DOAP RDF graph).
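
The exact shape of the harness output is not spelled out here. As a sketch only, the following shows what one assertion per test could look like as N-Triples, assuming the terms of the EARL 1.0 Schema at http://www.w3.org/ns/earl# (the vocabulary that testft.py actually emits may differ). The function name earl_assertion and the example.org URIs are hypothetical. As with the --tester and --project options, a blank node stands in when no URI is given.

```python
import itertools

# Assumed namespace: EARL 1.0 Schema. The harness may use a different one.
EARL = "http://www.w3.org/ns/earl#"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

_bnode_ids = itertools.count()


def _node(uri, hint):
    """Return an N-Triples term: a URI reference, or a fresh blank node."""
    if uri is None:
        return "_:%s%d" % (hint, next(_bnode_ids))
    return "<%s>" % uri


def earl_assertion(test_uri, passed, tester=None, project=None):
    """Serialize one EARL assertion about one test as N-Triples."""
    assertion = _node(None, "assertion")
    result = _node(None, "result")
    outcome = "passed" if passed else "failed"
    triples = [
        (assertion, RDF_TYPE, "<%sAssertion>" % EARL),
        (assertion, "<%sassertedBy>" % EARL, _node(tester, "tester")),
        (assertion, "<%ssubject>" % EARL, _node(project, "project")),
        (assertion, "<%stest>" % EARL, _node(test_uri, "test")),
        (assertion, "<%sresult>" % EARL, result),
        (result, RDF_TYPE, "<%sTestResult>" % EARL),
        (result, "<%soutcome>" % EARL, "<%s%s>" % (EARL, outcome)),
    ]
    return "\n".join("%s %s %s ." % triple for triple in triples)


print(earl_assertion(
    "http://example.org/tests#t1",
    passed=True,
    tester="http://example.org/people#me",
    project="http://example.org/projects#my-grddl",
))
```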

Protocol Tracing

We find TCPWatch useful for debugging HTTP protocol interactions. If you start TCPWatch like so:

$ python tcpwatch.py -p 6543 &

then you can use it as a proxy:

$ http_proxy=http://127.0.0.1:6543 python testft.py --run your_grddl_impl testharness.rdf
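
The http_proxy variable only helps if the HTTP client honors it. As a small sanity check, and assuming the code under test fetches documents through Python's urllib (other HTTP libraries have their own proxy settings), the sketch below shows that urllib picks up the proxy from the environment.

```python
import os
import urllib.request

# Same address as the TCPWatch example above.
os.environ["http_proxy"] = "http://127.0.0.1:6543"

# getproxies() reads the *_proxy environment variables when they are set.
proxies = urllib.request.getproxies()
print(proxies.get("http"))
```

If the printed value is not the TCPWatch address, requests from that process are unlikely to show up in the trace.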

Local Policies, Faithful Rendition, and Conformance

The GRDDL specification states that any transformation identified by an author of a GRDDL source document will provide a Faithful Rendition of the information expressed in the source document. The specification also grants a GRDDL-aware agent the license to make a determination of whether or not to apply a particular transformation guided by user interaction, a local security policy, or the agent's capabilities. However, for the purpose of running these tests in order to determine compliance, a GRDDL-aware agent with a security policy which does not prevent it from applying transformations identified by each test will produce the GRDDL result associated with each test.

Tests with Multiple GRDDL Results

Certain tests have multiple GRDDL results as a direct consequence of Faithful Infoset considerations, information resources with multiple representations, and separate GRDDL mechanisms which produce distinct GRDDL results. For such tests, a GRDDL-aware agent should output at least one of the GRDDL results associated with the test case.
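
The "at least one" rule can be sketched as a simple check. The code below is a minimal illustration, not the logic of the test driver: it models a graph as a set of (subject, predicate, object) string tuples and assumes the graphs contain no blank nodes, so that set equality suffices (a real comparison would need graph isomorphism). The function name and the example.org triples are hypothetical.

```python
def matches_any_expected(actual, expected_results):
    """Return True if the actual graph equals at least one expected graph.

    Graphs are iterables of (subject, predicate, object) tuples and are
    assumed to be ground (no blank nodes), so set equality is enough.
    """
    actual = frozenset(actual)
    return any(actual == frozenset(expected) for expected in expected_results)


# Two hypothetical, equally valid results for one source document.
result_a = {("http://example.org/doc", "http://example.org/title", "Hello")}
result_b = {("http://example.org/doc", "http://example.org/title", "Hallo")}

print(matches_any_expected(result_b, [result_a, result_b]))
print(matches_any_expected(set(), [result_a, result_b]))
```

Under these assumptions the first call reports a pass, because the output equals one of the two expected results, and the second reports a failure.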

Normative Tests

These tests address the functional requirements of the various ways in which GRDDL mandates the extraction of a GRDDL result.

input, output

@@Broken WRT #issue-mt-ns (base case rule applies)

input, output (@@something is funky with the fax field)

Ambiguous Infosets and Representations

These tests help check for robustness of implementations in the face of various odd cases.

a loop in the namespace names input output
@@ACK: thanks Gokhan Soydan
xslt_literal_result input output is empty; is that right??@@
issue-mt-ns
Testing GRDDL when XInclude processing is enabled input output
@@Does this work on Xalan and Saxon??@@

In this test case, the input file uses XInclude to include xinclude2.xml, and the output has only one triple unless the XML processor of the GRDDL implementation implements XInclude. The output for this case assumes that the processor does resolve XIncludes. Note, however, that this test case subsumes the XInclude disabled test case, which assumes that the GRDDL implementation has disabled XInclude processing.

See also issue-mt-ns.
Testing GRDDL when XInclude processing is disabled input output
This test case is an alternative to the XInclude enabled test case. The output for this case assumes that the processor does not resolve XIncludes, which may lead to a different GRDDL result.
Testing GRDDL attributes on RDF documents (1 of 3) input output
Note that the input is an RDF document with a GRDDL transformation, and that according to the rules given by the GRDDL Specification, there are three distinct and equally valid output graphs for this document. An implementation only has to produce one of these three. This output is the result of the transformation without merging it with the graph of the source document.

issue-mt-ns
Testing GRDDL attributes on RDF documents (2 of 3) input output
See the explanation of having three valid outputs for this test case. This output is a graph that is identical with the graph given by the input document.

issue-mt-ns
Testing GRDDL attributes on RDF documents (3 of 3) input output
See the explanation of having three valid outputs for this test case. This output is a graph that is the merge of the graph given by the source document with the graph given by the result of the GRDDL transformation.

issue-mt-ns
Testing GRDDL attributes on RDF documents with XML media type (1 of 3) input output
See the explanation of having three valid outputs for this test case. This differs from that test case in that the RDF file is served (not best practice, but rather common) as media-type "application/xml". The output is the result of the transformation without merging it with the graph of the source document.

issue-mt-ns
Testing GRDDL attributes on RDF documents with XML media type (2 of 3) input output
See the explanation of having three valid outputs for this test case. This differs from that test case in that the RDF file is served (not best practice, but rather common) as media-type "application/xml". The output is a graph that is identical with the graph given by the input document.

issue-mt-ns
Testing GRDDL attributes on RDF documents with XML media type (3 of 3) input output
See the explanation of having three valid outputs for this test case. This differs from that test case in that the RDF file is served (not best practice, but rather common) as media-type "application/xml". The output is a graph that is the merge of the graph given by the source document with the graph given by the result of the GRDDL transformation.

issue-mt-ns
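
To make the difference between the XInclude enabled and XInclude disabled test cases above more concrete, the sketch below parses the same document with and without XInclude resolution and counts the resulting elements. The in-memory documents main.xml and part.xml are hypothetical stand-ins (they are not the test inputs), and the item elements merely stand in for content that a transformation would turn into triples. It uses the ElementInclude module from the Python standard library, which is only one possible XInclude implementation.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory documents; not the actual test files.
DOCS = {
    "main.xml": (
        '<doc xmlns:xi="http://www.w3.org/2001/XInclude">'
        "<item>a</item>"
        '<xi:include href="part.xml"/>'
        "</doc>"
    ),
    "part.xml": "<item>b</item>",
}


def loader(href, parse, encoding=None):
    """Resolve includes from DOCS instead of the file system or network."""
    if parse == "xml":
        return ET.fromstring(DOCS[href])
    return DOCS[href]


def count_items(resolve_xinclude):
    """Count item children of the root, with or without XInclude."""
    root = ET.fromstring(DOCS["main.xml"])
    if resolve_xinclude:
        ElementInclude.include(root, loader=loader)
    return len(root.findall("item"))


print(count_items(resolve_xinclude=False))
print(count_items(resolve_xinclude=True))
```

The two calls see different infosets for the same source document, which is why the two test cases expect different GRDDL results.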

Informative Tests

These tests cover features not mandated explicitly by the GRDDL specification, but demonstrate behavior expected of a GRDDL-aware agent in the context of Web Architecture best practices (@@Appropriate [WEBARCH] link?). They also cover behavior suggested by the Working Group as a result of resolving certain issues.

@@TODO: How to reconcile current (approved but informative) status of the Atom / Turtle test? - There was appreciable consensus on mentioning RDF graphs not RDF/XML documents

#atomttl1: re issue-output-formats: yes, transformations may produce serializations other than RDF/XML; see 26 Nov from Danny and Henry. input, output.
Note that the transformation, atom2turtle_xslt-1.0.xsl, gives an RDF graph using Turtle rather than RDF/XML. This test uses the text/rdf+n3 media type, which should appear in the IANA list of text media types in due course. See also the SPARQL CR request of Apr 2006.

See also: Atom/RDF in progress Aug 2006 by David Powell.

APPROVED in 24 Jan discussion of #issue-output-formats
Content Negotiation with GRDDL (1 of 2) input output
This test exists to draw developers' attention to issues of content negotiation, in particular content negotiation over language as described and implemented by W3C QA. There are two valid GRDDL results of running this GRDDL transformation, depending on what language the GRDDL-aware agent uses, and an implementation of a GRDDL-aware agent only needs to retrieve the one that is appropriate for its HTTP header request. This result follows from retrieving an English version of the HTML representation, which yields a GRDDL result with English-language content.

issue-mt-ns
Content Negotiation with GRDDL (2 of 2) input output
See the explanation of having two valid outputs for this test case. This result follows from retrieving a German version of the HTML representation, which yields a GRDDL result with German-language content.

issue-mt-ns
#httpHeaders: tests the use of HTTP headers to define a transform; by jjc, uploaded by bwm Feb 19, 2007. Note that a .htaccess file is used to generate the required headers.
input, output
@@What is the status of this scenario?
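
For the content negotiation tests above, which of the two results an agent obtains depends on the language preference it sends. As a hedged sketch, assuming the agent fetches the source document with Python's urllib, the request could be built as follows. The URL and the function name are hypothetical, and no request is actually sent here.

```python
import urllib.request


def request_for_language(url, language):
    """Build (but do not send) a request that prefers the given language."""
    return urllib.request.Request(url, headers={"Accept-Language": language})


# Hypothetical source document URL.
english = request_for_language("http://example.org/conneg-doc", "en")
german = request_for_language("http://example.org/conneg-doc", "de")

# urllib stores header names capitalized as "Accept-language".
print(english.get_header("Accept-language"))
print(german.get_header("Accept-language"))
```

An agent sending the first request would be expected to reproduce the English-language result, and one sending the second the German-language result.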

References

@@Need references

Editor's Draft March 13 2007
