How to Validate Test Sample Metadata (TCDL 2.0) with ISO Schematron

Editor: Christophe Strobbe (DocArch, Katholieke Universiteit Leuven).

Introduction

The Test Samples Development Task Force (TSDTF) uses a subset of TCDL 2.0 as its metadata format. In addition to the constraints defined by the TCDL 2.0 Specification and Guide and the W3C Schema for TCDL 2.0, the task force applies a number of additional constraints, which are set out in WCAG 2.0 Test Samples Metadata and in Checklist for Structure Review. The ISO Schematron for the Test Sample Metadata is intended to check most of the constraints that are specific to the task force. The following sections describe how to install the necessary files and how to run the validation process with the ISO Schematron files.

Validating with Saxon and the XSLT Implementation of ISO Schematron

Getting and Installing Java

This how-to guide assumes that you use Java to run Saxon. There is also a Saxon version for the .NET platform, but that is out of scope for this guide.

The Prerequisites for Saxon state that you need at least a Java VM (also known as JRE) to run Saxon. You can also use a Java Development Kit (JDK). You need at least version 1.4.

If you don't know whether a JRE or JDK is available on your computer, open a command-line environment and type java -version. The output tells you whether a Java Runtime Environment is installed and, if so, which version. For example, for Java 6, you may get:
java version "1.6.0_01"
Java(TM) SE Runtime Environment (build 1.6.0_01-b06)
Java HotSpot(TM) Client VM (build 1.6.0_01-b06, mixed mode, sharing)
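
If you want to automate this check, the version string can be parsed from the output. The following Python sketch is only an illustration: the function names parse_java_version and is_new_enough are made up for this example, and the code assumes that the first version string in the output has the form shown above.

```python
import re

def parse_java_version(output):
    """Return (major, minor) parsed from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2) or 0))

def is_new_enough(output, minimum=(1, 4)):
    """True if the reported version is at least `minimum` (Saxon needs 1.4)."""
    version = parse_java_version(output)
    return version is not None and version >= minimum

# Sample output copied from the example above.
sample = '''java version "1.6.0_01"
Java(TM) SE Runtime Environment (build 1.6.0_01-b06)
Java HotSpot(TM) Client VM (build 1.6.0_01-b06, mixed mode, sharing)'''

print(parse_java_version(sample))  # (1, 6)
print(is_new_enough(sample))       # True
```

Note that java -version writes its report to the standard error stream, so a script that calls Java itself has to capture that stream rather than standard output.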

To get Java, go to Java SE downloads and download the JRE or JDK that you prefer (you don't need NetBeans to run ISO Schematron). On the same page, you will also find links to installation instructions. Basically, you run the installer and (on Windows) update the PATH variable. The installation instructions explain how to do this. When you're done, you should be able to check your Java version at the command line as explained above.

Getting and Installing JAXP

If you use Java 1.4, you also need to install the Java API for XML Processing (JAXP). If you use Java 1.5 or 1.6 (also known as Java 5 and Java 6, respectively), you can skip this step.

You can download the JAXP reference implementation from the Project GlassFish website. After downloading the class file, open a command-line window, navigate to the folder where you saved the class file, and run java -cp . JAXP_RI_20060217 (the last part is the name of the class file without its extension). This should unpack JAXP. After that, put the files jaxp-api.jar and dom.jar on the classpath.

To check that the jar files are on the classpath, open a command-line window (on Windows), enter echo %classpath% and check that both jar files appear in the list that is displayed.
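
Reading a long classpath by eye is error-prone, so the same check can be scripted. The Python sketch below is an illustration only: the function name missing_jars and the example paths are hypothetical, and the default separator is the semicolon used on Windows.

```python
import ntpath

def missing_jars(classpath, required, separator=";"):
    """Return the required jar names that no classpath entry ends with."""
    present = {
        ntpath.basename(entry.strip()).lower()
        for entry in classpath.split(separator)
        if entry.strip()
    }
    return [jar for jar in required if jar.lower() not in present]

# Hypothetical classpath; the folder names are assumptions for this example.
example_classpath = r".;C:\jaxp\jaxp-api.jar;C:\saxon\saxon8.jar"

print(missing_jars(example_classpath, ["jaxp-api.jar", "dom.jar", "saxon8.jar"]))
# prints ['dom.jar']
```

To check the real setting instead of the example string, pass os.environ.get("CLASSPATH", "") as the first argument.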

Getting and Installing Saxon

To use the XSLT implementation of ISO Schematron, you need an XSLT processor. Saxon is an open-source XSLT and XQuery processor; at the time of writing, 8.9 is the most recent version. The free version is available at http://saxon.sourceforge.net/#F8.9B. (There is also a commercial version with support for XML Schema at http://saxon.sourceforge.net/#F8.9SA.)

To install Saxon, you download a zip file from the address above and unpack it into a suitable folder. After that, you need to put saxon8.jar on the classpath. (More details are available on the Saxonica website.)

Getting and “Installing” the XSLT Implementation of ISO Schematron

The XSLT implementation of ISO Schematron consists of a skeleton file and a choice of customization files. These files can all be downloaded from ISO Schematron Validators built on the “Skeleton”. The skeleton file is still in beta. There are currently two versions: one from 8 February 2007 (direct link: iso_schematron_skeleton.xsl) and another from 19 July 2007 (direct link: new_schematron_skeleton.xsl).

In addition to the skeleton file, you also need an XSLT stylesheet that customizes the skeleton. In this guide, we'll use the stylesheet that outputs Schematron Validation Report Language (SVRL), iso_svrl.xsl.

Put iso_schematron_skeleton.xsl (or new_schematron_skeleton.xsl) and iso_svrl.xsl in the same folder.
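
The two files need to be in the same folder because a customization stylesheet normally pulls in the skeleton through xsl:import with a relative address. The Python sketch below is one way to check that every file a stylesheet imports or includes can actually be found. It is an illustration only: the function name unresolved_imports is made up, and the code assumes that the href values are relative file paths rather than URLs.

```python
import os
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

def unresolved_imports(stylesheet_path):
    """Return hrefs of top-level xsl:import / xsl:include elements whose
    target file does not exist relative to the stylesheet's folder."""
    folder = os.path.dirname(os.path.abspath(stylesheet_path))
    root = ET.parse(stylesheet_path).getroot()
    missing = []
    for child in root:
        if child.tag in (XSL + "import", XSL + "include"):
            href = child.get("href", "")
            # Assumption: href is a relative file path, not a URL.
            if not os.path.exists(os.path.join(folder, href)):
                missing.append(href)
    return missing
```

Calling unresolved_imports("iso_svrl.xsl") in the folder where you saved the files should return an empty list if everything is in place. If you chose new_schematron_skeleton.xsl, a non-empty result tells you which file name the customization stylesheet expects.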

Getting and Using the ISO Schematron for the Test Sample Metadata

The ISO Schematron for the Test Sample Metadata is currently only available in the mailing list archive (28 August 2007). Download and install this file into an appropriate directory.

Using this ISO Schematron file is a two-step process: first, transform the ISO Schematron file into an XSLT file; then, process the TCDL file with the generated XSLT file. Both steps are done at the command line.

To transform the ISO Schematron file into an XSLT file, type
java net.sf.saxon.Transform -o tcdl2.0.tsdtf.20070828.sch.tmp.xsl -s tcdl2.0.tsdtf.20070828.sch iso_svrl.xsl
This creates the file tcdl2.0.tsdtf.20070828.sch.tmp.xsl in the current folder.

To validate a TCDL file (here scx.x.x_lx_xxx.xml) with the generated XSLT file, type
java net.sf.saxon.Transform -o scx.x.x_lx_xxx.xml.report.xml -s scx.x.x_lx_xxx.xml tcdl2.0.tsdtf.20070828.sch.tmp.xsl
This creates an SVRL file called scx.x.x_lx_xxx.xml.report.xml.
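
The SVRL report is an XML file in which each violated constraint appears as an svrl:failed-assert element. The following Python sketch shows one way to list those elements. It is an illustration only: the function name failed_assertions is made up, and the sample report, including its message text, is invented for this example and does not come from the TSDTF Schematron file.

```python
import xml.etree.ElementTree as ET

SVRL = "http://purl.oclc.org/dsdl/svrl"

def failed_assertions(root):
    """Return (location, test, message) for every svrl:failed-assert."""
    results = []
    for failed in root.iter("{%s}failed-assert" % SVRL):
        text = failed.find("{%s}text" % SVRL)
        # Collapse line breaks and indentation inside the message.
        message = " ".join("".join(text.itertext()).split()) if text is not None else ""
        results.append((failed.get("location"), failed.get("test"), message))
    return results

# Hypothetical report, made up for this example.
sample_report = """<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl">
  <svrl:fired-rule context="/*"/>
  <svrl:failed-assert test="@id" location="/*[1]">
    <svrl:text>The root element must have an id attribute.</svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>"""

for location, test, message in failed_assertions(ET.fromstring(sample_report)):
    print(location, test, message)
```

For a real report, replace ET.fromstring(sample_report) with ET.parse("scx.x.x_lx_xxx.xml.report.xml").getroot(). An empty result means that no assertion failed.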

To save some typing, the two steps can be combined in a single batch file:

@echo off
cls
echo Usage: schematron %%1 = ISO Schematron file, with its extension.
echo %%2 is the input XML file, with its extension.
echo E.g. schematron input.sch input.xml will produce input.xml.report.xml as output

rem Remove a stylesheet left over from a previous run, if there is one.
if exist %1.tmp.xsl del %1.tmp.xsl
echo Generate the stylesheet from %1

rem Step 1: transform the ISO Schematron file into an XSLT file.
java net.sf.saxon.Transform -o %1.tmp.xsl -s %1 iso_svrl.xsl

echo Now run the input file %2 against the generated stylesheet %1.tmp.xsl to produce %2.report.xml

rem Step 2: validate the input file; the result is an SVRL report.
java net.sf.saxon.Transform -o %2.report.xml -s %2 %1.tmp.xsl
If you save the above code as schematron.bat, you can run it by typing the following instruction on the command line:
C:\schematron tcdl2.0.tsdtf.20070828.sch scx.x.x_lx_xxx.xml
Note: you need to adapt the file and the command-line instruction to make them reflect the location of your TCDL, XSLT and ISO Schematron files.
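
The batch file only works on Windows. On other platforms, the same two steps could be scripted in another language. The Python sketch below is an illustration only: the function names build_commands and run_validation are made up, the Saxon options are copied unchanged from the commands in this guide, and the code assumes that java is on the PATH and saxon8.jar is on the classpath.

```python
import subprocess

def build_commands(schematron_file, xml_file, svrl_xslt="iso_svrl.xsl"):
    """Return the two Saxon command lines used in this guide, as argument lists."""
    generated = schematron_file + ".tmp.xsl"
    report = xml_file + ".report.xml"
    saxon = ["java", "net.sf.saxon.Transform"]
    return [
        # Step 1: transform the ISO Schematron file into an XSLT file.
        saxon + ["-o", generated, "-s", schematron_file, svrl_xslt],
        # Step 2: run the TCDL file through the generated XSLT file.
        saxon + ["-o", report, "-s", xml_file, generated],
    ]

def run_validation(schematron_file, xml_file):
    """Run both steps; stop with CalledProcessError if a step fails."""
    for command in build_commands(schematron_file, xml_file):
        subprocess.run(command, check=True)

# Show the commands without running them.
for command in build_commands("tcdl2.0.tsdtf.20070828.sch", "scx.x.x_lx_xxx.xml"):
    print(" ".join(command))
```

Calling run_validation("tcdl2.0.tsdtf.20070828.sch", "scx.x.x_lx_xxx.xml") executes both steps in the current folder. As with the batch file, the file names need to reflect the location of your own files.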