DCMI Metadata Terms: minor fixes needed for XSL scripts and build.xml

Gregg, Jon, all,

On Fri, May 18, 2012 at 01:14:44PM -0400, Gregg Kellogg wrote:
> The following SPARQL will output just subjects in dcterms from the
> RDFa document. (Note that the raw URI won't work, as it returns
> text/plain rather than text/html, so it tries to parse it with
> N-Triples).
>
> PREFIX dcterms: <http://purl.org/dc/terms/>
> CONSTRUCT {?s ?p ?o}
> FROM <http://rdf.greggkellogg.net/dcterms.html>
> WHERE {
>   ?s ?p ?o
>   FILTER (regex(str(?s), str(dcterms:)))
> }
>
> You could define different queries for each subset of the data you'd
> like, and add boilerplate to the CONSTRUCT clause as necessary to add
> any other vocabulary-specific triples.
>
> To get nicer prefix definitions out, you'd need to re-process, the
> tool doesn't currently pass the prefix definitions from the SPARQL
> query to the serializer, but I could probably get that working. Adding
> the prefix definitions in and re-parsing would work, though. Another
> SPARQL service may do this for you.

This sounds doable, but for now it might be easier simply to patch the
four scripts hitherto used to generate the four RDF schemas.

I have restored the following files to their state just before they
were deleted circa May 21:

    - 2012-05-17  22:50    211  2012-05-21/headers/header-rdf-dcam.xml
    - 2012-05-17  22:50    207  2012-05-21/headers/header-rdf-dcelements.xml
    - 2012-05-17  22:50    189  2012-05-21/headers/header-rdf-dcterms.xml
    - 2012-05-17  22:50    182  2012-05-21/headers/header-rdf-dctype.xml
    - 2012-05-17  22:49  10090  web/xsl/common-templates.xsl
    - 2012-05-17  22:49   2613  web/xsl/dcam.xsl
    - 2012-05-17  22:49   1797  web/xsl/dcelements.xsl
    - 2012-05-17  22:49   3608  web/xsl/dcterms.xsl
    - 2012-05-17  22:49   1414  web/xsl/dctype.xsl

The repository [3] is up-to-date with these recent changes.
Then I compared:

-- N-Triples generated from the RDFa/HTML document [1]
-- N-Triples generated from the RDF/XML produced by the scripts above (e.g., [2])

and found only two things that would need to be tweaked to make the XSL scripts
produce identical triples:

-- Datatypes for dates
   s/"2008-01-14"/"2008-01-14"^^<http://www.w3.org/2001/XMLSchema#date>/

-- Language tags for text (e.g., description, label, skos:note, title...)
   s/@en-us/@en/

I can then edit header-rdf-* and the scripts should generate four RDF schemas
with triples identical to triples distilled from [1].
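
To make the comparison concrete, here is a rough sketch of the check in
Python. It is not part of the build: the function names and the sample
triples in the usage note are made up for illustration, and it assumes the
N-Triples put one triple per line with the literal as the last term. It
applies the two substitutions above to the XSL-derived triples and then
reports whatever still differs between the two sets.

```python
import re

XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

# A plain (untyped, untagged) ISO date literal as the object of a triple,
# e.g.  ... "2008-01-14" .
PLAIN_DATE = re.compile(r'"(\d{4}-\d{2}-\d{2})"(?=\s*\.\s*$)')

# An en-us language tag (any capitalization) on the object literal.
EN_US_TAG = re.compile(r'@en-us(?=\s*\.\s*$)', re.IGNORECASE)


def normalize(line):
    """Apply the two substitutions to a single N-Triples line."""
    line = PLAIN_DATE.sub(rf'"\1"^^<{XSD_DATE}>', line)
    return EN_US_TAG.sub("@en", line)


def diff_triples(from_rdfa, from_xsl):
    """Return (only_in_rdfa, only_in_xsl) after normalizing the XSL side.

    Both arguments are iterables of N-Triples lines; blank lines are
    skipped. Blank nodes are not handled (assumed absent here).
    """
    rdfa = {l.strip() for l in from_rdfa if l.strip()}
    xsl = {normalize(l.strip()) for l in from_xsl if l.strip()}
    return rdfa - xsl, xsl - rdfa
```

With illustrative input, a line ending in `"2008-01-14" .` comes out
ending in `"2008-01-14"^^<http://www.w3.org/2001/XMLSchema#date> .`, and
a line ending in `"Title"@en-us .` comes out ending in `"Title"@en .`.
If both sets returned by diff_triples() are empty, the scripts and the
RDFa document agree.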

Gregg, are there perhaps changes to build.xml (to remove generation of the
RDF schemas) that would need to be patched back in?

How easy would it be to make these fixes?

Tom

[1] https://raw.github.com/dublincore/website/master/build/html/dcmi-terms/index.shtml
[2] http://dublincore.org/2010/10/11/dcterms.rdf
[3] https://github.com/dublincore/website/

--
Tom Baker <tom@tombaker.org>

Received on Friday, 18 May 2012 18:28:31 UTC