
Can anyone help with an XSLT GRDDL conversion of Open Packaging Format (OPF) into RDF/XML Dublin Core

From: Dan Brickley <danbri@danbri.org>
Date: Thu, 28 Jan 2010 17:25:34 +0100
Message-ID: <eb19f3361001280825h6d58de9fxe3d2e8bc5acb6607@mail.gmail.com>
To: public-lod <public-lod@w3.org>, DCMI Architecture Forum <DC-ARCHITECTURE@jiscmail.ac.uk>
Cc: Makx Dekkers <mail@makxdekkers.com>, Andy Powell <andy.powell@eduserv.org.uk>, Ed Summers <ehs@pobox.com>
Hi all

http://www.idpf.org/2007/opf/OPF_2.0_final_spec.html#AppendixA defines
a Dublin Core-based XML metadata format used for ebooks.

This is very nice but a little disconnected from other Dublin Core
data in RDF. It would be great to have some XSLT to explore closer
integration and use of newer Dublin Core idioms (including
http://purl.org/dc/terms/).

Anyone got the time / expertise to explore this?

A related task would be to track down some actual OPF data to convert.
You don't need to be an XSLT guru to do this :)

There's a forum at
http://www.idpf.org/forums/viewforum.php?f=5&sid=4b4d5b89baf1300bd0f258e0715610e5
with some pointers to data. For example,

"""I am pleased to announce that Adobe InDesign CS3 now supports the
direct generation of OCF-packaged OPS content. A sample generated
directly from InDesign CS3 can be found at:
http://www.idpf.org/2007/ops/samples/TwoYearsBeforeTheMast.epub"""

...which is a .zip package containing a file content.opf, the
beginning of which I'll excerpt below.

Thanks for any help exploring this. I found 3 examples in the forum;
the metadata sections of their .opf files are excerpted below. As we
think about RDFizing these, I think there are two aspects: firstly,
getting modern RDF triples from the data as-is. This might take some
care to figure out what role= should be, etc. But also secondly,
thinking how the format could be enriched in future iterations, so
that linked data URIs are used, e.g. for those LCSH headings. At the
moment they have <dc:subject>lcsh: Czech
Americans—Fiction.</dc:subject> but it would be nice if
http://id.loc.gov/authorities/sh2009122741#concept was in there
somewhere (instead of the plain string, or as well?).
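
To make the second aspect a bit more concrete, here is a rough Python
sketch (not the XSLT asked for above) of pulling the scheme prefix off a
dc:subject value. The function name split_subject and the idea that
"lcsh" and "lcc" are the only prefixes are my assumptions, based only on
the sample data; finding the matching id.loc.gov URI would need a
separate lookup that is not shown.

```python
# Prefixes seen in the sample data; treating them as scheme markers is an assumption.
KNOWN_SCHEMES = {"lcsh", "lcc"}


def split_subject(value):
    """Split 'lcsh: Heading' into (scheme, label).

    Returns (None, value) when the text has no known scheme prefix.
    """
    prefix, sep, rest = value.partition(":")
    scheme = prefix.strip().lower()
    if sep and scheme in KNOWN_SCHEMES:
        return scheme, rest.strip()
    return None, value.strip()


if __name__ == "__main__":
    print(split_subject("lcc: PS3505.A87"))
    print(split_subject("Sailors' life"))
```

Keeping the scheme separate from the label would at least let a later
step decide which headings are worth resolving to URIs.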

I'm sure any help working through these practicalities would be
appreciated both by the OPF folk and by Dublin Core...

cheers,

Dan




example 1: http://www.idpf.org/2007/ops/samples/TwoYearsBeforeTheMast.epub

<?xml version="1.1"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Two Years Before the Mast</dc:title>
    <dc:creator>Richard H. Dana Jr.</dc:creator>
    <dc:subject>19th Century</dc:subject>
    <dc:subject>California</dc:subject>
    <dc:subject>Sailors' life</dc:subject>
    <dc:subject>fur trade</dc:subject>
    <dc:description>Two years at sea on the coast of California</dc:description>
    <dc:identifier id="bookid">urn:uuid:4618c86c-f508-11db-8314-0800200c9a66</dc:identifier>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="text/xml"/>
    <item id="introduction" href="Introduction.html" media-type="application/xhtml+xml"/>
    <item id="chapteri" href="ChapterI.html" media-type="application/xhtml+xml"/>
...
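
For a file this simple, getting triples "as-is" could look something
like the Python sketch below. It is only an illustration of the mapping,
not the XSLT/GRDDL transform itself. Assumptions: the function name
opf_to_triples is made up, the value of the element named by
unique-identifier is used as the subject, and each dc:* element is
mapped to the same local name under http://purl.org/dc/terms/. Note that
some of those properties (dcterms:creator, for instance) expect
non-literal values, so a real conversion would need more care.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
OPF = "http://www.idpf.org/2007/opf"

# Cut-down copy of example 1 (XML declaration and manifest omitted).
SAMPLE = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Two Years Before the Mast</dc:title>
    <dc:creator>Richard H. Dana Jr.</dc:creator>
    <dc:identifier id="bookid">urn:uuid:4618c86c-f508-11db-8314-0800200c9a66</dc:identifier>
  </metadata>
</package>"""


def opf_to_triples(opf_xml):
    """Return (subject, predicate, literal) tuples for the dc:* children of <metadata>."""
    root = ET.fromstring(opf_xml)
    metadata = root.find("{%s}metadata" % OPF)
    # Assumption: the element whose id matches unique-identifier names the book.
    uid = root.get("unique-identifier")
    subject = None
    for el in metadata:
        if el.get("id") == uid:
            subject = (el.text or "").strip()
    triples = []
    for el in metadata:
        if el.tag.startswith("{%s}" % DC):
            local = el.tag.split("}", 1)[1]
            text = " ".join((el.text or "").split())  # collapse line wraps
            triples.append((subject, DCTERMS + local, text))
    return triples


if __name__ == "__main__":
    for triple in opf_to_triples(SAMPLE):
        print(triple)
```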



example 2: http://www.idpf.org/2007/ops/samples/hauy.epub

<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>Valentin Haüy - the father of the education for the blind</dc:title>
    <dc:creator>Beatrice Christensen Sköld</dc:creator>
    <dc:publisher>TPB</dc:publisher>
    <dc:date opf:event="publication">2006-03-23</dc:date>
    <dc:date opf:event="creation">2007-08-09</dc:date>
    <dc:identifier id="uid">C00000</dc:identifier>
    <dc:language>en</dc:language>
    <meta name="generator" content="Daisy Pipeline OPS Creator" />
  </metadata>
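
The opf:event attribute on dc:date is one place where the newer
terms could carry more meaning. A possible mapping, sketched in Python
rather than XSLT: the table EVENT_TO_PROPERTY and the function name
date_properties are my own guesses, not something the OPF spec or DCMI
defines, and unknown events simply fall back to dcterms:date.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
OPF = "http://www.idpf.org/2007/opf"

# Hypothetical mapping from opf:event values to Dublin Core date properties.
EVENT_TO_PROPERTY = {
    "publication": DCTERMS + "issued",
    "creation": DCTERMS + "created",
}

# Only the date elements of example 2, wrapped in their <metadata> parent.
SAMPLE = """<metadata xmlns="http://www.idpf.org/2007/opf"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:opf="http://www.idpf.org/2007/opf">
  <dc:date opf:event="publication">2006-03-23</dc:date>
  <dc:date opf:event="creation">2007-08-09</dc:date>
</metadata>"""


def date_properties(metadata_xml):
    """Return (property URI, date string) pairs for each dc:date child."""
    metadata = ET.fromstring(metadata_xml)
    pairs = []
    for el in metadata.findall("{%s}date" % DC):
        event = el.get("{%s}event" % OPF)
        # Events not in the table (or a missing attribute) keep the generic property.
        prop = EVENT_TO_PROPERTY.get(event, DCTERMS + "date")
        pairs.append((prop, (el.text or "").strip()))
    return pairs


if __name__ == "__main__":
    for pair in date_properties(SAMPLE):
        print(pair)
```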


example 3: http://www.idpf.org/2007/ops/samples/myantonia.epub

<package version="2.0"
         unique-identifier="PrimaryID"
         xmlns="http://www.idpf.org/2007/opf">

<metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:opf="http://www.idpf.org/2007/opf">
<dc:title>My Ántonia</dc:title>
<dc:identifier id="PrimaryID" opf:scheme="URN">urn:uuid:14c77a9a-e849-11db-8314-0800200c9a66</dc:identifier>
<dc:language>en-US</dc:language>
<dc:creator opf:role="aut" opf:file-as="Cather, Willa Sibert">Willa Cather</dc:creator>
<dc:creator opf:role="ill" opf:file-as="Benda, Wladyslaw Theodor">W. T. Benda</dc:creator>
<dc:contributor opf:role="edt" opf:file-as="Noring, Jon E.">Jon E. Noring</dc:contributor>
<dc:contributor opf:role="edt" opf:file-as="Menéndez, José">José Menéndez</dc:contributor>
<dc:contributor opf:role="mdc" opf:file-as="Noring, Jon E.">Jon E. Noring</dc:contributor>
<dc:contributor opf:role="trc" opf:file-as="Noring, Jon E.">Jon E. Noring</dc:contributor>
<dc:publisher>DigitalPulp Publishing</dc:publisher>
<dc:description>My Ántonia is considered to be Willa S. Cather’s best
work, first published in 1918. It is a fictional account (inspired by
Cather’s childhood years) of the pioneer prairie settlers in late 19th
century Nebraska. This version, intended for general readers, is a
faithful, highly-proofed, and modestly modernized transcription of the
First Edition, with text corrections by José
Menéndez.</dc:description>
<dc:coverage>Nebraska prairie, late 19th and early 20th Centuries
C.E.</dc:coverage>
<dc:source>First Edition of My Ántonia, published by the Riverside
Press Cambridge, Houghton Mifflin Company, Boston and New York,
October 1918</dc:source>
<dc:date opf:event="original-publication">1918-10</dc:date>
<dc:date opf:event="ops-publication">2007-05-02</dc:date>
<dc:rights>The original text of My Ántonia is public domain. This OPS
2.0 Publication version, including the text corrections, is issued
under a Creative Commons Attribution-ShareAlike 3.0 License (refer to
http://creativecommons.org/licenses/by-sa/3.0/ for license
details.)</dc:rights>
<dc:subject>lcsh: Women immigrants—Fiction.</dc:subject>
<dc:subject>lcsh: Farmers' spouses—Fiction.</dc:subject>
<dc:subject>lcsh: Czech Americans—Fiction.</dc:subject>
<dc:subject>lcsh: Women pioneers—Fiction.</dc:subject>
<dc:subject>lcsh: Married women—Fiction.</dc:subject>
<dc:subject>lcsh: Friendship—Fiction.</dc:subject>
<dc:subject>lcsh: Farm life—Fiction.</dc:subject>
<dc:subject>lcc: PS3505.A87</dc:subject>
</metadata>
...
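
On the role= question: the three-letter codes look like MARC relator
codes, so one option is to turn each opf:role into a relator property
URI. The Python sketch below (again, not XSLT) assumes the base URI
http://id.loc.gov/vocabulary/relators/ and falls back to the plain
dcterms property when no role is given; the function name
agents_with_roles is made up for the example.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
OPF = "http://www.idpf.org/2007/opf"
# Assumed base URI for MARC relator properties.
RELATORS = "http://id.loc.gov/vocabulary/relators/"

# A few elements from example 3, wrapped in their <metadata> parent.
SAMPLE = """<metadata xmlns="http://www.idpf.org/2007/opf"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:opf="http://www.idpf.org/2007/opf">
  <dc:creator opf:role="aut" opf:file-as="Cather, Willa Sibert">Willa
Cather</dc:creator>
  <dc:contributor opf:role="edt" opf:file-as="Noring, Jon E.">Jon E. Noring</dc:contributor>
  <dc:publisher>DigitalPulp Publishing</dc:publisher>
</metadata>"""


def agents_with_roles(metadata_xml):
    """Return (property URI, display name, file-as name) for creators and contributors."""
    metadata = ET.fromstring(metadata_xml)
    wanted = ("{%s}creator" % DC, "{%s}contributor" % DC)
    result = []
    for el in metadata:
        if el.tag not in wanted:
            continue
        name = " ".join((el.text or "").split())  # undo line wrapping in the name
        role = el.get("{%s}role" % OPF)
        file_as = el.get("{%s}file-as" % OPF)
        local = el.tag.split("}", 1)[1]
        prop = RELATORS + role if role else DCTERMS + local
        result.append((prop, name, file_as))
    return result


if __name__ == "__main__":
    for row in agents_with_roles(SAMPLE):
        print(row)
```

The opf:file-as value is kept alongside the display name since it is
the nearest thing to a sortable, authority-style form of the name.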
Received on Thursday, 28 January 2010 16:26:09 UTC
