R2RML-based RDF mapping experiment in our GitHub repo

short version:

R2RML-based mapping experiment here -
https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/


longer version:

With a lot of help from Anastasia Dimou (cc:'d) I've been looking
again at an R2RML-based approach to RDF mappings and templates. See
her mail back in February for an overview of the approach Anastasia
and colleagues have been exploring:
http://lists.w3.org/Archives/Public/public-csv-wg/2014Feb/0132.html

Essentially they have taken W3C R2RML and generalized it to
accommodate non-SQL input sources such as XML, JSON and ... CSV.

"RML is defined as a superset of the W3C-standardized mapping
language, R2RML, that maps data in relational databases to the RDF
data model"

They have a Java implementation, "RMLProcessor", which can take a
mapping file and generate an RDF graph as output. For now I have
simply made a standalone version of an example that's bundled with
that tool. The file "moon.sh" encapsulates all the configuration you
need to run it on a Unix or OS X command line. I've also copied sample
output into the repo.
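
To make the "mapping file plus CSV in, RDF graph out" idea concrete,
here is a minimal Python sketch of template-based row mapping. It is
not RMLProcessor and does not read RML; the MAPPING structure, the
example.org URIs, and the sample columns ("name", "mission") are
assumptions made up for illustration, and the real moon-walkers.csv
may be laid out differently.

```python
import csv
import io
import re
from urllib.parse import quote

# Illustrative sample input only; not the contents of moon-walkers.csv.
SAMPLE_CSV = "name,mission\nNeil Armstrong,Apollo 11\nBuzz Aldrin,Apollo 11\n"

# Hypothetical mapping: one subject URI template per row, plus
# (predicate URI, column name) pairs that become literal-valued triples.
MAPPING = {
    "subject_template": "http://example.org/person/{name}",
    "predicate_columns": [
        ("http://xmlns.com/foaf/0.1/name", "name"),
        ("http://example.org/vocab/mission", "mission"),
    ],
}


def expand_template(template, row):
    """Replace each {column} placeholder with the percent-encoded cell value."""
    return re.sub(
        r"\{([^}]+)\}",
        lambda m: quote(row[m.group(1)], safe=""),
        template,
    )


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def map_rows(csv_text, mapping):
    """Return one N-Triples line per (row, predicate) pair."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = expand_template(mapping["subject_template"], row)
        for predicate, column in mapping["predicate_columns"]:
            literal = escape_literal(row[column])
            triples.append('<%s> <%s> "%s" .' % (subject, predicate, literal))
    return triples


if __name__ == "__main__":
    for line in map_rows(SAMPLE_CSV, MAPPING):
        print(line)
```

An RML mapping expresses roughly the same ingredients (a logical
source, a subject template, predicate/object pairs) declaratively in
Turtle rather than in a Python dict.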

Natural next steps would be to try this tooling with examples from our
use cases (http://www.w3.org/TR/csvw-ucr/), many of which are in
https://github.com/w3c/csvw/tree/gh-pages/examples/use-case-data-files

Currently in this WG we have been more focussed on defining our direct
mapping, and have also looked into super-simple mustache-based text
templating with basic variable substitutions. However, I believe we
also agreed recently that it should be possible to use our basic
approach (JSON-LD metadata, column names etc.) in a way that
describes how to apply more advanced templating and mapping techniques.
I'd like to use this RML/R2RML experiment to make sure that we can
achieve this. If someone goes to the trouble to write RML/R2RML
mappings from CSVs to RDF, we really should have a clear way to
express that in W3C CSV metadata.

Discussing this today with Anastasia we came back to a point that Phil
Archer made yesterday on the call - which is that different parties
should be able to write different mappings for different consuming
applications. To follow through on that, perhaps a potential
requirement is that our metadata should allow several templates to be
given, each with inline information describing what they do.

For example, a CSV of music events could have one mapping file that
creates RDF for Google in schema.org (see
https://support.google.com/webmasters/answer/4620133?hl=en), and
another (with different target graph structures / vocabulary) for
consumption by tools like
http://simile-widgets.org/wiki/Reference_Documentation_for_Timeline

But for now I'd be happy to have some more complete (R2RML/RML)
mapping examples in our repo that work with CSVs from our use cases.
If you can get RMLProcessor working, I believe the file moon.sh below
should give you everything you need to get started.

cheers,

Dan



Files:

1. https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/museum-model.rml.ttl
The mapping file.

2. https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/moon-walkers.csv
Simple CSV input.

3. https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/_output.nt.txt
The output (in N-Triples).

4. https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/moon.sh
Fiddly details :) This script is configured with the filetree path to
the working folder, and can be run to generate RDF output given the
above inputs.


Background links:

See http://rml.io/RML_R2RML.html for details, and also:
http://rml.io/RML_publications.html
http://events.linkeddata.org/ldow2014/papers/ldow2014_paper_01.pdf

Received on Thursday, 2 October 2014 15:44:38 UTC