- From: Dan Brickley <danbri@google.com>
- Date: Thu, 2 Oct 2014 16:44:09 +0100
- To: "public-csv-wg@w3.org" <public-csv-wg@w3.org>, Anastasia Dimou <anastasia.dimou@ugent.be>
short version: R2RML-based mapping experiment here - https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/

longer version:

With a lot of help from Anastasia Dimou (cc:'d) I've been looking again at an R2RML-based approach to RDF mappings and templates. See her mail back in February for an overview of the approach Anastasia and colleagues have been exploring: http://lists.w3.org/Archives/Public/public-csv-wg/2014Feb/0132.html

Essentially they have taken W3C R2RML and generalized it to accommodate non-SQL input sources such as XML, JSON and ... CSV.

"RML is defined as a superset of the W3C-standardized mapping language, R2RML, that maps data in relational databases to the RDF data model"

They have a Java implementation, "RMLProcessor", which can take a mapping file and generate an RDF graph as output. For now I have simply made a standalone version of an example that's bundled with that tool. The file "moon.sh" encapsulates all the configuration you need to run it on a Unix or OS X command line. I've also copied sample output into the repo.

Natural next steps would be to try this tooling with examples from our use cases (http://www.w3.org/TR/csvw-ucr/), many of which are in https://github.com/w3c/csvw/tree/gh-pages/examples/use-case-data-files

Currently in this WG we have been more focussed on defining our direct mapping, and have also looked into super-simple mustache-based text templating with basic variable substitutions. However I believe we also agreed recently that it should be possible to use our basic approach (JSON-LD metadata etc., column names etc.) in a way that describes how to apply more advanced templating mapping techniques. I'd like to use this RML/R2RML experiment to make sure that we can achieve this. If someone goes to the trouble to write RML/R2RML mappings from CSVs to RDF, we really should have a clear way to express that in W3C CSV metadata.
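To make the template idea concrete, here is a rough Python sketch of what a template-driven CSV-to-RDF mapping does. It is not RML or R2RML and says nothing about how RMLProcessor works; it only mimics the general shape of a subject template plus predicate/column pairs. The sample CSV, the MAPPING structure, the example.org URIs and all function names are assumptions made up for illustration, and the data is not the contents of moon-walkers.csv.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical sample data; NOT the contents of moon-walkers.csv.
SAMPLE_CSV = "id,name\n1,Alice Example\n2,Bob Example\n"

# Hypothetical mapping: a subject URI template plus (predicate, column) pairs,
# loosely in the spirit of an R2RML subject map and predicate-object maps.
MAPPING = {
    "subject_template": "http://example.org/person/{id}",
    "predicate_columns": [("http://xmlns.com/foaf/0.1/name", "name")],
}


def expand_template(template, row):
    """Fill {column} placeholders with percent-encoded cell values."""
    encoded = {key: quote(value, safe="") for key, value in row.items()}
    return template.format(**encoded)


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(csv_text, mapping):
    """Return one N-Triples line per (row, predicate/column pair)."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = expand_template(mapping["subject_template"], row)
        for predicate, column in mapping["predicate_columns"]:
            literal = escape_literal(row[column])
            lines.append(f'<{subject}> <{predicate}> "{literal}" .')
    return lines


if __name__ == "__main__":
    for line in rows_to_ntriples(SAMPLE_CSV, MAPPING):
        print(line)
```

A real RML mapping is itself written in RDF (Turtle) and covers much more than this, such as joins between sources, datatypes and language tags; the sketch only shows why column names in the CSV metadata matter to any templating approach.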
Discussing this today with Anastasia we came back to a point that Phil Archer made yesterday on the call, which is that different parties should be able to write different mappings for different consuming applications. To follow through on that, perhaps a potential requirement is that our metadata should allow several templates to be given, each with inline information describing what it does. For example, a CSV of music events could have one mapping file that creates RDF for Google in schema.org (see https://support.google.com/webmasters/answer/4620133?hl=en), and another (with different target graph structures / vocabulary) for consumption by tools like http://simile-widgets.org/wiki/Reference_Documentation_for_Timeline

But for now I'd be happy to have some more complete (R2RML/RML) mapping examples in our repo that work with CSVs from our use cases. If you can get RMLProcessor working, I believe the file moon.sh below should give you everything you need to get started.

cheers,

Dan

Files:

1. https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/museum-model.rml.ttl
The mapping file.

2. https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/moon-walkers.csv
Simple CSV input.

3. https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/_output.nt.txt
The output (in N-Triples).

4. https://github.com/w3c/csvw/blob/master/examples/tests/scenarios/moonwalkersdemo/moon.sh
Fiddly details :) This script is configured with the filetree path to the working folder, and can be run to generate RDF output given the above inputs.

Background links:

See http://rml.io/RML_R2RML.html for details, and also:
http://rml.io/RML_publications.html
http://events.linkeddata.org/ldow2014/papers/ldow2014_paper_01.pdf
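As a postscript on the "several templates, each with inline information" idea from earlier in this mail, here is a very rough Python sketch of what a consumer might do with such metadata. Every key name, file name and description below is hypothetical; nothing here has been agreed by the WG and none of it is defined by any spec. It only illustrates that a tool could pick the mapping that suits its own application.

```python
# Hypothetical metadata for one CSV file. None of these keys or file names
# are defined by the WG; they exist only to illustrate the idea.
METADATA = {
    "csv": "music-events.csv",
    "mappings": [
        {
            "file": "events-schema-org.rml.ttl",
            "language": "RML",
            "description": "schema.org events for search engines",
        },
        {
            "file": "events-timeline.rml.ttl",
            "language": "RML",
            "description": "Event data shaped for a timeline widget",
        },
    ],
}


def find_mappings(metadata, keyword):
    """Return the mappings whose description mentions the keyword."""
    keyword = keyword.lower()
    return [
        mapping
        for mapping in metadata.get("mappings", [])
        if keyword in mapping["description"].lower()
    ]


if __name__ == "__main__":
    for mapping in find_mappings(METADATA, "timeline"):
        print(mapping["file"], "-", mapping["description"])
```

Matching on free-text descriptions is only a stand-in; a real design would more likely identify the target vocabulary or application with a URI.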
Received on Thursday, 2 October 2014 15:44:38 UTC