- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Wed, 25 Aug 2010 12:42:13 +0100
- To: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Harry correctly urges us to press forward with turning the SQL-Based Approach into a FPWD. There is one major obstacle, though, that needs to be tackled before work on a FPWD can be started: the question of what syntax the language should use.

Ashok has stated that we should talk about syntax “later”, but this discussion has to happen before serious work on an official draft starts. Once a draft is out, the public will assume that the syntax used in the draft is the official and canonical syntax for the language, and it is key to send the right signal there. Which means the discussion has to happen now.

I can understand Souri's decision to base his initial work on XML. But I believe that XML is not the best choice of syntax for R2RML. I instead propose that R2RML mappings should themselves be RDF graphs, with Turtle or RDF/XML as the default syntax for writing R2RML files. Here is why.

1. CHARTER REQUIREMENTS

The RDB2RDF charter states: “The mapping language SHOULD have a human-readable syntax as well as XML and RDF representations of the syntax for purposes of discovery and machine generation.” [1]

Using RDF kills three birds with one stone: It ticks the RDF box, it ticks the “human-readable syntax” box (Turtle), and it ticks the XML box (RDF/XML).

2. PREFIX HANDLING

The language needs to refer to RDF vocabulary terms, which are identified by URIs, and are conventionally represented as QNames or CURIEs (that is, http://xmlns.com/foaf/0.1/Person is represented as foaf:Person). This means that the language needs features for establishing prefix mappings (associate "foaf" with "http://xmlns.com/foaf/0.1/"). This is a source of pain in XML-based languages (cf. ongoing tensions over RDFa vs HTML5, and RIF's XML syntax). Using a language that has a built-in mechanism for establishing prefix mappings and expanding QNames/CURIEs would avoid this problem.

3. EXTENSIBILITY

RDF gives us various ways of annotating mappings (e.g., providing additional documentation, versioning-related annotations, cross-links to other software artifacts etc) for free. For example, I could attach rdfs:comment, dc:modified, dc:creator and similar properties to any part of a mapping. It also provides a clear syntactical framework for vendor-specific extensions.

4. COMMUNITY EXPECTATIONS

R2RML is a language for mapping databases to RDF. Hence, it bridges a world that speaks SQL to a world that speaks the RDF technology stack (RDF, SPARQL, RIF etc). Hence, arguments can be made for basing R2RML syntax on RDF (like in D2RQ or SquirrelRDF), or on SQL (like Virtuoso RDF Views), or on SPARQL (like Eric's approach), or on RIF. Basing R2RML on XML drags an unrelated third technology stack into the mix.

5. SUITABILITY OF XML FOR CONFIGURATION

XML is a good syntax for text markup (cf. XHTML, DocBook, TEI). It works OK for transmitting structured data (cf. SOAP, Atom), albeit facing increasing competition from JSON. But I think it is now evident that using XML for configuration files that are edited and read directly by users is not a good idea. The most obvious drawback in this context is that you have to type everything <twice>...</twice>!

To show what Souri's approach could look like if rendered in RDF (specifically Turtle), I took his example [2] and re-wrote it in Turtle [3] syntax. You can find it here:

http://www.w3.org/2001/sw/rdb2rdf/wiki/R2RML_in_Turtle

A raw version of just the file [4] and an auto-generated graph view [5] are also available.

I propose to proceed based on the concepts of Souri's approach, but with an RDF serialization instead of XML as the surface syntax.

Opinions?
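
As a rough illustration of point 2 (not part of any proposal): the prefix machinery that an XML-based syntax would have to specify for itself amounts to roughly the following. This is only a minimal Python sketch; the names PREFIXES and expand are made up for illustration, and a Turtle or RDF/XML parser already provides the equivalent through its own prefix declarations.

```python
# Hypothetical prefix table for illustration only. In Turtle, the same
# information would come from @prefix declarations handled by the parser.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as 'foaf:Person' into a full URI."""
    prefix, sep, local = curie.partition(":")
    # Reject names with no colon or with a prefix that was never declared.
    if not sep or prefix not in prefixes:
        raise KeyError(f"no prefix mapping for {curie!r}")
    return prefixes[prefix] + local


print(expand("foaf:Person"))  # http://xmlns.com/foaf/0.1/Person
```

Even this toy version has to decide what happens with undeclared prefixes; scoping, default namespaces and escaping would all need to be spelled out as well if the mapping language defined its own mechanism.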

Best,
Richard

[1] http://www.w3.org/2009/08/rdb2rdf-charter.html
[2] http://www.w3.org/2001/sw/rdb2rdf/wiki/Example_of_SQL-based_RDB2RDF_Mapping:_Revision_1
[3] http://www.w3.org/TeamSubmission/turtle/
[4] http://github.com/cygri/r2rml/raw/master/examples/emp-dept.ttl
[5] http://bit.ly/asIik4

Received on Wednesday, 25 August 2010 11:42:49 UTC