- From: Ivan Mikhailov <imikhailov@openlinksw.com>
- Date: Wed, 25 Aug 2010 21:12:31 +0700
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Richard,

> I can understand Souri's decision to base his initial work on XML. But
> I believe that XML is not the best choice of syntax for R2RML. I
> instead propose that R2RML mappings should themselves be RDF graphs,
> with Turtle or RDF/XML as the default syntax for writing R2RML files.
>
> Here is why.
> 1. CHARTER REQUIREMENTS
> 2. PREFIX HANDLING
> 3. EXTENSIBILITY
> 4. COMMUNITY EXPECTATIONS
> 5. SUITABILITY OF XML FOR CONFIGURATION

I already use RDF for mapping metadata in Virtuoso, so I vote for RDF, even though my reasons are different (and thus I continue the list).

6. INCREMENTAL COMPOSING

I don't have to fill a whole RDF graph at once, whereas I have to compose an XML document as, well, one document. Yes, I can split it into generic entities, but that is convenient only if these entities form some sequence. I can append something to an XML document by adding an external entity and a reference to it, but I cannot add subsections to existing entities. At the same time, inserting a new subgraph into an existing graph takes no more than a LOAD or a few INSERTs.

7. AUTOMATIC AUDIT, RECOVERY, DIFF+PATCH

With metadata stored as RDF, I can check its integrity with a set of simple SPARQL queries. If something is screwed up, I can cure the problem by simply deleting the suspicious metadata. This is extremely important if independent RDB applications share the same RDF storage and even map their data to "shared" graphs of that storage. I can back up and restore metadata with SPARUL, I can make diffs and apply patches, and I can do garbage collection. As long as all these administrative routines are based on SPARQL, they can be made reusable at least across a sequence of versions of one RDB2RDF product, if not across products of different vendors.

8. CHEAP TESTING

With the mapping described in RDF, a testing tool can be flexible: it can query the metadata about the mappings under test and the actual data produced by those mappings in one SPARQL query, or at least do both sorts of operations in one language, SPARQL. With XML it would require a weird mix of XQuery and SPARQL.

A typical Virtuoso installation with RDB2RDF mapping in use has 10-20 applications with 100-2000 RDB2RDF mapping rules each; each application is upgraded 3-4 times per year, and the upgrades are independent of each other. That means one new mapping rule per average working hour. Then mix the mapped relational data with "native" RDF quads. Add security restrictions. The configuration of this nightmare is not a stable document to be read from beginning to end, because it will have changed before the end is reached. Metadata about a knowledge base is a knowledge base in itself, so RDF is a natural choice.

Best Regards,

Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com
Received on Wednesday, 25 August 2010 15:11:22 UTC