- From: Souripriya Das <SOURIPRIYA.DAS@oracle.com>
- Date: Thu, 8 Dec 2011 12:32:58 -0800 (PST)
- To: <public-rdb2rdf-wg@w3.org>
Let us take a concrete example to use for comparing the two proposals, the R2RML-native proposal and the SKOS-based proposal:

A database table SCOTT.USA_CITIES (CITY, STATE, LATITUDE, LONGITUDE) has about 30000 rows. Each row has a unique <CITY, STATE> pair and the LATITUDE and LONGITUDE info (for the city located in the specified state).

We want to map ONLY the following <CITY, STATE> pairs to actual URLs used by their respective city governments:
- New York, NY => http://www.nyc.gov
- Boston, MA => http://www.cityofboston.gov
- Atlanta, GA => http://www.atlantaga.gov
- Miami, FL => http://www.miamigov.com
- Dallas, TX => http://www.dallascityhall.com
- Los Angeles, CA => http://www.lacity.org
- San Francisco, CA => http://www.sfgov.org
- Seattle, WA => http://www.seattle.gov
- Chicago, IL => http://www.cityofchicago.org

So, using the native proposal (where we will allow a partial map, as used below, which is probably a critical requirement in practice), we can express the R2RML map including the translation table as follows:

x:CityStateTriplesMap
    rr:logicalTable [ rr:tableName "\"SCOTT\".\"USA_CITIES\"" ] ;
    rr:subjectMap [
        rr:template "http://www.city.{CITY}.{STATE}.us" ;
        rr:translationScheme x:myTranslationScheme
    ] ;
    rr:propertyObjectMap [ rr:predicate cs:city ; rr:objectMap [ rr:column "CITY" ] ] ;
    rr:propertyObjectMap [ rr:predicate cs:state ; rr:objectMap [ rr:column "STATE" ] ] ;
    rr:propertyObjectMap [ rr:predicate cs:latitude ; rr:objectMap [ rr:column "LATITUDE" ] ] ;
    rr:propertyObjectMap [ rr:predicate cs:longitude ; rr:objectMap [ rr:column "LONGITUDE" ] ] .
x:myTranslationScheme rr:translationMap
    [ rr:toTerm <http://www.nyc.gov> ; rr:fromTerm <http://www.city.New%20York.NY.us> ] ,
    [ rr:toTerm <http://www.cityofboston.gov> ; rr:fromTerm <http://www.city.Boston.MA.us> ] ,
    [ rr:toTerm <http://www.atlantaga.gov> ; rr:fromTerm <http://www.city.Atlanta.GA.us> ] ,
    [ rr:toTerm <http://www.miamigov.com> ; rr:fromTerm <http://www.city.Miami.FL.us> ] ,
    [ rr:toTerm <http://www.dallascityhall.com> ; rr:fromTerm <http://www.city.Dallas.TX.us> ] ,
    [ rr:toTerm <http://www.lacity.org> ; rr:fromTerm <http://www.city.Los%20Angeles.CA.us> ] ,
    [ rr:toTerm <http://www.sfgov.org> ; rr:fromTerm <http://www.city.San%20Francisco.CA.us> ] ,
    [ rr:toTerm <http://www.seattle.gov> ; rr:fromTerm <http://www.city.Seattle.WA.us> ] ,
    [ rr:toTerm <http://www.cityofchicago.org> ; rr:fromTerm <http://www.city.Chicago.IL.us> ] .

To allow a proper comparison, please express this partial mapping using the SKOS-based approach.

Thanks,
- Souri.

----- Original Message -----
From: richard@cyganiak.de
To: souripriya.das@oracle.com
Cc: public-rdb2rdf-wg@w3.org
Sent: Tuesday, December 6, 2011 12:09:55 PM GMT -05:00 US/Canada Eastern
Subject: Re: Revised SKOS-based translation table proposal

Hi Souri,

On 27 Nov 2011, at 17:10, Souripriya Das wrote:
> Let us then go back to the second part of the original SKOS-based proposal and compare it with the alternate "R2RML-native" proposal that we had proposed.
>
> 1) The SKOS-based approach is limited to (many-to-one) Literal-DBterm to IRI-RDFterm translation. This limitation comes from use of the following two properties: (an owl:DatatypeProperty) skos:notation ("used to assign a notation as a typed literal") [7] and (an owl:ObjectProperty) skos:*Match [8]. The R2RML-native scheme, on the other hand, has no such limitation and allows (many-to-one) IRI-or-Literal-DBterm to IRI-or-Literal-RDFterm translation.

It is correct that the SKOS-based proposal only supports literal-to-IRI mappings. But this is not a limitation.
Mapping *from* an IRI to something else is nonsensical in the context of R2RML. The values we map from are always SQL data values and hence literals and never IRIs.

Mapping *to* a literal is potentially useful, and I can imagine mapping scenarios that call for it. However, the reason why the WG decided to include translation tables in the first place was the following item in our charter:

[[
The mapping language MUST allow for a mechanism to create identifiers for database entities.
]]
http://www.w3.org/2009/08/rdb2rdf-charter

Addressing this charter item requires only mapping to IRIs.

As I said before, the SKOS-based scheme can be extended to allow mapping to literals if that's what we want, albeit at a cost of one more triple per mapping pair. Mapping "Lo Mein" to <Chinese> would look like this:

    [] skos:inScheme <scheme1>;
       skos:notation "Lo Mein";
       skos:broadMatch <Chinese>.

Mapping "Lo Mein" to "Chinese" would be:

    [] skos:inScheme <scheme1>;
       skos:notation "Lo Mein";
       skos:broadMatch [ skos:notation "Chinese" ].

I think that this is a bit of a corner case and I don't think it needs to be supported in R2RML 1.0. I just want to show that the existing design of SKOS already accommodates this, so future R2RML versions could require support for fancier mappings.

> 2) Also, use of R2RML-native scheme produces less verbose Turtle documents than does use of the SKOS-based scheme (because, unlike the R2RML scheme, the SKOS-based scheme actually requires a translation scheme IRI to be explicitly specified for every translation in that scheme that goes to a different IRI-RDFterm).

Actually it is not more verbose. The SKOS approach and the custom-vocabulary approach require the *same* number of triples. The example you presented has unnecessary extra rdf:type triples that are not required by the spec.
Let's remove them and compare:

------ SKOS-based approach ------

[] skos:inScheme <InternationalCuisineTranslationScheme>;
   skos:notation "Lo Mein", "Fu Chi Fei Pian";
   skos:broadMatch <Chinese>.

[] skos:inScheme <ChineseCuisineTranslationScheme>;
   skos:notation "Lo Mein";
   skos:broadMatch <Chinese>.

[] skos:inScheme <ChineseCuisineTranslationScheme>;
   skos:notation "Fu Chi Fei Pian";
   skos:broadMatch <Sichuan>.

------ Custom vocabulary approach ------

<InternationalCuisineTranslationScheme> rr:translationMap [
    rr:toTerm <Chinese> ;
    rr:fromTerm "Lo Mein", "Fu Chi Fei Pian" ;
] .

<ChineseCuisineTranslationScheme> rr:translationMap [
    rr:toTerm <Chinese> ;
    rr:fromTerm "Lo Mein" ;
] , [
    rr:toTerm <Sichuan> ;
    rr:fromTerm "Fu Chi Fei Pian" ;
] .

---------------------------------

Ten triples in both cases. We see that the custom-vocabulary approach can be written in fewer *characters* because the direction of the rr:translationMap property is opposite to skos:inScheme, and hence the comma-based Turtle syntactic sugar can be used.

I would like to point out here that (i) you have argued in the past that syntax doesn't matter and R2RML is all about the model. There's no difference in the model between the two approaches. (ii) you have in the past expressed strong objections against designs that were supposed to make typical R2RML expressions more compact, so I suppose you agree that other concerns can sometimes override the desire for less verbose R2RML expressions.

As I see it, we have the choice between (a) adopting an established and popular international standard designed by the W3C, or (b) saving some keystrokes by exploiting a syntactic idiosyncrasy of Turtle, and in the process re-inventing the wheel.

As I said earlier:

>> SKOS is a W3C Recommendation. It is the third-most used vocabulary on the linked data web.
>> It's used by the Library of Congress, the UK government, the European Commission's Publications Office, the United Nations' Food and Agriculture Organization, and the New York Times.

I learned this week that we can add the Hungarian and German National Libraries and the British Museum to that list.

Finally, it's worth pointing out OpenLink's position again:
http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Sep/0018.html

Are you sure that the advantages of the custom-vocabulary approach (fewer bytes, easier support for use cases that require mapping to literals) outweigh the advantage of using a standard?

Thanks,
Richard
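
For readers following the thread, here is a minimal Python sketch of how a partial translation table like the one in the city example might behave at mapping time. It is only an illustration under stated assumptions, not part of either proposal: the names TRANSLATION_SCHEME and subject_iri are hypothetical, only three of the nine pairs are shown, and the fallback rule (a row with no entry keeps its template-generated IRI) is an assumed reading of "partial map".

```python
from urllib.parse import quote

# Hypothetical in-memory form of x:myTranslationScheme: fromTerm -> toTerm.
# Only three of the nine pairs from the example are reproduced here.
TRANSLATION_SCHEME = {
    "http://www.city.New%20York.NY.us": "http://www.nyc.gov",
    "http://www.city.Boston.MA.us": "http://www.cityofboston.gov",
    "http://www.city.Seattle.WA.us": "http://www.seattle.gov",
}


def subject_iri(city: str, state: str) -> str:
    """Expand the rr:template for one row, then apply the partial table."""
    # Percent-encode the column values before substituting them into the
    # template "http://www.city.{CITY}.{STATE}.us".
    generated = f"http://www.city.{quote(city, safe='')}.{quote(state, safe='')}.us"
    # Assumed partial-map semantics: rows without an entry keep the
    # generated IRI unchanged.
    return TRANSLATION_SCHEME.get(generated, generated)
```

Under these assumptions, subject_iri("New York", "NY") yields the city government URL, while a row such as ("Austin", "TX") that has no entry keeps its generated IRI.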
Received on Thursday, 8 December 2011 20:33:44 UTC