Re: Revised SKOS-based translation table proposal

From: Souripriya Das <SOURIPRIYA.DAS@oracle.com>
Date: Sun, 27 Nov 2011 09:10:35 -0800 (PST)
Message-ID: <5657c044-955d-44dc-acbe-90c6f55fbb08@default>
To: <richard@cyganiak.de>
Cc: <public-rdb2rdf-wg@w3.org>

Richard,

The simplified proposal you presented clearly has a "phantom triples" problem, so it needs no further consideration. The first part of the original proposal (illustrated by its first example, which does not use the skos:*Match predicates) was essentially the same as this simplified proposal, so we will not consider it either.
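
For anyone who wants to reproduce the "phantom triples" problem, here is a minimal Python sketch. It is an illustration only and not part of either proposal: triples are modelled as plain tuples, the data is the cuisine example from the quoted message below, and the helper name readable_translations is made up for this sketch.

```python
from itertools import product

# Simplified SKOS-style statements, one per (RDF term, scheme) pair.
statements = [
    ("Chinese", "InternationalCuisineTranslationScheme", ["Lo Mein", "Fu Chi Fei Pian"]),
    ("Chinese", "ChineseCuisineTranslationScheme", ["Lo Mein"]),
    ("Sichuan", "ChineseCuisineTranslationScheme", ["Fu Chi Fei Pian"]),
]

# The translations we actually intend: (scheme, DB term) -> RDF term.
intended = {
    (scheme, db_term): rdf_term
    for rdf_term, scheme, db_terms in statements
    for db_term in db_terms
}

# Flatten to a set of triples. The grouping by scheme is lost because
# every triple hangs off the (non-unique) RDF term.
triples = set()
for rdf_term, scheme, db_terms in statements:
    triples.add((rdf_term, "skos:inScheme", scheme))
    for db_term in db_terms:
        triples.add((rdf_term, "skos:notation", db_term))


def readable_translations(triples):
    """Every (scheme, DB term, RDF term) combination a consumer can read back."""
    schemes, notations = {}, {}
    for s, p, o in triples:
        target = schemes if p == "skos:inScheme" else notations
        target.setdefault(s, set()).add(o)
    return {
        (scheme, db_term, rdf_term)
        for rdf_term in schemes
        for scheme, db_term in product(schemes[rdf_term], notations.get(rdf_term, ()))
    }


# A phantom is a readable translation that contradicts the intended one.
phantoms = {
    t for t in readable_translations(triples)
    if intended.get((t[0], t[1])) != t[2]
}
print(phantoms)
```

The only phantom it reports is "Fu Chi Fei Pian" translated to Chinese under ChineseCuisineTranslationScheme, which is the non-existent translation pointed out in the quoted message.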

Let us then go back to the second part of the original SKOS-based proposal and compare it with the alternative "R2RML-native" proposal that we put forward.

1) The SKOS-based approach is limited to (many-to-one) Literal-DBterm to IRI-RDFterm translation. This limitation comes from use of the following two properties: (an owl:DatatypeProperty) skos:notation ("used to assign a notation as a typed literal") [7] and (an owl:ObjectProperty) skos:*Match [8]. The R2RML-native scheme, on the other hand, has no such limitation and allows (many-to-one) IRI-or-Literal-DBterm to IRI-or-Literal-RDFterm translation.

2) Also, the R2RML-native scheme produces less verbose Turtle documents than the SKOS-based scheme, because the SKOS-based scheme, unlike the R2RML-native one, requires the translation scheme IRI to be stated explicitly for every translation in that scheme that goes to a different IRI-RDFterm.
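
A rough way to see the difference in expressive power described in point 1) is to model each term as either an IRI or a literal and ask which (DBterm, RDFterm) combinations each scheme can state. The Python sketch below is an illustration only: the Term class, both function names, and the example.com IRI are assumptions made for this sketch and appear in neither proposal.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Term:
    kind: str   # "iri" or "literal"
    value: str


def skos_can_express(db_term: Term, rdf_term: Term) -> bool:
    # skos:notation carries the DBterm as a literal, and skos:*Match
    # points at the RDFterm as a resource (modelled here as an IRI).
    return db_term.kind == "literal" and rdf_term.kind == "iri"


def native_can_express(db_term: Term, rdf_term: Term) -> bool:
    # rr:fromTerm and rr:toTerm may each be an IRI or a literal.
    allowed = {"iri", "literal"}
    return db_term.kind in allowed and rdf_term.kind in allowed


samples = {
    "literal": Term("literal", "Lo Mein"),
    "iri": Term("iri", "http://example.com/Chinese"),  # hypothetical IRI
}

# Try all four DBterm/RDFterm kind combinations.
for db_kind, rdf_kind in product(samples, repeat=2):
    db_term, rdf_term = samples[db_kind], samples[rdf_kind]
    print(f"{db_kind:>7} -> {rdf_kind:<7} "
          f"SKOS: {skos_can_express(db_term, rdf_term)!s:<5} "
          f"native: {native_can_express(db_term, rdf_term)}")
```

Under these assumptions only the literal-to-IRI combination is expressible in the SKOS-based scheme, while all four are expressible in the R2RML-native one.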

In conclusion, we probably should avoid borrowing from a different scheme (SKOS in this case), especially when the borrowing causes a loss of expressive power. Instead, let us consider staying native and keeping the higher expressive power (and, as a bonus, enabling users to write more concise mapping documents).

The following example (same one as in our last email) illustrates the verbosity comparison.

--------------------------  Using SKOS-based scheme ([2], second part)  ------------------
To express the example using the (second part of the) original SKOS-based approach, we need the following triples:

<InternationalCuisineTranslationScheme> a skos:ConceptScheme.

[] a skos:Concept;
    skos:inScheme <InternationalCuisineTranslationScheme>;
    skos:notation "Lo Mein", "Fu Chi Fei Pian" ;
    skos:broadMatch <Chinese>.

<ChineseCuisineTranslationScheme> a skos:ConceptScheme.

[] a skos:Concept;
    skos:inScheme <ChineseCuisineTranslationScheme>;
    skos:notation "Lo Mein" ;
    skos:broadMatch <Chinese>.

[] a skos:Concept;
    skos:inScheme <ChineseCuisineTranslationScheme>;
    skos:notation "Fu Chi Fei Pian" ;
    skos:broadMatch <Sichuan>.

-----------------  Using R2RML-native scheme ([4], extending to many-to-one) ---------------
Using the alternate proposal, and extending it a bit to allow >1 cardinality for rr:fromTerm (to allow mapping many DBterms to one RDFterm), we can express the above situation as follows (benefits: every translationMap is neatly enclosed within a TranslationScheme boundary AND no triples representing a NON-EXISTENT translation):

  <InternationalCuisineTranslationScheme> rr:translationMap
    [ rr:toTerm  <Chinese> ;
      rr:fromTerm "Lo Mein", "Fu Chi Fei Pian" ;
    ] .

  <ChineseCuisineTranslationScheme> rr:translationMap
    [ rr:toTerm <Chinese> ;
      rr:fromTerm "Lo Mein" ;
    ] ,
    [ rr:toTerm <Sichuan> ;
      rr:fromTerm "Fu Chi Fei Pian" ;
    ] .

-------------------------------------------------------------------------------------------
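
One way to put a number on the verbosity comparison is to count the triples each encoding needs for this example. The Python sketch below is a back-of-the-envelope illustration under stated assumptions: it counts exactly the triples written out in the two Turtle listings above (so no type triples for the R2RML-native maps), and the function names are made up for this sketch.

```python
# Each entry: (scheme, DB terms, RDF term), as in the cuisine example.
translations = [
    ("InternationalCuisineTranslationScheme", ["Lo Mein", "Fu Chi Fei Pian"], "Chinese"),
    ("ChineseCuisineTranslationScheme", ["Lo Mein"], "Chinese"),
    ("ChineseCuisineTranslationScheme", ["Fu Chi Fei Pian"], "Sichuan"),
]


def skos_triple_count(translations):
    schemes = {scheme for scheme, _, _ in translations}
    count = len(schemes)  # one "a skos:ConceptScheme" triple per scheme
    for _, db_terms, _ in translations:
        # rdf:type + skos:inScheme + skos:broadMatch,
        # plus one skos:notation per DB term
        count += 3 + len(db_terms)
    return count


def native_triple_count(translations):
    count = 0
    for _, db_terms, _ in translations:
        # rr:translationMap + rr:toTerm, plus one rr:fromTerm per DB term
        count += 2 + len(db_terms)
    return count


skos_count = skos_triple_count(translations)
native_count = native_triple_count(translations)
print(skos_count, native_count)  # prints "15 10"
```

Triple count is only one measure of verbosity, but for this example it comes out at 15 triples for the SKOS-based listing against 10 for the R2RML-native one.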

Thanks,
- Souri.

----- Original Message -----
From: richard@cyganiak.de
To: SOURIPRIYA.DAS@oracle.com
Cc: public-rdb2rdf-wg@w3.org, SEEMA.SUNDARA@oracle.com
Sent: Monday, November 21, 2011 1:56:57 PM GMT -05:00 US/Canada Eastern
Subject: Re: Revised SKOS-based translation table proposal

Hi Souri, hi Seema,

The issue you describe is a result of the simplification I did in response to your and David's feedback. In the original proposal, there would have been skos:broadMatch mappings between the two separate database-specific controlled vocabularies on the one side, and the “global” cuisine concept scheme on the other side.

SKOS is a W3C Recommendation. It is the third-most used vocabulary on the linked data web. It's used by the Library of Congress, the UK government, the European Commission's Publication Office, the United Nations' Food and Agriculture Organization, and the New York Times. I find it highly irresponsible for this WG to reject the use of an established international standard simply because it “seems too complicated” while not even considering its technical merits.

Best,
Richard


On 21 Nov 2011, at 15:39, Souripriya Das wrote:

> We think that there is a "phantom triples" problem (i.e., may generate triples that should not be there) with the SKOS-based scheme for representing many-to-one mapping of DBterms to RDFterms as illustrated by the following example where Translation Scheme B is more discerning than Translation Scheme A:
> 
>  <RDFterm1, TranslationSchemeA, DBterm1>
>  <RDFterm1, TranslationSchemeA, DBterm2>
> 
>  <RDFterm1, TranslationSchemeB, DBterm1>
>  <RDFterm2, TranslationSchemeB, DBterm2>
> 
> Here is a relational form:
> 
>  RDFterm               TranslationScheme (skos:inScheme)                  DBterm (skos:notation)
>  ----------------      ------------------------                           -------------------
>  RDFterm1              TranslationSchemeA                                 DBterm1
>  RDFterm1              TranslationSchemeA                                 DBterm2
> 
>  RDFterm1              TranslationSchemeB                                 DBterm1
>  RDFterm2              TranslationSchemeB                                 DBterm2
> 
> Since the proposed structuring of the <DBterm> to <RDFterm> mapping in the SKOS-based scheme is of the form:
> 
>  <RDFterm>
>    skos:inScheme <MappingScheme> ;
>    skos:notation <DBterm> .
> 
> translation of the above table to RDF, using (the non-unique) RDFterm column as subject (as implied by the SKOS-based scheme), generates the following INCORRECT set of RDF triples (using Turtle syntax):
> 
>  # generated from table row numbers 1, 2, and 3:
>  <RDFterm1>
>    skos:inScheme <TranslationSchemeA>, <TranslationSchemeB>;
>    skos:notation "DBterm1", "DBterm2" .
> 
>  # generated from table row number 4:
>  <RDFterm2>
>    skos:inScheme <TranslationSchemeB> ;
>    skos:notation "DBterm2" .
> 
> The above set of triples is INCORRECT because it includes the following NON-EXISTENT translation as a triple:
>  <RDFterm1>  <TranslationSchemeB>   "DBterm2" .
> 
> ---------------
> In the alternate proposal using R2RML-native properties and class (extended to allow many-to-one mapping), the set of triples would be as follows (exactly as intended):
> 
>  <TranslationSchemeA> rr:translationMap
>    [ rr:toTerm  <RDFterm1> ;
>      rr:fromTerm "DBterm1", "DBterm2" ;
>    ] .
> 
>  <TranslationSchemeB> rr:translationMap
>    [ rr:toTerm <RDFterm1> ;
>      rr:fromTerm "DBterm1" ;
>    ] ,
>    [ rr:toTerm <RDFterm2> ;
>      rr:fromTerm "DBterm2" ;
>    ] .
> 
> Here is a more concrete (Chinese food :-)) version of this example: 
> ---------------------------------------------------------------------------------------------------------
> Suppose we would like to translate as follows:
>   1) "Lo Mein" translated
>        to <Chinese> using both "InternationalCuisineTranslationScheme" and "ChineseCuisineTranslationScheme"
> 
>   2) "Fu Chi Fei Pian" translated
>        to <Chinese> using "InternationalCuisineTranslationScheme" and
>        to <Sichuan> using "ChineseCuisineTranslationScheme".
> 
> ----------------------------------------------------------------------------------------------------------
> Using the SKOS-based scheme this can be expressed as follows:
> 
>  <Chinese>
>    skos:inScheme <InternationalCuisineTranslationScheme> ;
>    skos:notation "Lo Mein", "Fu Chi Fei Pian" .
> 
>  <Chinese>
>    skos:inScheme <ChineseCuisineTranslationScheme> ;
>    skos:notation "Lo Mein" .
> 
>  <Sichuan>
>    skos:inScheme <ChineseCuisineTranslationScheme> ;
>    skos:notation "Fu Chi Fei Pian" .
> 
> Note that the following triple actually gets repeated if we translate the above Turtle to N-Triples:
>  <Chinese> skos:notation "Lo Mein" .
> 
> The above Turtle can be compacted to the following equivalent version (after removing the duplicate triple):
>  <Chinese>
>    skos:inScheme <InternationalCuisineTranslationScheme>, <ChineseCuisineTranslationScheme> ;
>    skos:notation "Lo Mein", "Fu Chi Fei Pian" .
> 
>  <Sichuan>
>    skos:inScheme <ChineseCuisineTranslationScheme> ;
>    skos:notation "Fu Chi Fei Pian" .
> 
> The above set of triples is INCORRECT because it includes the following NON-EXISTENT translation as a triple:
>  <Chinese>  <ChineseCuisineTranslationScheme>   "Fu Chi Fei Pian" .
> 
> -------------------------------------------------------------------------------------------------------------------
> Using the alternate proposal, and extending it a bit to allow >1 cardinality for rr:fromTerm (to allow mapping many DBterms to one RDFterm), we can express the above situation as follows (benefits: every translationMap is neatly enclosed within a TranslationScheme boundary AND no triples representing a NON-EXISTENT translation):
> 
>  <InternationalCuisineTranslationScheme> rr:translationMap
>    [ rr:toTerm  <Chinese> ;
>      rr:fromTerm "Lo Mein", "Fu Chi Fei Pian" ;
>    ] .
> 
>  <ChineseCuisineTranslationScheme> rr:translationMap
>    [ rr:toTerm <Chinese> ;
>      rr:fromTerm "Lo Mein" ;
>    ] ,
>    [ rr:toTerm <Sichuan> ;
>      rr:fromTerm "Fu Chi Fei Pian" ;
>    ] .
> 
> ----------------------------------------------------------------------------------------------------
> 
> In summary, overall, the alternate proposal, as extended, now supports:
> 
> 1) many-to-one mapping from (one or more) DBterms to RDFterm
> 2) Both RDFterm and DBterm can be any type of RDF term -- that is, IRI or Literal
> 3) Use of a translation scheme as anchor (for a group of translation maps) gives it an intuitive organization
> 4) the new rr: terms (rr:translationMap, rr:TranslationMap class, rr:toTerm, rr:fromTerm) are intuitive as well
> 
> Given this, we still believe that the alternate proposal is more expressive and easier to use.
> 
> Thanks,
> - Souri and Seema
> 
> ----- Original Message -----
> From: richard@cyganiak.de
> To: public-rdb2rdf-wg@w3.org
> Sent: Tuesday, November 15, 2011 7:03:34 AM GMT -05:00 US/Canada Eastern
> Subject: Revised SKOS-based translation table proposal
> 
> Regarding ISSUE-72 “Bring back R2RML lookup tables” [1], here's a new proposal:
> 
>   http://www.w3.org/2001/sw/rdb2rdf/drafts/translation-tables-DERI2.html
> 
> It is still SKOS-based like the first proposal [2], but drops the possibility of using skos:xxxMatch properties, and retains only the ability to use skos:notation, as suggested by David [3].
> 
> This makes it simpler than the Oracle proposal [4] in terms of new properties introduced and total triples needed to express a translation table. Since Souri's and Seema's objection to the original proposal [2] was about its complexity [5], I'm confident that the revised proposal is acceptable to them.
> 
> (The new proposal retains the ability to express non-bijective mappings, and is limited to mapping to IRIs. This differs from the Oracle proposal, which can only express bijective mappings, but can also map to literals.)
> 
> Ivan noted [6] that re-adding this feature would require a second last call.
> 
> Best,
> Richard
> 
> 
> [1] http://www.w3.org/2001/sw/rdb2rdf/track/issues/72
> [2] http://www.w3.org/2001/sw/rdb2rdf/drafts/translation-tables-DERI.html
> [3] http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Aug/0186.html
> [4] http://www.w3.org/2001/sw/rdb2rdf/drafts/translation-tables-Oracle.html
> [5] http://www.w3.org/2001/sw/rdb2rdf/track/issues/66
> [6] http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Nov/0013.html
> 


[7] http://www.w3.org/TR/skos-reference/#L2557
[8] http://www.w3.org/TR/skos-reference/#L4186
Received on Sunday, 27 November 2011 17:11:26 GMT
