RDF/Topic Maps Interoperability Survey

Introduction to the Draft Survey

SWBPD WG F2F

Boston

March 4th 2005

These slides available at http://www.ontopia.net/work/survey-pres.html

Steve Pepper

Chief Strategy Officer, Ontopia

Coordinator, RDFTM TF

<pepper@ontopia.net>

Survey: Purpose and target audience

Purpose: To prepare the ground for the Guidelines
- Demonstrate that the Guidelines are based on a solid assessment of previous work
- Clarify the issues for the Task Force members
- Develop an understanding of requirements and possible trade-offs
- Justify the adoption of the semantic mapping approach
Target Audience: Specially interested parties only
- Anyone with a particularly deep interest in the problem and consequently greater than passing familiarity with both paradigms
- In particular, the pioneers of RDF/TM interoperability
- Estimated size of audience for Survey < 50
  - NB: Estimated size of audience for Guidelines: > 50,000

Document structure

Introduction
Criteria for evaluating the proposals
Existing translation proposals
- Moore
- Stanford
- Ogievetsky
- Garshol
- Unibo
Analysis
Conclusion

1 Introduction

1.1 Background
- Goal of RDFTM is to provide Guidelines for combining RDF/OWL and Topic Maps
- This survey is the first deliverable
- The second deliverable will be the Guidelines themselves
1.2 Overview of Proposals
- Lists five proposals that are "sufficiently complete and well-documented"
- Lists related work

2 Criteria for evaluating the proposals

2.1 Translation features
2.2 Major issues
2.3 Test Cases

2.1 Translation features

Completeness: The criterion completeness is used to evaluate the extent to which each proposal is able to handle every semantic construct that can be expressed in the source model and provide a means to represent it without loss of information in the target model. A complete translation will by definition be reversible.
Fidelity: The criterion fidelity expresses the degree to which the results of a translation are faithful to the underlying conceptual model of the target paradigm. This quality can be thought of as naturalness, that is, as corresponding to the way in which someone familiar with the target paradigm would naturally express the information content in that paradigm. Naturalness normally also confers improved readability on the result.
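
Reversibility can be checked mechanically with a round trip. The following is a minimal Python sketch, assuming a toy translation invented purely for illustration: the `TopicName` type, the `ex:` identifiers, and both translation functions are hypothetical and do not correspond to any of the surveyed proposals. It shows that a translation which drops scope is not reversible, and therefore not complete in the sense defined above.

```python
from typing import NamedTuple, Optional, Tuple

Triple = Tuple[str, str, str]


class TopicName(NamedTuple):
    """A toy topic name; scope is None when the name is unscoped."""
    topic: str
    value: str
    scope: Optional[str]


def name_to_triple(name: TopicName) -> Triple:
    """Hypothetical forward translation that silently drops the scope."""
    return (name.topic, "ex:name", name.value)


def triple_to_name(triple: Triple) -> TopicName:
    """Hypothetical reverse translation; the scope cannot be recovered."""
    topic, _, value = triple
    return TopicName(topic, value, None)


def round_trips(name: TopicName) -> bool:
    """True if translating there and back yields the original name."""
    return triple_to_name(name_to_triple(name)) == name


unscoped = TopicName("ex:tosca", "Tosca", None)
scoped = TopicName("ex:tosca", "Tosca", "ex:italian")

print(round_trips(unscoped))  # True
print(round_trips(scoped))    # False: the scope was lost on the way
```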

2.2 Major issues

Topic Maps issues	RDF issues
Identity	Containers
Scope	Collections
N-ary associations	Language tags
Association roles	XML literals
Variants	Typed literals
Reification

2.3 Test Cases

Purpose of test cases
- To enable an initial evaluation of the criterion "fidelity".
- Not intended to be complete. Guidelines will use other test cases.
TM2RDF test case
- Based on Steve Pepper's Italian Opera Topic Map
- The opera Tosca was premiered on 14th January 1900, has a synopsis at a certain location, and was composed by the composer Giacomo Puccini. All topics have a single subject identifier.
RDF2TM test case
- Based on Masahide Kanzaki's Music Vocabulary
- A concert took place on a certain date, at a certain venue, with a particular conductor and soloist. The location of the venue is given by coordinates. The concert, venue, conductor and soloist all have labels and are represented by blank nodes. All classes and properties have labels.

3 Existing translation proposals (1)

(Presented in chronological order)

3.1 Moore (2001-03)
- First incomplete attempt based on own data models.
- Introduces "mapping the model" and "modelling the model" terminology.
3.2 Stanford (2001-07)
- TM2RDF only. Based on PMTM4. Object mapping, not semantic mapping.
- Incomplete (due to PMTM4). Verbose. Low fidelity.
3.3 Ogievetsky (2001-08)
- TM2RDF based on PMTM4 and XTM. Another object mapping.
- Fairly complete. Verbose. Low fidelity. Roundtripping attempt fails.

3 Existing translation proposals (2)

3.4 Garshol (2001-11, 2003-05)
- Based on comparison of concepts. Rejects object mapping for semantic mapping.
- Requires additional mapping info. Almost complete. High fidelity.
3.5 Unibo (2002, 2003-08)
- Similar to Garshol but falls back to object mapping when no additional info is available.
- Emphasizes need for defaulting solution. Adds more detail (e.g. variant names)
3.6 Other Proposals and Contributions
- [Prudhommeaux 02], [Vlist 01]
- [Garshol 02], [Kaminsky 02], [Pepper 03], [Vatant 04]

4 Analysis

4.1 Object mappings and semantic mappings
4.2 The importance of being faithful
4.3 Semantic mapping issues

4.1 Object mappings and semantic mappings

Object mappings: Object mappings use the low-level building blocks of one language to describe the object model of the other.
For example, assuming for now that the data model of a simple binary association is a quintuple consisting of one (a)ssociation, two (r)oles, and two role (p)layers (p-r-a-r-p), that association would be represented as four statements that relate five resources.
Semantic mappings: Semantic mappings start from high-level concepts that carry the semantics of each model and attempt to find equivalences between them.
For example, a binary association in Topic Maps would be seen to represent the same kind of "thing" that many RDF statements represent (i.e., a relationship between two entities) and would therefore be represented using a single RDF statement.
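
The contrast can be made concrete with a small sketch. The Python below is illustrative only: the `tm:role` and `tm:player` property names follow the low-fi query shown later in this deck, while the `ex:` identifiers and both helper functions are hypothetical and not taken from any of the surveyed proposals.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def object_mapping(assoc: str, role1: str, player1: str,
                   role2: str, player2: str) -> List[Triple]:
    """Describe the association's object model (p-r-a-r-p) in triples."""
    return [
        (assoc, "tm:role", role1),
        (role1, "tm:player", player1),
        (assoc, "tm:role", role2),
        (role2, "tm:player", player2),
    ]


def semantic_mapping(subject: str, predicate: str, obj: str) -> List[Triple]:
    """Represent the same relationship as one ordinary RDF statement."""
    return [(subject, predicate, obj)]


low_fi = object_mapping("ex:a", "ex:r1", "ex:puccini", "ex:r2", "ex:tosca")
hi_fi = semantic_mapping("ex:tosca", "ex:composed-by", "ex:puccini")

# Collect every distinct resource mentioned by the object mapping.
resources = {term for s, _, o in low_fi for term in (s, o)}

print(len(low_fi), len(resources))  # 4 5
print(len(hi_fi))                   # 1
```

The object mapping yields four statements over five resources, matching the quintuple described above, whereas the semantic mapping yields a single statement.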

4.2 The importance of being faithful

Fidelity is important because the result of a low-fi translation is structurally different from data created natively in the target model.

This leads to reduced interoperability, in the following ways:

The result will not merge cleanly with data originating in the target model,
The result will not conform to vocabularies created in the target model, and
Queries written against the target model will not work with translated data.

Hi-fi and low-fi queries

Query with hi-fi semantic mapping

SELECT ?c
WHERE (?m, <foaf:name>, "Lars ..."),
      (?c, <dc:creator>, ?m)

Query with low-fi object mapping

SELECT ?c
WHERE  (?m,  <tm:basename>, ?n),
       (?n,  <tm:value>,    "Lars ..."),
       (?r1, <tm:player>,   ?m),
       (?r1, <rdf:type>,    <:creator>),
       (?a,  <tm:role>,     ?r1),
       (?a,  <rdf:type>,    <:created-by>),
       (?a,  <tm:role>,     ?r2),
       (?r2, <rdf:type>,    <:creation>),
       (?r2, <tm:player>,   ?c)
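
The effect on queries can be reproduced with a toy pattern matcher. The Python sketch below is an illustration under stated assumptions, not the query engine used for these slides: `unify` and `select` are hypothetical helpers, the `ex:` identifiers are invented, and the angle brackets around property names are dropped for brevity. It runs the two queries above against a semantically mapped and an object-mapped version of the same fact.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def unify(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend binding so that pattern equals triple, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or fail if it is already bound differently.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def select(var: str, patterns: List[Triple], data: List[Triple]) -> List[str]:
    """Return the values of var for every way the patterns match the data."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        extended: List[Binding] = []
        for b in bindings:
            for t in data:
                b2 = unify(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return sorted({b[var] for b in bindings})


hi_fi_data = [
    ("ex:lars", "foaf:name", "Lars ..."),
    ("ex:paper", "dc:creator", "ex:lars"),
]
low_fi_data = [
    ("ex:lars", "tm:basename", "ex:n"), ("ex:n", "tm:value", "Lars ..."),
    ("ex:r1", "tm:player", "ex:lars"), ("ex:r1", "rdf:type", ":creator"),
    ("ex:a", "tm:role", "ex:r1"), ("ex:a", "rdf:type", ":created-by"),
    ("ex:a", "tm:role", "ex:r2"), ("ex:r2", "rdf:type", ":creation"),
    ("ex:r2", "tm:player", "ex:paper"),
]

hi_fi_query = [("?m", "foaf:name", "Lars ..."), ("?c", "dc:creator", "?m")]
low_fi_query = [
    ("?m", "tm:basename", "?n"), ("?n", "tm:value", "Lars ..."),
    ("?r1", "tm:player", "?m"), ("?r1", "rdf:type", ":creator"),
    ("?a", "tm:role", "?r1"), ("?a", "rdf:type", ":created-by"),
    ("?a", "tm:role", "?r2"), ("?r2", "rdf:type", ":creation"),
    ("?r2", "tm:player", "?c"),
]

print(select("?c", hi_fi_query, hi_fi_data))    # ['ex:paper']
print(select("?c", hi_fi_query, low_fi_data))   # []
print(select("?c", low_fi_query, low_fi_data))  # ['ex:paper']
```

The natural two-pattern query finds nothing in the object-mapped data; only the nine-pattern query, which encodes knowledge of the source model, retrieves the answer.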

4.3 Semantic mapping issues

Identity
Topic names
Binary associations
N-ary associations
Occurrences
Types and subtypes
Reification
Scope

5 Conclusion

Semantic mappings better than object mappings for data interoperability
Garshol and Unibo proposals closest to useable solution

NB Conclusion deliberately left sketchy until WG has provided feedback on first draft

Reviews

Reviews received from Natasha Noy, Mike Uschold, David Wood, Ralph Swick (for which much thanks)
Reviews mostly positive
Lots of helpful comments
Many of these will just be implemented without further discussion
Would like the WG to take a position on some of the trickier issues
A set of questions has been prepared...

Questions! (1 of 2)

Should the Survey be a tutorial or should it assume some familiarity with both RDF and Topic Maps on the part of the reader?
Should OWL be covered in full, or only as it impacts data interoperability?
Is it OK to mention commercial implementations provided this is done in context and with a purpose?
Natasha questioned whether the survey is objective. Does it need to be? Isn't fairness enough?
Is anyone not convinced by the argument for a semantic mapping?

Questions! (2 of 2)

Should the two test cases have identical information content?
Should the test case results be moved to separate documents?
Are our "issues" really "requirements"? If not, do we need a set of formal requirements in this document, or should they be in a separate document?
Should we use the term "naturalness" rather than "fidelity"?
Is it acceptable to require mapping information?

RE: On the integration of Topic Maps and RDF

http://lists.w3.org/Archives/Public/www-rdf-interest/2001Aug/0155.html

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Tue, 21 Aug 2001 11:28:05 -0400
To: em@w3.org
Cc: lacher@db.stanford.edu, www-rdf-interest@w3.org
Message-Id: <20010821112805Z.pfps@research.bell-labs.com>

One important aspect of such mappings, for me, is whether information expressed naturally in the source formalism (Topic Maps) and then translated into the target formalism (RDF) can be naturally integrated with information expressed naturally in the target formalism. If this is not the case, then I claim that there is something wrong with the translation.

I feel that the translation expressed in the paper does not satisfy this criterion. Consider the example topic map in the paper, which, among other things, expresses the fact that petroleum is a natural resource of Denmark. It seems to me that the natural way of expressing this in RDF is to have a resource representing Denmark (D), a resource representing petroleum (P), and a predicate representing the natural resource relationship (NR). Then the fact that Denmark has petroleum as a natural resource is represented as the statement <D, NR, P>. The mapping in the paper uses much more machinery than this natural representation, including two reified statements.

Suppose some facts about natural resources come from topic maps, and are represented in this translation to RDF, and other facts about natural resources come from a natural RDF representation. How can one query the RDF to find the union of the facts? Even if it is possible to write such a query, is it at all possible to write such a query without knowing that some of the natural resource facts come from topic maps?

Peter F. Patel-Schneider

Bell Labs Research

RE: On the integration of Topic Maps and RDF

http://lists.w3.org/Archives/Public/www-rdf-interest/2001Aug/0158.html

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Tue, 21 Aug 2001 14:47:04 -0400
To: lacher@db.stanford.edu
Cc: em@w3.org, www-rdf-interest@w3.org, gdm@empolis.co.uk
Message-Id: <20010821144704U.pfps@research.bell-labs.com>

I think that we will have to "agree to disagree" on whether mappings between Topic Maps and RDF should preserve meaning.

I strongly, no, passionately, believe that such mappings have to be model-mappings that preserve meaning, at least if one is to hold the view that RDF is a representation formalism. If RDF is a representation formalism, then positive ground binary relations have to be represented as RDF triples. Otherwise, RDF is just some syntactic encoding, and the entire meaning is conveyed in some outside-of-RDF (and, probably, outside of the web) side agreement.

Any approach that requires an outside-of-RDF approach to ascribe meaning to the resulting RDF has, in my opinion, lost everything. Yes, an approach that stays within RDF has the potential of losing some things, but at least the portion that can be naturally represented in RDF is completely captured.

Peter F. Patel-Schneider

Bell Labs Research

RE: On the integration of Topic Maps and RDF

http://lists.w3.org/Archives/Public/www-rdf-interest/2001Aug/0184.html

From: Martin Lacher <lacher@db.stanford.edu>
Date: Thu, 23 Aug 2001 16:27:11 -0700
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Cc: <em@w3.org>, <www-rdf-interest@w3.org>, <gdm@empolis.co.uk>
Message-ID: <NEBBIEAODMJKOIMFPAEBGEHPCLAA.lacher@db.stanford.edu>

Hi Peter,

Thank you for your comments!

...

We are talking about two different things:

1) Your goal is to be able to query different sources without having to know anything about the data model of the sources. That would be a great solution, I would love to see that for RDF and Topic Maps without substantial loss of information.

2) Our goal was to start out with a way to be able to query different sources the data model of which we know. The agreements you mentioned are partly specified in the RDF schema we defined (with the exception of the representation of the hypergraph elements from the Topic Map data model). The schema needs to be elaborated.

Considering our goal, I think our results are valid.

Cheers,

Martin
