RDF/Topic Maps Interoperability Survey
SWBPD WG F2F
Boston
March 4th 2005
These slides available at http://www.ontopia.net/work/survey-pres.html
Steve Pepper
Chief Strategy Officer, Ontopia
Coordinator, RDFTM TF
<pepper@ontopia.net>
Survey: Purpose and target audience
- Purpose: To prepare the ground for the Guidelines
- Demonstrate that the Guidelines are based on a solid assessment of previous work
- Clarify the issues for the Task Force members
- Develop an understanding of requirements and possible trade-offs
- Justify the adoption of the semantic mapping approach
- Target Audience: Specially interested parties only
- Anyone with a particularly deep interest in the problem and consequently a greater-than-passing familiarity with both paradigms
- In particular, the pioneers of RDF/TM interoperability
- Estimated size of audience for Survey < 50
- NB: Estimated size of audience for Guidelines: > 50,000
Document structure
- Introduction
- Criteria for evaluating the proposals
- Existing translation proposals
- Moore
- Stanford
- Ogievetsky
- Garshol
- Unibo
- Analysis
- Conclusion
1 Introduction
- 1.1 Background
- Goal of RDFTM is to provide Guidelines for combining RDF/OWL and Topic Maps
- This survey is the first deliverable
- The second deliverable will be the Guidelines themselves
- 1.2 Overview of Proposals
- Lists five proposals that are "sufficiently complete and well-documented"
- Lists related work
2 Criteria for evaluating the proposals
- 2.1 Translation features
- 2.2 Major issues
- 2.3 Test Cases
2.1 Translation features
- Completeness
- The criterion completeness is used to evaluate the extent to
which each proposal is able to handle every semantic construct that can
be expressed in the source model and provide a means to represent it
without loss of information in the target model. A complete translation
will by definition be reversible.
- Fidelity
- The criterion fidelity expresses the degree to which
the results of a translation are faithful to the underlying conceptual
model of the target paradigm. This quality can be thought of as
naturalness, that is, as corresponding to the way in which someone
familiar with the target paradigm would naturally express the
information content in that paradigm. Naturalness normally also confers
improved readability on the result.
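The reversibility implied by the completeness criterion can be checked with a round trip. The following is a minimal sketch only: the `ScopedName` structure, the `ex:` properties, and both translators are hypothetical and are not taken from any of the surveyed proposals.

```python
from typing import Callable, Iterable, NamedTuple

Triple = tuple[str, str, str]


class ScopedName(NamedTuple):
    """A topic name with a scope (hypothetical, heavily simplified)."""
    topic: str
    value: str
    scope: str


def lossy_to_rdf(n: ScopedName) -> list[Triple]:
    # Drops the scope, so the source cannot be reconstructed.
    return [(n.topic, "rdfs:label", n.value)]


def lossy_to_tm(triples: list[Triple]) -> ScopedName:
    topic, _, value = triples[0]
    return ScopedName(topic, value, scope="")  # the scope is gone


def full_to_rdf(n: ScopedName) -> list[Triple]:
    # Keeps the scope via a hypothetical name node and ex:scope property.
    node = f"_:name-{n.topic}"
    return [
        (n.topic, "ex:name", node),
        (node, "ex:value", n.value),
        (node, "ex:scope", n.scope),
    ]


def full_to_tm(triples: list[Triple]) -> ScopedName:
    # Index the three statements by predicate to rebuild the name.
    props = {p: (s, o) for s, p, o in triples}
    return ScopedName(
        topic=props["ex:name"][0],
        value=props["ex:value"][1],
        scope=props["ex:scope"][1],
    )


def is_reversible(forward: Callable, backward: Callable,
                  samples: Iterable[ScopedName]) -> bool:
    """True if translating there and back returns every sample unchanged."""
    return all(backward(forward(s)) == s for s in samples)


samples = [ScopedName("tosca", "Tosca", "italian")]
lossy_ok = is_reversible(lossy_to_rdf, lossy_to_tm, samples)
full_ok = is_reversible(full_to_rdf, full_to_tm, samples)
```

In this toy setting the translation that survives the round trip is also the one that produces the less natural RDF, which hints at the trade-off between the two criteria.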
2.2 Major issues
Topic Maps issues | RDF issues |
Identity | Containers |
Scope | Collections |
N-ary associations | Language tags |
Association roles | XML literals |
Variants | Typed literals |
Reification | |
2.3 Test Cases
- Purpose of test cases
- To enable an initial evaluation of the criterion "fidelity".
- Not intended to be complete. Guidelines will use other test cases.
- TM2RDF test case
- Based on Steve Pepper's Italian Opera Topic Map
- The opera Tosca was premiered on 14th January 1900, has a synopsis
at a certain location, and was composed by the composer Giacomo
Puccini. All topics have a single subject identifier.
- RDF2TM test case
- Based on Masahide Kanzaki's Music Vocabulary
- A concert took place on a certain date, at a certain venue, with a
particular conductor and soloist. The location of the venue is given by
coordinates. The concert, venue, conductor and soloist all have labels
and are represented by blank nodes. All classes and properties have
labels.
3 Existing translation proposals (1)
(Presented in chronological order)
- 3.1 Moore (2001-03)
- First attempt. Incomplete. Based on the author's own data models.
- Introduces "mapping the model" and "modelling the model" terminology.
- 3.2 Stanford (2001-07)
- TM2RDF only. Based on PMTM4. Object mapping, not semantic mapping.
- Incomplete (due to PMTM4). Verbose. Low fidelity.
- 3.3 Ogievetsky (2001-08)
- TM2RDF based on PMTM4 and XTM. Another object mapping.
- Fairly complete. Verbose. Low fidelity. Roundtripping attempt fails.
3 Existing translation proposals (2)
- 3.4 Garshol (2001-11, 2003-05)
- Based on comparison of concepts. Rejects object mapping for semantic mapping.
- Requires additional mapping info. Almost complete. High fidelity.
- 3.5 Unibo (2002, 2003-08)
- Similar to Garshol but falls back to object mapping when no additional info is available.
- Emphasizes need for defaulting solution. Adds more detail (e.g. variant names)
- 3.6 Other Proposals and Contributions
- [Prudhommeaux 02], [Vlist 01]
- [Garshol 02], [Kaminsky 02], [Pepper 03], [Vatant 04]
4 Analysis
- 4.1 Object mappings and semantic mappings
- 4.2 The importance of being faithful
- 4.3 Semantic mapping issues
4.1 Object mappings and semantic mappings
- Object mappings
- Object mappings use the low-level building blocks of one language to describe the object model of the other.
For example, assuming for now that a simple binary association is
structured as a quintuple consisting of one
(a)ssociation, two (r)oles, and two role (p)layers (p-r-a-r-p), that
association would be represented as four statements that relate five
resources.
- Semantic mappings
- Semantic mappings start from high-level concepts that carry the
semantics of each model and attempt to find equivalences between them.
For example, a binary association in Topic Maps would be seen to
represent the same kind of "thing" that many RDF statements represent
(i.e., a relationship between two entities) and would therefore be
represented using a single RDF statement.
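The contrast can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of any surveyed proposal: the `tm:role` and `tm:player` property names are borrowed from the low-fi query shown later in these slides, while the `Association` structure, the `ex:composedBy` predicate, the resource identifiers, and the `predicate_for` table (standing in for the additional mapping information a semantic mapping needs) are hypothetical.

```python
from typing import NamedTuple

Triple = tuple[str, str, str]


class Association(NamedTuple):
    """A binary association in p-r-a-r-p form (hypothetical structure)."""
    assoc_type: str
    role1: str
    player1: str
    role2: str
    player2: str


def object_mapping(a: Association, assoc_id: str = "_:a") -> list[Triple]:
    """Describe the object structure: four statements relating five resources.

    Statements giving the types of the association and roles are omitted.
    """
    r1, r2 = f"{assoc_id}-r1", f"{assoc_id}-r2"
    return [
        (assoc_id, "tm:role", r1),
        (r1, "tm:player", a.player1),
        (assoc_id, "tm:role", r2),
        (r2, "tm:player", a.player2),
    ]


def semantic_mapping(a: Association,
                     predicate_for: dict[str, tuple[str, str]]) -> list[Triple]:
    """Express the relationship itself as a single statement.

    predicate_for maps an association type to (predicate, subject role).
    """
    predicate, subject_role = predicate_for[a.assoc_type]
    if a.role1 == subject_role:
        return [(a.player1, predicate, a.player2)]
    return [(a.player2, predicate, a.player1)]


tosca = Association("composed-by", "work", "tosca", "composer", "puccini")
mapping_info = {"composed-by": ("ex:composedBy", "work")}

object_triples = object_mapping(tosca)
semantic_triples = semantic_mapping(tosca, mapping_info)
```

Here `object_triples` holds four statements about an association node and two role nodes, whereas `semantic_triples` holds the single statement `("tosca", "ex:composedBy", "puccini")`.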
4.2 The importance of being faithful
Fidelity is important because the result of a low-fidelity translation is structurally different from data created natively in the target model.
This leads to reduced interoperability, in the following ways:
- The result will not merge cleanly with data originating in the target model,
- The result will not conform to vocabularies created in the target model, and
- Queries written against the target model will not work with translated data.
Hi-fi and low-fi queries
Query with hi-fi semantic mapping |
SELECT ?c
WHERE (?m, <foaf:name>, "Lars ..."),
(?c, <dc:creator>, ?m)
|
Query with low-fi object mapping |
SELECT ?c
WHERE (?m, <tm:basename>, ?n),
(?n, <tm:value>, "Lars ..."),
(?r1, <tm:player>, ?m),
(?r1, <rdf:type>, <:creator>),
(?a, <tm:role>, ?r1),
(?a, <rdf:type>, <:created-by>),
(?a, <tm:role>, ?r2),
(?r2, <rdf:type>, <:creation>),
(?r2, <tm:player>, ?c)
|
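The two queries can be tried against toy data with a naive triple-pattern matcher. This is a sketch under stated assumptions: the matcher is a simple stand-in for a real RDF query engine, and the resource identifiers (`person1`, `doc1`, `assoc1`, and so on) are hypothetical. The patterns and the elided literal "Lars ..." are copied verbatim from the slide.

```python
from typing import Optional

Triple = tuple[str, str, str]


def match(patterns: list[Triple], triples: set[Triple],
          bindings: Optional[dict[str, str]] = None) -> list[dict[str, str]]:
    """Return every variable binding that satisfies all patterns (backtracking)."""
    bindings = bindings or {}
    if not patterns:
        return [bindings]
    first, rest = patterns[0], patterns[1:]
    results = []
    for triple in triples:
        new = dict(bindings)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or reject a conflicting earlier binding.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.extend(match(rest, triples, new))
    return results


def select(var: str, patterns: list[Triple], triples: set[Triple]) -> list[str]:
    """Distinct values of one variable, sorted for stable output."""
    return sorted({b[var] for b in match(patterns, triples)})


hifi_query = [("?m", "foaf:name", "Lars ..."), ("?c", "dc:creator", "?m")]
lofi_query = [
    ("?m", "tm:basename", "?n"), ("?n", "tm:value", "Lars ..."),
    ("?r1", "tm:player", "?m"), ("?r1", "rdf:type", ":creator"),
    ("?a", "tm:role", "?r1"), ("?a", "rdf:type", ":created-by"),
    ("?a", "tm:role", "?r2"), ("?r2", "rdf:type", ":creation"),
    ("?r2", "tm:player", "?c"),
]

# The same fact as it might look after each kind of translation.
hifi_data = {("person1", "foaf:name", "Lars ..."),
             ("doc1", "dc:creator", "person1")}
lofi_data = {
    ("person1", "tm:basename", "name1"), ("name1", "tm:value", "Lars ..."),
    ("role1", "tm:player", "person1"), ("role1", "rdf:type", ":creator"),
    ("assoc1", "tm:role", "role1"), ("assoc1", "rdf:type", ":created-by"),
    ("assoc1", "tm:role", "role2"), ("role2", "rdf:type", ":creation"),
    ("role2", "tm:player", "doc1"),
}

native_on_hifi = select("?c", hifi_query, hifi_data)
native_on_lofi = select("?c", hifi_query, lofi_data)
lofi_on_lofi = select("?c", lofi_query, lofi_data)
```

In this toy data the query written in native RDF style finds `doc1` in the semantically mapped data but nothing in the object-mapped data, which can only be reached with the nine-pattern query.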
4.3 Semantic mapping issues
- Identity
- Topic names
- Binary associations
- N-ary associations
- Occurrences
- Types and subtypes
- Reification
- Scope
5 Conclusion
- Semantic mappings better than object mappings for data interoperability
- Garshol and Unibo proposals closest to useable solution
NB: Conclusion deliberately left sketchy until WG has provided feedback on first draft
Reviews
- Reviews received from Natasha Noy, Mike Uschold, David Wood, Ralph Swick (for which many thanks)
- Reviews mostly positive
- Lots of helpful comments
- Many of these will just be implemented without further discussion
- Would like the WG to take a position on some of the trickier issues
- A set of questions has been prepared...
Questions?
Questions! (1 of 2)
- Should the Survey be a tutorial or should it assume some familiarity with both RDF and Topic Maps on the part of the reader?
- Should OWL be covered in full, or only as it impacts data interoperability?
- Is it OK to mention commercial implementations provided this is done in context and with a purpose?
- Natasha questioned whether the survey is objective. Does it need to be? Isn't fairness enough?
- Is anyone not convinced by the argument for a semantic mapping?
Questions! (2 of 2)
- Should the two test cases have identical information content?
- Should the test case results be moved to separate documents?
- Are our "issues" really "requirements"? If not, do we need a set of
formal requirements in this document, or should they be in a separate
document?
- Should we use the term "naturalness" rather than "fidelity"?
- Is it acceptable to require mapping information?
RE: On the integration of Topic Maps and RDF
http://lists.w3.org/Archives/Public/www-rdf-interest/2001Aug/0155.html
From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Tue, 21 Aug 2001 11:28:05 -0400
To: em@w3.org
Cc: lacher@db.stanford.edu, www-rdf-interest@w3.org
Message-Id: <20010821112805Z.pfps@research.bell-labs.com>
One important aspect of such mappings, for me, is whether
information expressed naturally in the source formalism (Topic Maps)
and then translated into the target formalism (RDF) can be naturally
integrated with information expressed naturally in the target
formalism. If this is not the case, then I claim that there is
something wrong with the translation.
I feel that the translation expressed in the paper does not
satisfy this criterion. Consider the example topic map in the paper,
which, among other things, expresses the fact that petroleum is a
natural resource of Denmark. It seems to me that the natural way of
expressing this in RDF is to have a resource representing Denmark (D),
a resource representing petroleum (P), and a predicate representing the
natural resource relationship (NR). Then the fact that Denmark has
petroleum as a natural resource is represented as a single statement
relating D to P via NR. The mapping in the paper uses much more
machinery than this natural representation, including two reified
statements.
Suppose some facts about natural resources come from topic
maps, and are represented in this translation to RDF, and other facts
about natural resources come from a natural RDF representation. How can
one query the RDF to find the union of the facts? Even if it is
possible to write such a query, is it at all possible to write such a
query without knowing that some of the natural resource facts come from
topic maps?
Peter F. Patel-Schneider
Bell Labs Research
RE: On the integration of Topic Maps and RDF
http://lists.w3.org/Archives/Public/www-rdf-interest/2001Aug/0158.html
From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Tue, 21 Aug 2001 14:47:04 -0400
To: lacher@db.stanford.edu
Cc: em@w3.org, www-rdf-interest@w3.org, gdm@empolis.co.uk
Message-Id: <20010821144704U.pfps@research.bell-labs.com>
I think that we will have to "agree to disagree" on whether mappings between Topic Maps and RDF should preserve meaning.
I strongly, no, passionately, believe that such mappings have to be
model-mappings that preserve meaning, at least if one is to hold the
view that RDF is a representation formalism. If RDF is a representation
formalism, then positive ground binary relations have to be represented
as RDF triples. Otherwise, RDF is just some syntactic encoding, and the
entire meaning is conveyed in some outside-of-RDF (and, probably,
outside of the web) side agreement.
Any approach that requires an outside-of-RDF approach to ascribe
meaning to the resulting RDF has, in my opinion, lost everything. Yes,
an approach that stays within RDF has the potential of losing some
things, but at least the portion that can be naturally represented in
RDF is completely captured.
Peter F. Patel-Schneider
Bell Labs Research
RE: On the integration of Topic Maps and RDF
http://lists.w3.org/Archives/Public/www-rdf-interest/2001Aug/0184.html
From: Martin Lacher <lacher@db.stanford.edu>
Date: Thu, 23 Aug 2001 16:27:11 -0700
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Cc: <em@w3.org>, <www-rdf-interest@w3.org>, <gdm@empolis.co.uk>
Message-ID: <NEBBIEAODMJKOIMFPAEBGEHPCLAA.lacher@db.stanford.edu>
Hi Peter,
Thank you for your comments!
...
We are talking about two different things:
1) Your goal is to be able to query different sources without having
to know anything about the data model of the sources. That would be a
great solution, I would love to see that for RDF and Topic Maps without
substantial loss of information.
2) Our goal was to start out with a way to be able to query
different sources the data model of which we know. The agreements you
mentioned are partly specified in the RDF schema we defined (with the
exception of the representation of the hypergraph elements from the
Topic Map data model). The schema needs to be elaborated.
Considering our goal, I think our results are valid.
Cheers,
Martin