- From: Lars Marius Garshol <larsga@ontopia.net>
- Date: Fri, 18 Feb 2005 09:33:37 +0100
- To: "SWBPD list" <public-swbp-wg@w3.org>
I've just read through the latest version of the survey; my comments are below.

Generally, I think the document is too long. The discussions of the five proposals would probably benefit from cutting some of the author rationales and stating the conversion rules more systematically and briefly (if possible), and they should also be more complete.

--- 1.2 Overview of proposals

In the descriptions of the five proposals a substantial percentage of the verbiage is devoted to which conference each was presented at. This doesn't really seem like it is adding anything. The dates may be useful, but I suggest removing mentions of the actual conferences. (This information should be in the bibliography anyway.)

The section on /Garshol/ is a bit clunky. Maybe best to move the discussion of [Garshol02] down together with the others which are not examined in detail? (The text about it can be kept as it is.)

/Garshol/ is actually implemented twice: once in OKS and once in tmapi-utils[1]. (The latter implementation is not complete yet, but is likely to be so before this survey is finished.)

Kaminski's thesis should definitely be mentioned here.

--- 2.1 Basic features

...

I spent a lot of time writing a reasoned critique of this list of criteria[2]. Was this ever received? I have received no response to it, and the most important suggestion in it does not seem to have been followed. (For the record: all of the comments still seem to apply.)

--- 3.1.1

The description of the author's reasoning is very clear, but having read the section I am really not much wiser as to how the proposal works. This really seems to be the wrong way around; surely Moore's proposal is more interesting than the reasoning behind it?

Further, the RDF->TM and TM->RDF proposals are not clearly distinguished from one another (see [2]), which makes the confusion worse.
I realize that the ultimate source of confusion here may be Moore, but I still think this section would benefit from focusing much more on the proposed conversions. I also think distinguishing what must effectively be two independent proposals from one another, as advocated in [2], would help.

--- 3.1.2

  Reversibility: Neither approach is reversible. In the case of the "modelling" approach, the assumption is that one is working in one domain or the other, but not in both. In the case of the "mapping" approach, the fact that a statement maps to a single association whereas an association maps to two statements shows that translations cannot be reversed.

Are there not really four approaches? Two modelling approaches (one for each direction) and two mapping approaches? To judge reversibility, each should probably be examined individually. (Or, alternatively, we could follow Moore's deprecation of the modelling approaches and consider only the mapping approaches.)

That a statement maps to a single association would be reversible if /Garshol/ or /UniBo/ were used to do the reverse mapping. I.e., it *is* reversible.

That an association maps to two statements would actually also be reversible with /Garshol/ (and perhaps also /UniBo/). (The two statements would create two associations, which would merge into one.)

However, since constructs not discussed would necessarily be lost, the conversion as a whole would not be reversible.

The discussion of the fidelity of the mapping approach seems to me further proof that the argument in [2] is correct.
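
The point above about two statements merging back into one association can be sketched as follows. This is only a toy model under stated assumptions: an association is reduced to its type plus two players (role types ignored), a statement is a plain triple, and merging is modelled as set equality. The function names and the "tosca"/"puccini" players are illustrative; none of this is the actual rule set of Moore's proposal or of /Garshol/.

```python
from typing import FrozenSet, Set, Tuple

Statement = Tuple[str, str, str]          # (subject, property, object)
Association = Tuple[str, FrozenSet[str]]  # (type, {player, player}); roles ignored (assumption)


def association_to_statements(assoc: Association) -> Set[Statement]:
    """TM->RDF in the toy model: one association yields two statements, one per direction."""
    assoc_type, players = assoc
    a, b = sorted(players)  # assumes exactly two distinct players
    return {(a, assoc_type, b), (b, assoc_type, a)}


def statement_to_association(stmt: Statement) -> Association:
    """RDF->TM in the toy model: one statement yields one association."""
    subject, prop, obj = stmt
    return (prop, frozenset({subject, obj}))


def round_trip(assocs: Set[Association]) -> Set[Association]:
    """TM->RDF->TM; equal associations collapse because the result is a set."""
    statements: Set[Statement] = set()
    for assoc in assocs:
        statements |= association_to_statements(assoc)
    return {statement_to_association(s) for s in statements}


if __name__ == "__main__":
    original = {("composed-by", frozenset({"tosca", "puccini"}))}
    print(round_trip(original) == original)
```

The sketch deliberately leaves out role types, names and occurrences, which is exactly where the conversion as a whole stops being reversible: anything the mapping does not cover is lost.
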
Given the argument in [2], I would advocate a new structure for 3.1 as follows:

  3.1 Moore
    3.1.1 General discussion (very similar to current 3.1.1)
    3.1.2 RDF->TM proposal
      3.1.2.1 Description (mostly lifted from 3.1.1, but extended if possible)
      3.1.2.2 Analysis (half of 3.1.2, essentially)
      3.1.2.3 Test case
    3.1.3 TM->RDF proposal
      3.1.3.1 Description (mostly lifted from 3.1.1, but extended if possible)
      3.1.3.2 Analysis (half of 3.1.2, essentially)
      3.1.3.3 Test case

Some of the sections could perhaps be merged, but I broke them up for clarity here. Note that this is likely to be much less work than it would seem, given that the only new text required is more description of the two conversions, and that would IMHO be required anyway.

--- 3.1.3.2

The last paragraph here seems like it was left over from the previous version during editing.

--- 3.2.1

The term "bijective" used by Lacher & Decker may actually be quite useful for us in our discussion of reversibility. See this Wikipedia article: <URL: http://en.wikipedia.org/wiki/Bijection >

I think what we want are injective proposals, but not necessarily bijective ones. Moore's proposals (by section 3.1.3) clearly are not injective. (This is obvious as all TMs containing two topics and one association between them would map to the same RDF model, regardless of any differences in topic names and occurrences.)

In PMTM4 strings are not part of the model, either, if I remember correctly.

--- 3.2.2

  Reversibility: The transformation is theoretically reversible but this is of academic interest since the proposal only covers one direction.

I may be sounding like a broken record by now, but it really is interesting to us if the proposal is reversible. Had it had fidelity, reversibility and completeness would have been the next things to consider.

Regarding correctness I don't quite know what to say.
It does seem to model PMTM4 correctly, but I would say that PMTM4 does not model topic maps correctly, and so it would be difficult to consider this proposal entirely correct, without this necessarily being the fault of the authors.

  Fidelity: ... an information content ...

Maybe just "information" would be better?

--- 3.3.1

Does /Ogievetsky/ really require the use of XSLT, or is it just that the proposal is implemented in XSLT? (This is in part answered later, but it seems strange to state it in this way here.)

Does /Ogievetsky/ use the same PMTM4 version as /Stanford/?

In the composed-by example (which I agree is only readable in RDF/XML form :-) it's not clear to me why the rtm:member node is there. The prose above mentions the existence of this node in passing, but does not say anything about the rationale for it, either. To me it seems like pure graph bloat, but presumably there was a reason for it?

  This translates to a topic map consisting of six TAOs (five topics and one association), which in turn translates back to RDF as a set of no less than [@@fixme] RDF statements. "Obviously we accumulated a lot of semantic luggage during our roundtrip" is Ogievetsky's laconic comment.

*LOL*

  In addition, a brief comparison is made with a tolog-like query language.

The language in question looks an awful lot like early versions of RDQL. Could that be what it is?

--- 3.3.2

Same comment on reversibility here as elsewhere.

/Ogievetsky/ seems to me for the most part correct, except that in associations different properties are used for topics depending on their identity. Maybe this isn't incorrect in a strict sense, but it certainly seems highly questionable, since the association should be the same in either case. (Neither TMDM nor PMTM4 would have different association structures in this case, and XTM makes it clear that the two cases are equivalent.)

--- 3.4.1

The term "subject address" is used in places, instead of the correct "subject locator".
I don't think the TM->RDF proposal is described in sufficient depth here. Given the lack of usable proposals for this direction, that seems a serious shortcoming. (The UniBo proposal in this direction is discussed in great detail, and I think the discussion in Garshol03a would be valuable for comparison.)

--- 3.4.2

Probably the analysis should be extended somewhat to make it clearer what, precisely, the failings of the proposal are. (This does not seem to be as important in 3.1-3.3, where the failings are more obvious.)

--- 3.5.1

This mixes the RDF->TM and TM->RDF conversions together, making the discussion rather difficult to follow. It would be easier if they were separated. The RDF->TM conversion also seems to have received less attention than it should.

  The default behaviour in the Unibo proposal is to equate subject addresses with resource URIs and to represent subject identifiers using the RDFS property isDefinedBy. Topics that have no subject address are translated to blank nodes whose ID is generated from the topic's base name.

This seems problematic, since the property in an RDF statement is required to have a URI.

  The Unibo proposal is alone in assuming a fundamental equivalence of semantics between base names and the rdfs:label

Specific mappings: this section seems the most important, but is too brief for me to be able to understand it.

--- 3.5.2

Regarding reversibility:

  The proposal permits a high degree of reversibility, but the result of a round-trip may not be the same as the starting point.

I would claim that that fails the test of reversibility.

  For example, using the generic mappings, most RDF statements would be converted to typed associations with untyped roles [...]

This seems a classic example of a failure in reversibility, since the information about which topic was the subject and which the object is lost.

Regarding correctness: I have earlier pointed out errors in the mapping which are not mentioned here.
Nor do the things I considered errors seem to be mentioned in 3.5.1, for what reason I don't know. I've also pointed out another problem above, and I could list more if desired.

--- 4

What is the purpose of this section? It seems to be about to turn into a sixth proposal rather than a survey. Maybe it should revert to being a survey?

--- 4.1

  Semantic mappings have much higher fidelity but suffer from the disadvantage of tending to be less complete and requiring additional information that is not normally present in the source document.

I would say that the disadvantage of semantic mappings is that making them complete is harder. (The point about additional information still applies, of course.)

--- 4.2

I think a set of requirements, based on the evaluation criteria, would be useful. Or even, a set of requirements, used to evaluate the proposals.

--- 4.2.1

The problem with rdfs:isDefinedBy has been pointed out earlier. Otherwise it mostly seems OK, but inserting rdf:type statements when going TM->RDF seems questionable.

Regarding owl:sameAs and RDF->TM: why not just merge? (Actually, there are more OWL properties with the same semantics.)

--- 4.2.2

  The semantic equivalence between topic names and the rdfs:label property is fairly obvious.

They are not equivalent. rdfs:label implies a topic name, but the inverse does not hold.

TM->RDF: I prefer Garshol03a.

RDF->TM: What about properties that have literal values, yet are *not* subproperties of rdfs:label? Very few vocabularies bother spelling this out, and the same goes for instance data. The proposed approach is semantically more "correct" than Garshol03, but I think Garshol03 is more likely to actually give the correct results.

The draft says:

  In a semantic mapping there are two approaches that can be taken to handling variant names: reification and complex objects.

True, but faced with this choice I think many users will opt for the third alternative: ignore variants altogether.
Reification and complex objects are both painful and heavy-weight. However, RDF implementations with optimized support for reification will handle the former more gracefully.

I think handling this is likely to involve significant pain, no matter what we do.

Anyway, I don't think recommendations belong in the survey.

[1] <URL: http://tmapi-utils.sourceforge.net/ >
[2] <URL: http://lists.w3.org/Archives/Public/public-swbp-wg/2005Feb/0089.html >

--
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50 <URL: http://www.garshol.priv.no >
Received on Friday, 18 February 2005 08:34:40 UTC