- From: Lars Marius Garshol <larsga@ontopia.net>
- Date: Mon, 24 Apr 2006 10:29:12 +0200
- To: SWBPD list <public-swbp-wg@w3.org>
These are my comments on http://www.ontopia.net/work/guidelines.html, version of March 20, 2006. The comments were written for the RDFTM editors rather than the list, so parts may be a bit obscure unless you know all the details.

Generally it seems that we've reached the point where most of the conversion rules are probably the way they should be, and what remains now is to work out what kind of document we want to produce, firm up the rules, and find a formalism for expressing them. I think we are probably quite close to the goal, provided we don't create too much work for ourselves.

--- What is this document?

I should start by noting that I find the title and abstract of the document misleading. These are not guidelines that authors can use for interoperability; this is a specification for a mapping between the two data models. I know it has been described as guidelines ever since the charter was written, but that doesn't make this a guideline document.

I'm not sure what to do about this, to be honest, and the problem recurs throughout. Throughout, I would much prefer to see "this document" rather than "these guidelines", as the only thing that resembles guidelines in the document is the 5-6 bullet points in 6.2. (I'll get back to why the count is 5-6, and not 10.)

--- Tutorialism

The document goes to great lengths to include a lot of tutorial material (explaining the two models, explaining the conversion procedure, explaining the document itself, examples, ...). This greatly increases the amount of work the TF has to do, and the more of it we can cut, the better off we are, I think. We have tight deadlines and very limited resources, and so I think it's better if we make the document as minimal as possible. Explanations and tutorials can always be written after the fact as conference papers etc. and reach a greater audience that way.
We also reduce the risk of inconsistencies (at the moment there are lots of them, and with the current format it will be an enormous task to make sure there are none left when we finish).

IMHO it's especially important to cut all the examples. They represent a huge rewriting burden as we move forward, and if we write the conversion rules precisely then there is no need for the examples anyway.

--- The conversion rules

These are generally too vague. We really need *some* kind of formalism to clear this up, and make it easier to read. I think what we should do is come up with a simple convention for addressing into the TMDM, and switch over to a style like this (this is the "More generally" rule from 3.3):

  To convert a topic item t to a resource:

  If t[subject locators] is not empty:
    choose an element r in t[subject locators] at random, and make
    it the resource. Add the following statement:
      (r, rdf:type, rdftm:InformationResource)
    For each l in t[subject locators] where l is not r add:
      (r, owl:sameAs, l)
    For each l in t[subject identifiers] add:
      (r, rdftm:subject-identifier, l)

And so on. This is pretty rough, and can definitely be improved on, but it should do the trick of being specific enough, without being so much of a formalism that it makes the document hard to read.

--- Untyped whatnots

Throughout the document there are lots of references to untyped names, occurrences, and associations. These can all be removed, since there are no such things in Topic Maps. This means that there's no need to cater for these issues in any way, since they quite simply do not occur. I've pointed this out several times before, so I'm a bit puzzled as to why references to untyped constructs remain.

--- 1.2

Para 1 talks about "authors" and "documents". I think we should lose both terms. Instances of Topic Maps and RDF are not necessarily documents, and the process by which they are created is rarely "authoring". I think we really mean to address owners of Topic Maps and/or RDF information.
Para 1 also says the authors are ensured "a high level of interoperability", which is very vague. I think we can say what we ensure them: the conversion of data between the two models.

Para 2, I think, does not discuss the purpose of the document at all. It probably belongs in 1.1.

I've lots more to say about this section, but it's really all to do with what the document actually *is*. The details are probably not worth writing down before we've resolved the issue of what it is we are creating.

Para 5 probably shouldn't list the namespaces, nor describe them too much, as the result was pretty confusing for me. There should be a complete list of all defined/used properties in an appendix somewhere, and a reference to it here. (Producing the complete list is tricky; I'll return to that.)

It would probably be beneficial to split 1.2 into "Goals" and "Approach", or something like it, where "Goals" describes what we're trying to do, and "Approach" covers how we do it.

--- 1.3

I think we should lose the "willingness" point in the first sentence; it goes without saying, and sounds very strange.

The "authors" and "documents" thing recurs.

The "creators of tools" are really implementors of the nameless translation mechanism specified in this document, are they not?

The point about "people who seek assurance" I would delete. Yes, that is part of Ontopia's reason for participating in this work, but it's not really appropriate for the W3C to be saying this kind of thing.

The second para is better lost, IMHO.

--- 1.4

I think the prose repeat of the table of contents is better omitted. The remainder could beneficially be turned into a "Notation" or "Conventions" section.

--- 1.5

I think this should be deleted. The namespace URIs can be given in a table in the new 1.4, and the acronyms are better expanded on first use, anyway.

--- 2

This is so closely related to 1.2 that it might be better incorporated there.

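As an aside on the rule style proposed under "The conversion rules" above: a rule written that precisely translates into code almost line for line, which is part of its appeal. The following is only a minimal Python sketch under stated assumptions, not part of the proposal. The dict representation of a topic item, the function name topic_to_resource, and the choose parameter are all hypothetical; prefixed names are kept as plain strings; and only the subject-locator case that the quoted rule spells out is covered.

```python
import random

# Prefixed names are kept as plain strings for illustration only.
RDF_TYPE = "rdf:type"
OWL_SAME_AS = "owl:sameAs"
INFORMATION_RESOURCE = "rdftm:InformationResource"
SUBJECT_IDENTIFIER = "rdftm:subject-identifier"


def topic_to_resource(topic, choose=random.choice):
    """Apply the subject-locator branch of the rule to one topic item.

    `topic` is a hypothetical dict with the keys "subject_locators" and
    "subject_identifiers", each a list of IRI strings. Returns the chosen
    resource r and the set of (subject, predicate, object) statements.
    """
    locators = list(topic.get("subject_locators", []))
    identifiers = list(topic.get("subject_identifiers", []))
    if not locators:
        # The quoted rule stops at "And so on", so the remaining cases
        # are deliberately left unimplemented in this sketch.
        raise NotImplementedError("only the subject-locator case is sketched")

    # "choose an element r in t[subject locators] at random"
    r = choose(locators)
    statements = {(r, RDF_TYPE, INFORMATION_RESOURCE)}

    # Every other subject locator is tied to r with owl:sameAs.
    for l in locators:
        if l != r:
            statements.add((r, OWL_SAME_AS, l))

    # Every subject identifier is attached to r.
    for l in identifiers:
        statements.add((r, SUBJECT_IDENTIFIER, l))

    return r, statements
```

Passing a deterministic function such as min for choose, instead of the default random.choice, makes the output reproducible, which is one way to address the once-only translation concern raised for TM2RDF.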
--- 2.1

Point 1: It should be made clear that this is *after* conversion.

Point 5: Advice? Shouldn't that be guidelines?

Point 7: We violate this point.

--- 3

The title is misleading, and also in conflict with the first para. The first para is also misleading, I think, but in a different way. I suspect this is just incomplete editing.

The second para says "guidance consists ... in asserting properties ... to be ... or to have ...". This is 100% RDF-centric, but shouldn't be. It's probably better to generalize and be less specific, by saying something like "guidance consists in annotation of ontology terms using, for the most part, the rdftm vocabulary".

--- 3.1

I think we should cut most of this, maybe even all of it.

Nit: the TM2RDF list seems very incomplete compared with the RDF2TM one.

--- 3.2

I think we should cut this.

Nit: the first para oversimplifies.

--- 3.3

Again: I would cut most of this down to the big gray box.

Para 4 says the document "advocates" a specific solution, but in fact it specifies one (or should).

"The rules for translating identity are ...". I'm not sure this is the best way to describe this. These are actually the rules for converting between topics and x (where x is the missing term for "RDF nodes that are not literals"). I know this sounds horribly pedantic, but this is important, because we'll want to refer to these specific rules from pretty much everywhere in the rest of the document.

TM2RDF: "When a topic": I would just lose this. Instead, make it an error, in the steps that produce statements, to use typing topics that have no subject identifier. Much simpler, and the result is the same.

TM2RDF: The main rule can be simplified quite a lot. Note that it should be possible to retain item identifiers (as item identifiers). Otherwise TM2RDF translation will be once-only, and thus pretty much worthless in real life.

TM2RDF: The rules say "(e.g., through the type of the resource being made an subclass of this class)", but this doesn't work. It's entirely possible for the resource to be an instance of a *supertype* of rdftm:InformationResource, and we can't know whether this is the case or not. Also, we have to say something definite about what to do here. I suggest simply cutting this.

The note about "In the examples below" I can't make any sense of. I'm not sure whose fault that is.

Example 5 is a bit odd. Are item identifiers discarded or not?

The RDF2TM part is not really very clear. Also, the first bullet point could be replaced with a statement in the OWL ontology for the RDFTM vocabulary that says that the rdf:Property class and rdftm:InformationResource are mutually exclusive. (I forget the exact property used for this in OWL.) We need to discuss formalism issues to really settle that point.

--- What vocabulary do we use?

Each subsection of section 3 that has an RDF2TM/TM2RDF box contains a list of the vocabulary terms used in that section. I think that's way too repetitious, and that we should instead do this for the entire guidance vocabulary in a single list. We're just going to mess up otherwise, and n separate lists of terms add nothing useful, anyway.

However, there is a deeper issue here, since some of the translation rules depend on the types of resources in RDF. The question is: how do you judge whether X is an instance of Y or not *when doing a conversion*? According to the RDFS semantics, X is an instance of Y in the following model:

  (X, _foo, Z)
  (_foo, rdfs:domain, Y)

If you use OWL this can be about as complicated as you wish. We need to come up with a story on this point.

--- 3.4

Let's not repeat built-in guidance here. Let's list all the built-in guidance somewhere, and be done with it.

TM2RDF: Untyped names don't exist.

--- 3.4.1

Para 1 is tutorialism.

TM2RDF: This should be made more precise, and easier to understand.
Also, do we really want the rdftm:variant-scope property? I think I know why it's there, but it's really hard to make sure.

RDF2TM: Here it seems to be rdftm:scope, which is inconsistent with example 19.

--- 3.5

Para 1 is tutorialism, and untyped occurrences don't exist (para 3 + TM2RDF box).

TM2RDF: "The value of the occurrence...": This oversimplifies a bit. This would be much easier if written more formally, as I suggest above. It would then run as:

  An occurrence item o is converted into (with implicit
  "topic-to-X-conversion"):
    (o[parent], o[type], o[value])  /* well, not quite */
    (o[type], rdf:type, rdftm:OccurrenceProperty)

--- 3.6

All of this (up to 3.6.1) is tutorialism.

--- 3.6.1

Everything up to the TM2RDF box is tutorialism. The reference to untyped associations should go, both before the box and inside it.

TM2RDF: I've noted that this is incomplete, but can't remember what I was referring to. We should rewrite this with the formalism, anyway, so it's not that important.

Example 24: The two lines of guidance are very confusing. What are they?

RDF2TM: The point about "inverse statements" is not necessary, since duplicates in TMDM are not possible, anyway.

--- 3.6.1.1 and 3.6.1.2

We should be able to lose these two sections completely, and instead replace them with built-in guidance.

--- 3.6.2

If we purge the document of tutorialism we can merge this with 3.6.1. As it is, this is just enormously more voluminous than what is really needed. For this reason I haven't reviewed it properly; it's just obviously too much.

--- 3.6.3

The same applies here.

--- 3.7

Para 1 is tutorialism.

Para 2 is tutorialism; replace with a constraint on RDF2TM name conversion.

TM2RDF: This is fine for us, but needs to be formalized for when we go real. Likewise for RDF2TM.

--- 3.8 and 3.9

The overall approach seems fine, as far as I'm able to follow it. However, these should be simplified, and then worked into the name, occurrence, and association sections.
As the document stands now it's much harder for the reader to see how this fits together. This of course implies that it's much harder for us to make sure that it actually does fit together, too.

The relationship between scope and reification isn't really described anywhere now, and that definitely needs to be taken care of.

--- 3.10

I don't think we should have this section at all. The document is already way too long.

--- 4

This doesn't really distinguish between providing guidance for conversion and guidelines for how to structure your information so that you won't run into trouble when you want to convert it. This really needs to be reconsidered, I think.

--- 5

We should make an OWL ontology for the rdftm vocabulary. That could serve as both definition of the terms and a complete enumeration of them, as well as the built-in guidance.

Point 4 in 5.3 is very odd, and needs to be looked at more closely.

--- 6.2

Point 1: The first point is just an error; should we list it?

Points 3-5: The three untyped points should go.

Point 6: Reified roles work in n-aries.

Point 7: Reified TMs might be made to work.

Point 8: Topics that have no identifiers cannot occur, so that point can go.

Point 9: This is true, but not really an "unsupported construct".

Point 10: This is again just an error.

--- 8

LTM is now in version 1.3.

TMDM should be referenced as ISO 13250-2, and editors should not be included when referencing ISO standards.

--
Lars Marius Garshol, Ontopian         http://www.ontopia.net
+47 98 21 55 50                       http://www.garshol.priv.no
Received on Monday, 24 April 2006 08:29:38 UTC