- From: Sandro Hawke <sandro@w3.org>
- Date: Sun, 25 Jul 2010 16:49:52 -0400
- To: public-rif-wg <public-rif-wg@w3.org>
Dave [1], Harold [2], and Michael [3] have all expressed a desire to have the RIF-in-RDF mapping more closely follow the XML syntax. In particular, they suggest it use repeated properties instead of gathering all the values of the properties into a list.

I'm extremely sympathetic to this desire. If you look back at the history of the web page, you'll see this is what my first version did, and then I stalled out for months as I realized it wouldn't work. Eventually I decided I just had to go ahead with the list-based approach that's currently in the document.

The compelling problem for me is that using repeated properties, as far as I know, it is not possible to reliably transform a RIF document using an incomplete reasoner. I've called this "Requirement 4" in RIF-in-RDF [4].

Let me back up and explain what I'm trying to do and why I think it's important.

In my talks and writing about RIF to Semantic Web audiences, I explain that where I think RIF is essential is in data transformation. With RIF, we can allow interoperation between vocabularies.

My standard example is that FOAF has a foaf:name property, and it also has foaf:firstName and foaf:lastName. When you're producing FOAF data, which should you use? When you're consuming FOAF data, which should you look for? In both cases, if you want interoperability, you have to do both. When there are only two options, and everyone knows about them, that's okay. But what happens when the third, fourth, and fifth "standard" properties for representing names come along? It's a nightmare; the fact that the producer and consumer are both using RDF ends up not buying you very much at all.

But RIF can solve this problem. By having the ontology documents for each of the terms include some RIF (via rif:importWithProfile), the folks deploying new properties can express how they map data to alternative properties. (In this case, with some string operations.)
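
To make "some string operations" concrete, here is a minimal sketch of the kind of mapping such rules would express. It is written in Python rather than RIF, with plain dicts standing in for RDF data; the function name add_name_variants is hypothetical, and the splitting heuristic is an assumption that is wrong for many real names.

```python
def add_name_variants(person: dict) -> dict:
    """Return a copy of `person` with the FOAF name properties that can be derived."""
    out = dict(person)
    has_parts = all(k in out for k in ("foaf:firstName", "foaf:lastName"))
    if "foaf:name" not in out and has_parts:
        # firstName + lastName -> name (string concatenation)
        out["foaf:name"] = f'{out["foaf:firstName"]} {out["foaf:lastName"]}'
    elif "foaf:name" in out and " " in out["foaf:name"]:
        # name -> firstName + lastName.
        # Assumption: the last space separates given name from family name.
        first, last = out["foaf:name"].rsplit(" ", 1)
        out.setdefault("foaf:firstName", first)
        out.setdefault("foaf:lastName", last)
    return out


producer_data = {"foaf:firstName": "Sandro", "foaf:lastName": "Hawke"}
consumer_view = add_name_variants(producer_data)
```

A consumer that only looks for foaf:name can read consumer_view without knowing which properties the producer chose.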

Now, data-consuming systems which implement RIF can automatically get the data in exactly the vocabulary they want. I think this is a very compelling use case. In fact, without this mechanism (or an equivalent one) I don't see how the Semantic Web can work at all.

More recently, I've started using another example (which I mentioned on a recent telecon), where Facebook's Open Graph Protocol uses RDF with a different style of modeling than most of the Semantic Web; here, again, RIF can provide interoperability via translation rules.

Now, imagine we have this all in place. Lots of RDF data out there, using various vocabularies. When you dereference the terms you find some RIF that lets you translate between them, so it's all roughly interoperable. Of course, not every vocabulary can be mapped; some aren't well understood enough to formalize, etc. But many can be translated. This allows new vocabularies to be deployed, and the overall system to grow and evolve in place.

Now, remember the RIF extensibility requirement? In the current design, we met it by providing may-ignore and must-understand extensions via annotations and new XML elements. This works, but only in very broad strokes. We have no "graceful" fallback. Extensions can't offer syntactic sugar, and they certainly can't offer features which can be approximated. This mechanism may not be good enough to allow extensions to really be deployed on the open Web. We talked about all this years ago, but decided we didn't have time to work out all the details, and that it could wait.

So, as you may have guessed by now, I want to provide RIF extensibility the same way I want to provide FOAF name extensibility: with RIF translation (fallback) rules. I'll walk through this, below, but here's the punchline: I think it works fine with the list-style of RIF-in-RDF, but I don't think it can be done with the repeated-properties style. This is why I need the lists.

I have a few ideas of transformations I want right now...

- automatically add universal quantification to free variables
- extend frames to allow for context/named-graphs (cf Decker's TRIPLE)
- convert some kinds of rules between PRD and BLD (trading off between new() and logic functions)
- convert logic functions to builtin list operations (I think this can be done; not sure), getting more of BLD into Core
- standard rewritings: get rid of conjunction in rule heads, disjunction in rule bodies, Skolemize
- re-write out named-argument-uniterms

... but they're all too complex to use as first illustrations. For that I'll use something that's ridiculous, but pleasantly simple:

- Allow people to use the term my:Conjunction instead of rif:And. Also, use my:conjunct instead of rif:formula inside it.

Before actually writing the transformation rule, we have to decide what the transformations are going to look like in RIF. Some options:

1. in place, new and old, overlapping; the new data (the output) is distinguished by using different properties and/or classes.
2. copy the whole document, with changes
3. ... maybe some other approaches?

Let's try (1) first, since it's more terse. Our input looks like this:

    ...
    <if> <!-- or something else that can have an And in it -->
      <my:Conjunction>
        <my:conjunct>$1</my:conjunct>
        <my:conjunct>$2</my:conjunct>
        ...
      </my:Conjunction>
    </if>
    ...

and we'll just "replace" the element names. However, since we don't have a way to "replace" things in this "overlapping" style, we'll just add a second <if> property, and the serializer or consumer will discard the original one, since it contains an element not allowed by the dialect syntax.

So, the rule will add new triples, but leave the old ones intact. The rule will leave us with this:

    ...
    <if> <!-- or something else that can have an And in it -->
      <my:Conjunction>
        <my:conjunct>$1</my:conjunct>
        <my:conjunct>$2</my:conjunct>
        ...
      </my:Conjunction>
    </if>
    <if> <!-- the same property, whatever it was -->
      <And>
        <formula>$1</formula>
        <formula>$2</formula>
        ...
      </And>
    </if>
    ...

Here's the rule:

    forall ?parent ?prop ?old ?conjunct ?new
    if And(
         ?parent[?prop->?old]
         my:Conjunction#?old[my:conjunct->?conjunct]
         ?new = wrapped(?old) <!-- use a logic function to create a new node -->
       )
    then And (
         ?parent[?prop->?new]
         rif:And#?new[rif:formula->?conjunct]
       )

This works fine, as long as the reasoning is complete. However, if the reasoning is ever incomplete, we end up with undetectably incorrect results. Each rif:formula triple comes from a separate firing of the rule, so a reasoner that stops early may have copied only some of the conjuncts. Rules that were "if and(a b c) then d" might get turned into "if and(a b) then d"!

I don't think it's sensible to expect reasoners to be complete. It's great to have termination conditions arise from the rules; it's not good to require the reasoner to run until it knows all possible inferences have been made. With the above approach, there's no termination condition other than "make all the inferences possible".

Alternatively, if we use the list encoding, the rule is very similar:

    forall ?parent ?prop ?old ?conjuncts ?new
    if And(
         ?parent[?prop->?old]
         my:Conjunction#?old[my:conjuncts->?conjuncts]
         ?new = wrapped(?old)
       )
    then And (
         ?parent[?prop->?new]
         rif:And#?new[rif:formulas->?conjuncts]
       )

... but now we can set a termination condition: if a RIF document in the desired dialect *can* be extracted, then you're done.

A few notes:

* I've included the types (like rif:And) for now. Whether to do that is a separate issue (specifically ISSUE-101).

* It's okay to have the rules produce multiple valid RIF documents; you can stop after generating one, but you can also continue. If there's some kind of weighting on the rules (cf XTAN's "impact" mechanism) you can search for a solution that's better than some others. It may be possible to efficiently direct this search towards the best solution; I'm not sure.

* I don't think the copy-the-whole-document approach to translation helps at all. There, instead of attaching the new node to the same parent, we attach it to a new parent, and we end up with a whole new tree.
  But still, branches of the tree are generated by separate rule applications, so an incomplete reasoner may produce incomplete (wrong) output trees.

I think that's it. I trust y'all will point out any confusing or incorrect elements of this argument.

    -- Sandro

[1] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0015
[2] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0017
[3] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0018
[4] http://www.w3.org/2005/rules/wiki/RIF_In_RDF#Requirements
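
As a rough illustration of the incomplete-reasoner argument in the message above, here is a small Python sketch. It is not a RIF engine. Triples are plain tuples, the helper names fire_rule and conjuncts are hypothetical, the ?parent[?prop->?new] triple is left out for brevity, and "incomplete" is modeled simply as a cap on the number of rule firings.

```python
def fire_rule(triples, old_prop, new_prop, max_firings):
    """Copy each (s, old_prop, o) to (wrapped(s), new_prop, o).

    Stops after max_firings firings to mimic a reasoner that gives up early.
    """
    derived = []
    for s, p, o in triples:
        if p != old_prop:
            continue
        if len(derived) >= max_firings:
            break  # incomplete reasoning: remaining matches are never derived
        derived.append((f"wrapped({s})", new_prop, o))
    return derived


def conjuncts(derived):
    """Read back the conjuncts of the translated And, for either encoding."""
    found = []
    for _, p, o in derived:
        if p == "rif:formula":       # repeated-property encoding
            found.append(o)
        elif p == "rif:formulas":    # list encoding: one triple holds them all
            found.extend(o)
    return found


# "if and(a b c) then d", encoded both ways
repeated = [("c1", "my:conjunct", x) for x in ("a", "b", "c")]
as_list = [("c1", "my:conjuncts", ("a", "b", "c"))]

partial = conjuncts(fire_rule(repeated, "my:conjunct", "rif:formula", max_firings=2))
whole = conjuncts(fire_rule(as_list, "my:conjuncts", "rif:formulas", max_firings=2))
nothing = conjuncts(fire_rule(as_list, "my:conjuncts", "rif:formulas", max_firings=0))
```

With repeated properties, the early stop yields a well-formed but wrong And of two conjuncts. With the list encoding, the translated And either has all three conjuncts or does not exist yet, which a consumer can detect.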
Received on Sunday, 25 July 2010 20:50:03 UTC