- From: Sandro Hawke <sandro@w3.org>
- Date: Sun, 25 Jul 2010 19:03:52 -0400
- To: kifer@cs.stonybrook.edu
- Cc: public-rif-wg <public-rif-wg@w3.org>
On Sun, 2010-07-25 at 18:52 -0400, Michael Kifer wrote:
> pls see within.
>
> On Sun, 25 Jul 2010 18:46:02 -0400
> Sandro Hawke <sandro@w3.org> wrote:
>
> > On Sun, 2010-07-25 at 17:57 -0400, Michael Kifer wrote:
> > > Sandro,
> > > I don't understand your argument. So, you are proposing that somebody would
> > > explicitly write
> > >
> > >     foo[my:conjuncts->list(bar1 bar2 bar3)]
> > >
> > > If so, why can't the same one write this:
> > >
> > >     foo[my:conjuncts->bar1]
> > >     foo[my:conjuncts->bar2]
> > >     foo[my:conjuncts->bar3]
> > >
> > > ?
> >
> > The key point is that after the translation rules are done, we need to
> > have inferred something like:
> >
> >     foo[rif:formulas->list(bar1 bar2 bar3)]
> >
> > instead of
> >
> >     foo[rif:formula->bar1]
> >     foo[rif:formula->bar2]
> >     foo[rif:formula->bar3]
> >
> > And we need this, because otherwise we don't know when to stop looking
> > for more solutions. With the first form, we can stop as soon as we get
> > the first match (in the context of a large matching process, trying to
> > find a match for rif:Document).
> >
> > In the second form, however, it's not clear when to stop. We will only
> > know we have all the appropriate values for rif:formula when a complete
> > reasoner has run until termination. But I think many RIF reasoners will
> > not be complete, and I think many sets of fallback rules will produce
> > lots of unwanted solutions, perhaps never terminating.
> >
> > If we can stop after finding the first solution, or the first few good
> > ones (as in the list style), we're okay -- if we have to find them all
> > before knowing what a correct translation looks like, I think that will
> > be a problem.
>
> If you can infer foo[rif:formulas->list(bar1 bar2 bar3)] at all then you can
> infer
>
>     foo[rif:formula->bar1]
>     foo[rif:formula->bar2]
>     foo[rif:formula->bar3]
>
> AND terminate. You are chasing a non-issue, I think.

In theory you could infer the second form, under certain conditions.
I don't think those conditions will usually hold, in practice. Please
see my message (posted just after you posted this) to Dave. I think it
mostly addresses this. Specifically, the list form is usable even
without reasoning terminating, and I think in practice reasoning will
often not terminate quickly.

    -- Sandro

> michael
>
> >
> > (Obviously, I'm thinking in terms of a backward-chaining BLD system
> > here, trying to extract a rif:Document. I don't understand termination
> > conditions in PRD well enough to know how to handle this, there, or if
> > it's even possible.)
> >
> >     -- Sandro
> >
> > > michael
> > >
> > >
> > > On Sun, 25 Jul 2010 16:49:52 -0400
> > > Sandro Hawke <sandro@w3.org> wrote:
> > >
> > > > Dave [1], Harold [2], and Michael [3] have all expressed a desire to
> > > > have the RIF-in-RDF mapping more closely follow the XML syntax. In
> > > > particular, they suggest it use repeated properties instead of
> > > > gathering all the values of the properties into a list.
> > > >
> > > > I'm extremely sympathetic to this desire. If you look back at the
> > > > history of the web page, you'll see this is what my first version did,
> > > > and then I stalled out for months as I realized it wouldn't work.
> > > > Eventually I decided I just had to go ahead with the list-based
> > > > approach that's currently in the document.
> > > >
> > > > The compelling problem for me is that using repeated properties, as
> > > > far as I know, it is not possible to reliably transform a RIF document
> > > > using an incomplete reasoner. I've called this "Requirement 4" in
> > > > RIF-in-RDF [4].
> > > >
> > > > Let me back up and explain what I'm trying to do and why I think it's
> > > > important.
> > > >
> > > > In my talks and writing about RIF to Semantic Web audiences, I explain
> > > > that where I think RIF is essential is in data transformation. With
> > > > RIF, we can allow interoperation between vocabularies.
> > > > My standard example is that FOAF has a foaf:name property, and it also has
> > > > foaf:firstName and foaf:lastName. When you're producing FOAF data,
> > > > which should you use? When you're consuming FOAF data, which should
> > > > you look for? In both cases, if you want interoperability, you have
> > > > to do both. When there are only two options, and everyone knows about
> > > > them, that's okay. But what happens when the third, fourth, and fifth
> > > > "standard" properties for representing names come along? It's a
> > > > nightmare; the fact that the producer and consumer are both using RDF
> > > > ends up not buying you very much at all.
> > > >
> > > > But RIF can solve this problem. By having the ontology documents for
> > > > each of the terms include some RIF (via rif:importWithProfile), the folks
> > > > deploying new properties can express how they map data to alternative
> > > > properties. (In this case, with some string operations.) Now,
> > > > data-consuming systems which implement RIF can automatically get the
> > > > data in exactly the vocabulary they want.
> > > >
> > > > I think this is a very compelling use case. In fact, without this
> > > > mechanism (or an equivalent one) I don't see how the Semantic Web can
> > > > work at all. More recently, I've started using another example (which
> > > > I mentioned on a recent telecon), where Facebook's Open Graph Protocol
> > > > uses RDF with a different style of modeling than most of the Semantic
> > > > Web; here, again, RIF can provide interoperability via translation
> > > > rules.
> > > >
> > > > Now, imagine we have this all in place. Lots of RDF data out there,
> > > > using various vocabularies. When you dereference the terms you find
> > > > some RIF that lets you translate between them, so it's all roughly
> > > > interoperable. Of course, not every vocabulary can be mapped; some
> > > > aren't well understood enough to formalize, etc. But many can be
> > > > translated.
> > > > This allows new vocabularies to be deployed, and the
> > > > overall system to grow and evolve in place.
> > > >
> > > > Now, remember the RIF extensibility requirement? In the current
> > > > design, we met it by providing may-ignore and must-understand
> > > > extensions via annotations and new xml elements. This works, but only
> > > > in very broad strokes. We have no "graceful" fallback. Extensions
> > > > can't offer syntactic sugar, and they certainly can't offer features
> > > > which can be approximated. This mechanism may not be good enough to
> > > > allow extensions to really be deployed on the open Web. We talked
> > > > about all this years ago, but decided we didn't have time to work out
> > > > all the details, and that it could wait.
> > > >
> > > > So, as you may have guessed by now, I want to provide RIF
> > > > extensibility the same way I want to provide FOAF name extensibility:
> > > > with RIF translation (fallback) rules.
> > > >
> > > > I'll walk through this, below, but here's the punchline: I think it
> > > > works fine with the list-style of RIF-in-RDF, but I don't think it can
> > > > be done with the repeated-properties style. This is why I need the
> > > > lists.
> > > >
> > > > I have a few ideas of transformations I want right now...
> > > >
> > > >   - automatically add universal quantification to free variables
> > > >   - extend frames to allow for context/named-graphs (cf Decker's TRIPLE)
> > > >   - convert some kinds of rules between PRD and BLD (trading off
> > > >     between new() and logic functions)
> > > >   - convert logic functions to builtin list operations (I think this
> > > >     can be done; not sure) getting more of BLD into Core
> > > >   - standard rewritings: get rid of conjunction in rule heads, disjunction
> > > >     in rule bodies, Skolemize
> > > >   - re-write out named-argument-uniterms
> > > >
> > > > ... but they're all too complex to use as first illustrations.
> > > > For that I'll use something that's ridiculous, but pleasantly simple:
> > > >
> > > >   - Allow people to use the term my:Conjunction instead of rif:And. Also,
> > > >     use my:conjunct instead of rif:formula inside it.
> > > >
> > > > Before actually writing the transformation rule, we have to decide
> > > > what the transformations are going to look like in RIF. Some options:
> > > >
> > > >   1. in place, new and old, overlapping; the new data (the output)
> > > >      is distinguished by using different properties and/or classes.
> > > >   2. copy the whole document, with changes
> > > >   3. ... maybe some other approaches?
> > > >
> > > > Let's try (1) first, since it's more terse. Our input looks like
> > > > this:
> > > >
> > > >   ...
> > > >   <if> <!-- or something else that can have an And in it -->
> > > >     <my:Conjunction>
> > > >       <my:conjunct>$1</my:conjunct>
> > > >       <my:conjunct>$2</my:conjunct>
> > > >       ...
> > > >     </my:Conjunction>
> > > >   </if>
> > > >   ...
> > > >
> > > > and we'll just "replace" the element names.
> > > >
> > > > However, since we don't have a way to "replace" things in this
> > > > "overlapping" style, we'll just add a second <if> property, and the
> > > > serializer or consumer will discard this one, since it contains an
> > > > element not allowed by the dialect syntax.
> > > >
> > > > So, the rule will add new triples, but leave the old ones intact.
> > > > The rule will leave us with this:
> > > >
> > > >   ...
> > > >   <if> <!-- or something else that can have an And in it -->
> > > >     <my:Conjunction>
> > > >       <my:conjunct>$1</my:conjunct>
> > > >       <my:conjunct>$2</my:conjunct>
> > > >       ...
> > > >     </my:Conjunction>
> > > >   </if>
> > > >   <if> <!-- the same property, whatever it was -->
> > > >     <And>
> > > >       <formula>$1</formula>
> > > >       <formula>$2</formula>
> > > >       ...
> > > >     </And>
> > > >   </if>
> > > >   ...
> > > >
> > > > Here's the rule:
> > > >
> > > >   forall ?parent ?prop ?old ?conjunct ?new
> > > >   if And(
> > > >     ?parent[?prop->?old]
> > > >     my:Conjunction#?old[my:conjunct->?conjunct]
> > > >     ?new = wrapped(?old) <!-- use a logic function to create a new node -->
> > > >   ) then And (
> > > >     ?parent[?prop->?new]
> > > >     rif:And#?new[rif:formula->?conjunct]
> > > >   )
> > > >
> > > > This works fine, as long as the reasoning is complete. However, if
> > > > the reasoning is ever incomplete, we end up with undetectably
> > > > incorrect results. Rules that were "if and(a b c) then d" might get
> > > > turned into "if and(a b) then d"!
> > > >
> > > > I don't think it's sensible to expect reasoners to be complete. It's
> > > > great to have termination conditions arise from the rules; it's not
> > > > good to require the reasoner to run until it knows all possible
> > > > inferences have been made. With the above approach, there's no
> > > > termination condition other than "make all the inferences possible".
> > > >
> > > > Alternatively, if we use the list encoding, the rule is very similar:
> > > >
> > > >   forall ?parent ?prop ?old ?conjuncts ?new
> > > >   if And(
> > > >     ?parent[?prop->?old]
> > > >     my:Conjunction#?old[my:conjuncts->?conjuncts]
> > > >     ?new = wrapped(?old)
> > > >   ) then And (
> > > >     ?parent[?prop->?new]
> > > >     rif:And#?new[rif:formulas->?conjuncts]
> > > >   )
> > > >
> > > > ... but now we can set a termination condition: if a RIF document in
> > > > the desired dialect *can* be extracted, then you're done.
> > > >
> > > > A few notes:
> > > >
> > > >   * I've included the types (like rif:And) for now. Whether to do
> > > >     that is a separate issue (specifically ISSUE-101).
> > > >
> > > >   * It's okay to have the rules produce multiple valid RIF
> > > >     documents; you can stop after generating one, but you can also
> > > >     continue.
> > > >     If there's some kind of weighting on the rules (cf
> > > >     XTAN's "impact" mechanism) you can search for a solution that's
> > > >     better than some others. It may be possible to efficiently
> > > >     direct this search towards the best solution; I'm not sure.
> > > >
> > > >   * I don't think the copy-the-whole-document approach to
> > > >     translation helps at all. There, instead of attaching the new
> > > >     node to the same parent, we attach it to a new parent, and we
> > > >     end up with a whole new tree. But still, branches of the tree
> > > >     are generated by separate rules applications, so an incomplete
> > > >     reasoner may produce incomplete (wrong) output trees.
> > > >
> > > > I think that's it. I trust y'all will point out any confusing or
> > > > incorrect elements of this argument.
> > > >
> > > >     -- Sandro
> > > >
> > > > [1] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0015
> > > > [2] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0017
> > > > [3] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0018
> > > > [4] http://www.w3.org/2005/rules/wiki/RIF_In_RDF#Requirements
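
The incomplete-reasoner argument in the quoted message can be sketched in
Python. This is only an illustration under stated assumptions: the function
names and the `budget` parameter are hypothetical (nothing here is RIF API),
and a simple step budget stands in for whatever makes a real reasoner stop
before it has drawn every inference.

```python
# Hypothetical sketch: translate_repeated, translate_list and budget are
# illustrative names, not part of RIF or any reasoner.

def translate_repeated(conjuncts, budget):
    """Repeated-property style: one rule firing per my:conjunct triple.

    A reasoner that stops after `budget` firings yields only some of the
    rif:formula values, and the result still looks like a valid rif:And.
    """
    derived = []
    for c in conjuncts:
        if budget == 0:
            break  # reasoner gave up early; nothing marks the output as partial
        derived.append(c)
        budget -= 1
    return derived


def translate_list(conjuncts, budget):
    """List style: a single rule firing copies the whole list value.

    Either the rule fired (complete list) or it did not (no rif:And at all).
    """
    if budget == 0:
        return None  # no translation yet, which is detectable
    return list(conjuncts)


body = ["a", "b", "c"]
print(translate_repeated(body, 2))  # a truncated conjunction
print(translate_list(body, 2))      # the full conjunction
print(translate_list(body, 0))      # None: keep searching
```

With a budget of 2, the repeated-property version silently turns
"and(a b c)" into "and(a b)", while the list version returns either the whole
list or nothing, so a consumer can tell when a usable translation exists.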
Received on Sunday, 25 July 2010 23:04:03 UTC