Re: RIF-in-RDF: Requirement 4 [Switching to KR Argument]

On Sun, 2010-07-25 at 19:57 -0400, Michael Kifer wrote:
> On Sun, 25 Jul 2010 19:03:52 -0400
> Sandro Hawke <sandro@w3.org> wrote:
> 
> > On Sun, 2010-07-25 at 18:52 -0400, Michael Kifer wrote:
> > > pls see within.
> > >
> > > On Sun, 25 Jul 2010 18:46:02 -0400
> > > Sandro Hawke <sandro@w3.org> wrote:
> > >
> > > > On Sun, 2010-07-25 at 17:57 -0400, Michael Kifer wrote:
> > > > > Sandro,
> > > > > I don't understand your argument. So, you are proposing that somebody would
> > > > > explicitly write
> > > > >
> > > > > foo[my:conjuncts->list(bar1 bar2 bar3)]
> > > > >
> > > > > If so, why can't the same one write this:
> > > > >
> > > > > foo[my:conjuncts->bar1]
> > > > > foo[my:conjuncts->bar2]
> > > > > foo[my:conjuncts->bar3]
> > > > >
> > > > > ?
> > > >
> > > > The key point is that after the translation rules are done, we need to
> > > > have inferred something like:
> > > >
> > > >    foo[rif:formulas->list(bar1 bar2 bar3)]
> > > >
> > > > instead of
> > > >
> > > >    foo[rif:formula->bar1]
> > > >    foo[rif:formula->bar2]
> > > >    foo[rif:formula->bar3]
> > > >
> > > > And we need this, because otherwise we don't know when to stop looking
> > > > for more solutions.   With the first form, we can stop as soon as we get
> > > > the first match (in the context of a large matching process, trying to
> > > > find a match for rif:Document).
> > > >
> > > > In the second form, however, it's not clear when to stop.  We will only
> > > > know we have all the appropriate values for rif:formula when a complete
> > > > reasoner has run until termination.  But I think many RIF reasoners will
> > > > not be complete, and I think many sets of fallback rules will produce
> > > > lots of unwanted solutions, perhaps never terminating.
> > > >
> > > > If we can stop after finding the first solution, or the first few good
> > > > ones (as in the list style), we're okay -- if we have to find them all
> > > > before knowing what a correct translation looks like, I think that will
> > > > be a problem.
> > >
> > > If you can infer foo[rif:formulas->list(bar1 bar2 bar3)] at all then you can
> > > infer
> > >
> > >    foo[rif:formula->bar1]
> > >    foo[rif:formula->bar2]
> > >    foo[rif:formula->bar3]
> > >
> > > AND terminate. You are chasing a non-issue, I think.
> > 
> > In theory you could infer the second form, under certain conditions.  I
> > don't think those conditions will usually hold, in practice.
> > 
> > Please see my message (posted just after you posted this) to Dave.  I
> > think it mostly addresses this.
> 
> Yes, but what you wrote to Dave does not make sense.
> As he observed correctly, you are lumping together the issues of the direction
> of chaining and of termination. Termination or non-termination is not inherent
> to either method.
> 
> > Specifically, the list form is usable even without reasoning
> > terminating, 
> 
> and so is the above form that does not use lists. The two forms are isomorphic
> -- also reasoning-wise.
> 
> To infer ->list(bar1 bar2 bar3), somebody must already know IN ADVANCE that
> bar1,bar2, and bar3 are the three answers, and he must then write that down
> explicitly. In that case, that same one would be able to write down the above 3
> facts. There is nothing more to it.
> 
> > and I think in practice reasoning will often not terminate
> > quickly.
> 
> Seriously, I don't think this is a serious statement. Seriously :)

I think we're misunderstanding each other.   It's possible I'm just
wrong (as you seem to think), but I don't think so.   I can try to
refine my language, but it's not clear to me how to do that right now.

So let me try a completely different line of argument for a minute.
Please try to read between the lines of my poor command of the language
of philosophy and formal logic:

  1. RDF is a KR.  An RDF graph is a logical statement; it makes claims
about some world.

  2. If someone makes a claim by stating RDF Graph G, they are implying
all the claims made by all the graphs entailed by G.

  3. While there may be some debate about which logics (and therefore
which entailments) are appropriate/standard, RDF Simple Entailment
certainly is.  So whenever you say something as an RDF Graph G, you are
implying all the claims of all the subgraphs of G.

  4. Assume we have the RIF rule, a&b&c=>d, and it's encoded in graph G
as node R. This means the agent stating G is claiming that R says
a&b&c=>d.

  5. If we use the repeated-properties mapping for that step-4 encoding,
then there will be subgraphs of G which describe R as saying a&b=>d,
a&c=>d, a=>d, etc.  

  6. If the agent stating G is also asserting R, then (by #5) the agent
is also implying a&b=>d, a&c=>d, c=>d, etc.    If c is true, the agent
will have implied d.   Consumers trusting that agent may justifiably
infer d.  If it turns out a or b was false, this conclusion is wrong.  
So, this is bad.

  7. With the list-style mapping, the only subgraphs of G in which R
encodes a RIF Document are those in which that encoded document has the
same RIF meaning.  This is because the list-style mapping is essentially
fragile; all the non-trivial subgraphs simply don't describe a RIF
document.   So we avoid #6 badness.
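
For anyone who wants steps 5-7 in executable form, here is a minimal
Python sketch.  It is an illustration under stated assumptions, not the
RIF-in-RDF mapping itself: a graph is a plain set of triples, the node
names (R, A, l1..l3) and property names (if, then, formula, formulas,
first, rest) are simplified stand-ins, blank nodes are ignored, and both
decoders are hypothetical.  It enumerates every subgraph of each
encoding of a&b&c=>d and collects the rules a consumer could read off.

```python
from itertools import combinations

def subgraphs(graph):
    """Yield every subset of the triples (each is simply entailed by graph)."""
    triples = sorted(graph)
    for k in range(len(triples) + 1):
        for subset in combinations(triples, k):
            yield set(subset)

def objects(graph, s, p):
    """All objects o such that (s, p, o) is in the graph."""
    return [o for (s2, p2, o) in graph if s2 == s and p2 == p]

def decode_repeated(graph, rule="R"):
    """Hypothetical decoder for the repeated-properties style."""
    body, head = objects(graph, rule, "if"), objects(graph, rule, "then")
    if len(body) != 1 or len(head) != 1:
        return None  # no rule can be read off this subgraph
    # Whatever formula triples happen to be present become the body.
    return (frozenset(objects(graph, body[0], "formula")), head[0])

def decode_list(graph, rule="R"):
    """Hypothetical strict decoder for the list style."""
    body, head = objects(graph, rule, "if"), objects(graph, rule, "then")
    if len(body) != 1 or len(head) != 1:
        return None
    cells = objects(graph, body[0], "formulas")
    if len(cells) != 1:
        return None
    node, items = cells[0], []
    while node != "nil":
        first, rest = objects(graph, node, "first"), objects(graph, node, "rest")
        if len(first) != 1 or len(rest) != 1:
            return None  # broken list: not a well-formed document
        items.append(first[0])
        node = rest[0]
    return (frozenset(items), head[0])

def readings(graph, decode):
    """Distinct rules readable from any subgraph of the given graph."""
    return {r for sg in subgraphs(graph) if (r := decode(sg)) is not None}

# a&b&c=>d, encoded both ways (assumed, simplified vocabulary).
REPEATED = {("R", "if", "A"), ("R", "then", "d"),
            ("A", "formula", "a"), ("A", "formula", "b"), ("A", "formula", "c")}
LIST = {("R", "if", "A"), ("R", "then", "d"), ("A", "formulas", "l1"),
        ("l1", "first", "a"), ("l1", "rest", "l2"),
        ("l2", "first", "b"), ("l2", "rest", "l3"),
        ("l3", "first", "c"), ("l3", "rest", "nil")}

print(len(readings(REPEATED, decode_repeated)), "readings, repeated style")
print(len(readings(LIST, decode_list)), "readings, list style")
```

Under these assumptions the repeated-properties graph yields eight
distinct readings of R (every subset of {a, b, c} as the body, so c=>d
is among them), while the list graph yields only a&b&c=>d, because
dropping any list triple makes the strict decoder give up.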

I'm sorry I don't know how to say this in the proper mathematical
language.  I hope it still makes sense.  Let me know which steps I need
to expand on/clarify.

     -- Sandro



> 
> > > >
> > > > (Obviously, I'm thinking in terms of a backward-chaining BLD system
> > > > here, trying to extract a rif:Document.   I don't understand termination
> > > > conditions in PRD well enough to know how to handle this, there, or if
> > > > it's even possible.)
> > > >
> > > >     -- Sandro
> > > >
> > > > > michael
> > > > >
> > > > >
> > > > >
> > > > > On Sun, 25 Jul 2010 16:49:52 -0400
> > > > > Sandro Hawke <sandro@w3.org> wrote:
> > > > >
> > > > > > Dave [1], Harold [2], and Michael [3] have all expressed a desire to
> > > > > > have the RIF-in-RDF mapping more closely follow the XML syntax.  In
> > > > > > particular, they suggest it use repeated properties instead of
> > > > > > gathering all the values of the properties into a list.
> > > > > >
> > > > > > I'm extremely sympathetic to this desire.  If you look back at the
> > > > > > history of the web page, you'll see this is what my first version did,
> > > > > > and then I stalled out for months as I realized it wouldn't work.
> > > > > > Eventually I decided I just had to go ahead with the list-based
> > > > > > approach that's currently in the document.
> > > > > >
> > > > > > The compelling problem for me is that using repeated properties, as
> > > > > > far as I know, it is not possible to reliably transform a RIF document
> > > > > > using an incomplete reasoner.  I've called this "Requirement 4" in
> > > > > > RIF-in-RDF [4].
> > > > > >
> > > > > > Let me back up and explain what I'm trying to do and why I think it's
> > > > > > important.
> > > > > >
> > > > > > In my talks and writing about RIF to Semantic Web audiences, I explain
> > > > > > that where I think RIF is essential is in data transformation.  With
> > > > > > RIF, we can allow interoperation between vocabularies.  My standard
> > > > > > example is that FOAF has a foaf:name property, and it also has
> > > > > > foaf:firstName and foaf:lastName.  When you're producing FOAF data,
> > > > > > which should you use?  When you're consuming FOAF data, which should
> > > > > > you look for?  In both cases, if you want interoperability, you have
> > > > > > to do both.  When there are only two options, and everyone knows about
> > > > > > them, that's okay.  But what happens when the third, fourth, and fifth
> > > > > > > "standard" properties for representing names come along?  It's a
> > > > > > nightmare; the fact that the producer and consumer are both using RDF
> > > > > > ends up not buying you very much at all.
> > > > > >
> > > > > > But RIF can solve this problem.  By having the ontology documents for
> > > > > > > each of the terms include some RIF (via rif:importWithProfile), the folks
> > > > > > deploying new properties can express how they map data to alternative
> > > > > > properties.  (In this case, with some string operations.)  Now,
> > > > > > data-consuming systems which implement RIF can automatically get the
> > > > > > data in exactly the vocabulary they want.
> > > > > >
> > > > > > I think this is a very compelling use case.  In fact, without this
> > > > > > mechanism (or an equivalent one) I don't see how the Semantic Web can
> > > > > > work at all.  More recently, I've started using another example (which
> > > > > > > I mentioned on a recent telecon), where Facebook's Open Graph Protocol
> > > > > > uses RDF with a different style of modeling than most of the Semantic
> > > > > > Web; here, again, RIF can provide interoperability via translation
> > > > > > rules.
> > > > > >
> > > > > > Now, imagine we have this all in place.  Lots of RDF data out there,
> > > > > > using various vocabularies.  When you dereference the terms you find
> > > > > > some RIF that lets you translate between them, so it's all roughly
> > > > > > interoperable.  Of course, not every vocabulary can be mapped; some
> > > > > > aren't well understood enough to formalize, etc.  But many can be
> > > > > > translated.  This allows new vocabularies to be deployed, and the
> > > > > > overall system to grow and evolve in place.
> > > > > >
> > > > > > Now, remember the RIF extensibility requirement?  In the current
> > > > > > design, we met it by providing may-ignore and must-understand
> > > > > > extensions via annotations and new xml elements.  This works, but only
> > > > > > in very broad strokes.  We have no "graceful" fallback.  Extensions
> > > > > > can't offer syntactic sugar, and they certainly can't offer features
> > > > > > which can be approximated.  This mechanism may not be good enough to
> > > > > > allow extensions to really be deployed on the open Web.  We talked
> > > > > > about all this years ago, but decided we didn't have time to work out
> > > > > > all the details, and that it could wait.
> > > > > >
> > > > > > So, as you may have guessed by now, I want to provide RIF
> > > > > > extensibility the same way I want to provide FOAF name extensibility:
> > > > > > with RIF translation (fallback) rules.
> > > > > >
> > > > > > I'll walk through this, below, but here's the punchline: I think it
> > > > > > works fine with the list-style of RIF-in-RDF, but I don't think it can
> > > > > > be done with the repeated-properties style.  This is why I need the
> > > > > > lists.
> > > > > >
> > > > > > I have a few ideas of transformations I want right now...
> > > > > >
> > > > > >   - automatically add universal quantification to free variables
> > > > > >   - extend frames to allow for context/named-graphs (cf Decker's TRIPLE)
> > > > > >   - convert some kinds of rules between PRD and BLD (trading off
> > > > > >     between new() and logic functions)
> > > > > >   - convert logic functions to builtin list operations (I think this
> > > > > >     can be done; not sure) getting more of BLD into Core
> > > > > >   - standard rewritings: get rid of conjunction in rule heads, disjunction
> > > > > >     in rule bodies, Skolemize
> > > > > >   - re-write out named-argument-uniterms
> > > > > >
> > > > > > ... but they're all too complex to use as first illustrations.  For
> > > > > > > that I'll use something that's ridiculous, but pleasantly simple:
> > > > > >
> > > > > >   - Allow people to use the term my:Conjunction instead of rif:And.   Also,
> > > > > >     use my:conjunct instead of rif:formula inside it.
> > > > > >
> > > > > > Before actually writing the transformation rule, we have to decide
> > > > > > what the transformations are going to look like in RIF.   Some options:
> > > > > >
> > > > > >    1.  in place, new and old, overlapping; the new data (the output)
> > > > > >        is distinguished by using different properties and/or classes.
> > > > > >    2.  copy the whole document, with changes
> > > > > >    3.  ...   maybe some other approaches?
> > > > > >
> > > > > > Let's try (1) first, since it's more terse.  Our input looks like
> > > > > > this:
> > > > > >
> > > > > >       ...
> > > > > >       <if>       <!-- or something else that can have an And in it -->
> > > > > >          <my:Conjunction>
> > > > > >              <my:conjunct>$1</my:conjunct>
> > > > > >              <my:conjunct>$2</my:conjunct>
> > > > > >              ...
> > > > > >          </my:Conjunction>
> > > > > >       </if>
> > > > > >       ...
> > > > > >
> > > > > > and we'll just "replace" the element names.
> > > > > >
> > > > > > However, since we don't have a way to "replace" things in this
> > > > > > "overlapping" style, we'll just add a second <if> property, and the
> > > > > > > serializer or consumer will discard the original one, since it contains an
> > > > > > element not allowed by the dialect syntax.
> > > > > >
> > > > > > So, the rule will add new triples, but leave the old ones intact.
> > > > > > The rule will leave us with this:
> > > > > >
> > > > > >
> > > > > >       ...
> > > > > >       <if>       <!-- or something else that can have an And in it -->
> > > > > >          <my:Conjunction>
> > > > > >              <my:conjunct>$1</my:conjunct>
> > > > > >              <my:conjunct>$2</my:conjunct>
> > > > > >              ...
> > > > > >          </my:Conjunction>
> > > > > >       </if>
> > > > > >       <if>      <!-- the same property, whatever it was -->
> > > > > >          <And>
> > > > > >              <formula>$1</formula>
> > > > > >              <formula>$2</formula>
> > > > > >              ...
> > > > > >          </And>
> > > > > >       </if>
> > > > > >       ...
> > > > > >
> > > > > > Here's the rule:
> > > > > >
> > > > > >  forall ?parent ?prop ?old ?conjunct ?new
> > > > > >  if And(
> > > > > >    ?parent[?prop->?old]
> > > > > >    my:Conjunction#?old[my:conjunct->?conjunct]
> > > > > >    ?new = wrapped(?old)  <!-- use a logic function to create a new node -->
> > > > > >  ) then And (
> > > > > >    ?parent[?prop->?new]
> > > > > >    rif:And#?new[rif:formula->?conjunct]
> > > > > >  )
> > > > > >
> > > > > > This works fine, as long as the reasoning is complete.  However, if
> > > > > > the reasoning is ever incomplete, we end up with undetectably
> > > > > > incorrect results.  Rules that were "if and(a b c) then d" might get
> > > > > > turned into "if and(a b) then d"!
> > > > > >
> > > > > > I don't think it's sensible to expect reasoners to be complete.  It's
> > > > > > great to have termination conditions arise from the rules; it's not
> > > > > > good to require the reasoner to run until it knows all possible
> > > > > > inferences have been made.  With the above approach, there's no
> > > > > > termination condition other than "make all the inferences possible".
> > > > > >
> > > > > > Alternatively, if we use the list encoding, the rule is very similar:
> > > > > >
> > > > > >  forall ?parent ?prop ?old ?conjuncts ?new
> > > > > >  if And(
> > > > > >    ?parent[?prop->?old]
> > > > > >    my:Conjunction#?old[my:conjuncts->?conjuncts]
> > > > > >    ?new = wrapped(?old)
> > > > > >  ) then And (
> > > > > >    ?parent[?prop->?new]
> > > > > >    rif:And#?new[rif:formulas->?conjuncts]
> > > > > >  )
> > > > > >
> > > > > > ... but now we can set a termination condition: if a RIF document in
> > > > > > the desired dialect *can* be extracted, then you're done.
> > > > > >
> > > > > > A few notes:
> > > > > >
> > > > > >     * I've included the types (like rif:And) for now.  Whether to do
> > > > > >       that is a separate issue (specifically ISSUE-101).
> > > > > >
> > > > > >     * It's okay to have the rules produce multiple valid RIF
> > > > > >       documents; you can stop after generating one, but you can also
> > > > > >       continue.  If there's some kind of weighting on the rules (cf
> > > > > >       XTAN's "impact" mechanism) you can search for a solution that's
> > > > > >       better than some others.  It may be possible to efficiently
> > > > > >       direct this search towards the best solution; I'm not sure.
> > > > > >
> > > > > >     * I don't think the copy-the-whole-document approach to
> > > > > >       translation helps at all.  There, instead of attaching the new
> > > > > >       node to the same parent, we attach it to a new parent, and we
> > > > > >       end up with a whole new tree.  But still, branches of the tree
> > > > > > >       are generated by separate rule applications, so an incomplete
> > > > > >       reasoner may produce incomplete (wrong) output trees.
> > > > > >
> > > > > > I think that's it.  I trust y'all will point out any confusing or
> > > > > > incorrect elements of this argument.
> > > > > >
> > > > > >       -- Sandro
> > > > > >
> > > > > > [1] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0015
> > > > > > [2] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0017
> > > > > > [3] http://lists.w3.org/Archives/Public/public-rif-wg/2010Jul/0018
> > > > > > [4] http://www.w3.org/2005/rules/wiki/RIF_In_RDF#Requirements
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > >
> > 
> > 
> 

Received on Monday, 26 July 2010 02:11:03 UTC