Re: Review of RIF in RDF

Hi Axel,

On Tue, 2010-09-14 at 16:13 +0100, Axel Polleres wrote: 
> I hope I'll get a closer look at this within the next week or two.

Do we have that much time? :)

> Can you sketch the problem cases for extracting RIF from RDF?

First, note that I'm explicitly talking about *subtraction*, not extraction.

You can extract the rules from a RIF-in-RDF document. However, the
previous definition in section 8 required you to also subtract precisely
those triples which make up the rules and include only the remainder.
That's the bit I'm sceptical of and the algorithm for it is certainly
not currently given in the document. Instead, having the whole graph
(including any embedded rules but not including any rif:usedWithProfile
triples) be the input graph seems reasonable, indeed preferable.

The problems with subtraction are metadata, inference and extensions.

RIF allows completely arbitrary metadata to be embedded in a rule, and
RIF-in-RDF allows processors to convert this metadata to RDF assertions.
This means that a RIF rule set can give rise to triples that are not
obviously part of a RIF "tree".

[It is true that the encoding will also encode the meta elements so you
can determine triples that *could* have been asserted as a result of
those meta elements but that doesn't handle clashes.]

Then you might have triples that are derived from rules, including the
arbitrary metadata, by inference. The RIF-in-RDF document explicitly
talks about round tripping through inference. You would have to define
the range of possible inferred triples you also want to subtract and
there is the issue of clashes.

Finally, the IRIs used in the RIF rules may also be used in the graph
data so there's no such thing as a separate rooted RIF tree. If you
allow for arbitrary RIF extensions then you can't tell the difference
between a statement that represents an extension and a statement that is
part of the separate graph.
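
To make the clash problem concrete, here is a rough Python sketch. It
is only an illustration: triples are plain string tuples, and the
property names and the ex:/dc:/foaf: terms are made up for the example
rather than taken from the actual mapping output.

```python
# Hypothetical, simplified triples: (subject, predicate, object) strings.
RIF = "rif:"

# Triples produced by encoding a rule set (illustrative names only).
rule_encoding = {
    ("_:rule1", "rdf:type", "rif:Implies"),
    ("_:rule1", "rif:if", "_:body"),
    ("_:c1", "rif:constIRI", "ex:alice"),
    # Rule metadata that a processor chose to convert to a plain RDF assertion.
    ("ex:ruleset", "dc:creator", "ex:alice"),
}

# Independent data that happens to state the same fact.
data = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:ruleset", "dc:creator", "ex:alice"),
}

merged = rule_encoding | data


def subtract_by_vocabulary(graph):
    """Naive subtraction: drop triples whose predicate or object is a rif: term."""
    return {t for t in graph if not (t[1].startswith(RIF) or t[2].startswith(RIF))}


# The triple asserted both by the rule metadata and by the data.
clash = rule_encoding & data

# Exact subtraction of the encoding removes a triple the data also asserted.
exact_remainder = merged - rule_encoding

# Vocabulary-based subtraction leaks the metadata triple even with no data at all.
leaked = subtract_by_vocabulary(rule_encoding)
```

Neither strategy can tell, from the merged graph alone, whether the
dc:creator triple belongs to the rules, to the data, or to both.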

Now you might be able to get around all this with enough extra
constraints and by ignoring clashes, but not doing subtraction seems
simpler, easier to understand and more powerful. It allows for rules
which transform themselves (e.g. to record the rules used as provenance
information in the results graph).

I think we need a really compelling use case for requiring subtraction
before it is worth attempting to define it. As I understand it the
motivation for having subtraction was to allow you to combine an RDF
graph with a RIF rule set in a SPARQL setting without having the RIF
rule set encoding "contaminate" the graph. However, it seems to me the
normal usage in that case would be to reference an external ruleset from
the graph with a simple use of rif:usedWithProfile and the modified
proposal supports that (along with the suppression of the simple
rif:usedWithProfile triples).
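
As a rough sketch of what the modified proposal asks of a processor in
this simple case (Python, with triples as string tuples; the rule set
URI and the profile name are placeholders, and I assume every
rif:usedWithProfile subject is a URI rather than a blank node):

```python
USED_WITH_PROFILE = "rif:usedWithProfile"


def as_combination(graph):
    """Return (G', imports) for a graph given as a set of (s, p, o) tuples.

    G' is the input graph minus every rif:usedWithProfile triple; nothing
    else is subtracted. The imports follow the shape of the suggested
    section 8 text: one Imports(Ri) per subject and one Imports(G' Pi)
    per object of the rif:usedWithProfile triples.
    """
    links = sorted(t for t in graph if t[1] == USED_WITH_PROFILE)
    g_prime = {t for t in graph if t[1] != USED_WITH_PROFILE}
    imports = [f"Imports({s})" for s, _, _ in links]
    imports += [f"Imports(G' {o})" for _, _, o in links]
    return g_prime, imports


# Placeholder graph: one data triple plus one reference to an external rule set.
graph = {
    ("ex:a", "ex:p", "ex:b"),
    ("<http://example.org/rules>", USED_WITH_PROFILE, "ex:SomeProfile"),
}

g_prime, imports = as_combination(graph)
```

The only triples removed are the rif:usedWithProfile ones, so no
knowledge of the rule encoding is needed to compute G'.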

> While I agree that it will not be possible in general to extract RIF from some RDF graph using the RIF vocabulary in an arbitrary way,
> I guess/assume/hope, we can come up with some well-definedness conditions that would ensure that an RDF graph allows such extraction, but which wouldn't be 
> limiting except the usage of the RDF vocabulary.
> 
> I hope we do agree that the RIF-in-RDF encoding needs to be 

This seems off topic from the specific issue of the semantics of
rif:usedWithProfile ...

> 1) round-trippable

Yes, for some value of "round trip". The current document is clear that
some features which don't affect the semantics of the encoded rule sets
are not preserved (e.g. inessential ordering).

> 2) Stable against graph merges with other RIF in RDF encodings (where I had assumed that such graph merge should mean that you merge the two rulesets combined 

That is explicitly ruled out by the current proposal. Sandro's
requirement that you can do arbitrary reduction to a subgraph and still
know whether you have the complete rule set means that the current
representation uses lists to give a locally closed world. So to combine
two rule sets you need to merge lists, which is very simple to do but is
more than graph union.
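
A small Python sketch of the difference, assuming rule lists are
ordinary rdf:first / rdf:rest chains over string tuples (the helper
names and the ex:rule identifiers are made up for illustration):

```python
NIL = "rdf:nil"


def read_list(graph, head):
    """Walk an rdf:first / rdf:rest chain and return its items in order."""
    first = {s: o for s, p, o in graph if p == "rdf:first"}
    rest = {s: o for s, p, o in graph if p == "rdf:rest"}
    items = []
    while head != NIL:
        items.append(first[head])
        head = rest[head]
    return items


def write_list(items, prefix):
    """Return (head, triples) encoding items as a closed RDF list."""
    nodes = [f"_:{prefix}{i}" for i in range(len(items))] + [NIL]
    triples = set()
    for i, item in enumerate(items):
        triples.add((nodes[i], "rdf:first", item))
        triples.add((nodes[i], "rdf:rest", nodes[i + 1]))
    return nodes[0], triples


def merge_rule_lists(graph_a, head_a, graph_b, head_b):
    """Combine two rule lists by building one new closed list."""
    combined = read_list(graph_a, head_a) + read_list(graph_b, head_b)
    return write_list(combined, "m")


head_a, graph_a = write_list(["ex:rule1", "ex:rule2"], "a")
head_b, graph_b = write_list(["ex:rule3"], "b")

# Plain graph union: both lists survive, but each is still closed on its own.
union = graph_a | graph_b
still_two_rules = read_list(union, head_a)

# Explicit list merge: one closed list holding all three rules.
head_m, graph_m = merge_rule_lists(graph_a, head_a, graph_b, head_b)
all_rules = read_list(graph_m, head_m)
```

After the union, the list starting at head_a still ends at rdf:nil
after two rules, which is exactly the locally closed world the lists
are there to provide.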

> I was also assuming that 
> 
> 3) stable against graph merges with graphs not using the rif vocabulary

In the face of arbitrary RIF extensions, what is "the rif vocabulary"?

Dave

> was a requirement,   
> 
> Axel
> 
> On 13 Sep 2010, at 20:16, Dave Reynolds wrote:
> 
> > [Resend from a different account. My earlier send didn't seem to get through. Dave]
> > 
> > Closing action-1049
> > 
> > ** Summary:
> > - Sections 7/8 need some fixing up
> > - one (minor) bug in the table
> > - a number of editorial ('@@') comments need deleting or acting on,
> >   along with some other minor cleanup suggestions
> > 
> > ** Sections 7/8:
> > 
> > The split of the description of rif:usedWithProfile across two sections
> > doesn't work very well. There is not quite enough in section 7 to
> > explain how to use the property.
> > 
> > More seriously the description in section 8 assumes the ability to
> > extract an encoded rule set from an arbitrary graph and subtract just
> > those triples involved in the encoding. However, I don't believe such a
> > triple-removal algorithm is possible (given arbitrary metadata embedding
> > in rules and the optional translation of such metadata to RDF) and
> > certainly isn't included in the document. It would require a lot more
> > than XTr (which itself is an item of future work and not defined in the
> > doc).
> > 
> > In addition it seems like the intent is to also allow a graph to contain
> > both a RIF rule set and some RDF data it is to be applied to but you
> > have to import a different ruleset in order to trigger this behaviour
> > which is odd.
> > 
> > There are also still some '@@' remarks left in those sections.
> > 
> > Suggested rewrite(s) attached at [2] below. Note, necessarily this has
> > slightly different semantics to what was in there already.
> > 
> > ** Bug:
> > 
> > o Section 5.3, table 1 the entry for the second row is broken. The
> > 'focus_node rdf:type rif:Const' in the left hand column presumably
> > shouldn't be there and the right hand column should presumably be
> > 'focus_node rif:constIRI <value>' not ' ... "value"'
> > 
> > ** Minor:
> > 
> > o Section 2, UC4, phrasing glitch
> >   s/if the systems, if the extensions/, provided that the extensions/
> > 
> > o Section 3, Req6. The phrase "viewed as triples, there should be no
> > indication of which features are in which dialects or extensions;" seems
> > unnecessary and is not met by the design. The namespace of the
> > property/class will give such an "indication" but is not harmful.
> > Suggest dropping the quoted phrase or replacing with something like "the
> > RDF representation of extensions should follow the same structure as the
> > standard dialects".
> > 
> > o Section 4. Incomplete reference [@@SOAP]
> > 
> > o Section 4. Editorial marker "@@@ Update example"
> > [The example certainly is confusing, especially for a reader who hasn't
> > yet read the remainder of the document. Illustrating with syntax
> > snippets would help. Presumably minimum would be to drop the editorial
> > marker and leave the example as is - not ideal but not a major problem.]
> > 
> > o Section 5. Suggest dropping the final paragraph referencing OWL 2
> > mapping. I don't see how reading about a different mapping, for a
> > different language, defined using a different notation, is supposed to
> > help :)
> > 
> > o Section 5. Editorial comment "@@ consider merging tables".
> > Suggest just deleting this comment. The tables are fine as they are.
> > 
> > o Section 6 has more "@@" comments and Editor's note than it has content
> > and needs cleaning before publication. Suggested text [1].
> > 
> > Other:
> > o The mention of "incomplete running of transformation rules" (Section 3
> > Req4 and Section 4) still grates :)  I'd find it less distracting if
> > those mentions were removed but that doesn't change the substance of the
> > document so don't formally object to them remaining.
> > 
> > o The discussion on handling of <meta> in sections 5.2 and 5.3 is OK but
> > could be clearer if illustrated with a concrete example. I realize that
> > may be more work than there is available time.
> > 
> > Dave
> > 
> > [1] Suggested text for section 6.
> > 
> > [[[
> > Because the above mapping function Tr is not injective (one-to-one), the
> > inverse mapping is not a function, but provides many outputs for each
> > input. Intuitively, Tr loses information, such as the order in which
> > property elements occurred in the RIF XML document, so properly
> > reconstructing a RIF XML document requires additional information.
> > 
> > It is possible to define a reverse mapping XTr which is constrained to
> > produce only schema-valid RIF XML documents.  Given a RIF-XML document D
> > and a round-trip transformed document D'
> > 
> >     D' = XTr( Tr(D), XML-schema, XML-root-element )
> > 
> > Then D' will not be identical to D, due to reordering and restructuring.
> > However, the semantics of D will be preserved in that the rule sets
> > corresponding to D and D' will have identical entailments under the
> > relevant RIF Dialect semantics.
> > 
> > Editor's Note: In a future version of this document the details of XTr
> > will be given, along with clarification of round-tripping guarantees.
> > ]]]
> > 
> > 
> > [2] Suggested text for sections 7/8
> > 
> > [[[
> > 7. Importing RIF into RDF
> > 
> > SWC [RIF RDF+OWL] defines the entailments of combinations (R, G) where R
> > (a RIF rule set) includes an import of G (an RDF graph).
> > 
> > We hereby define an RDF predicate <code>rif:usedWithProfile</code> which
> > enables an import to be specified from the graph G instead of from R.
> > 
> > In the simple usage the graph G is a plain RDF graph and
> > <code>rif:usedWithProfile</code> is used to combine that graph with one
> > or more externally defined RIF rule sets. In this usage each subject of
> > a <code>rif:usedWithProfile</code> assertion should be the URI for a RIF
> > rule set (which may be encoded in RIF-XML or RIF-in-RDF) and the object
> > should be an import profile as defined in SWC [RIF RDF+OWL].
> > 
> > It is also possible for the graph G itself to contain both an encoded
> > ruleset and additional RDF statements to which the ruleset is
> > intended to apply. This usage is supported by using a blank node as the
> > subject of a <code>rif:usedWithProfile</code> assertion.
> > 
> > 8. Semantics of RIF in RDF
> > 
> > A RIF-in-RDF-aware processor shall treat any RDF graph G as a RIF-RDF or
> > RIF-OWL combination [RIF RDF+OWL] as follows:
> > 
> > Let G' be the graph obtained from G by removing all triples with
> > predicate rif:usedWithProfile.
> > 
> > Let R be either the empty RIF rule set or, if G contains a
> > rif:usedWithProfile triple with a blank subject node, then let R be the
> > rule set obtained by inverting the Tr mapping to extract any rules
> > encoded within G.
> > 
> > Then G is to be treated by a RIF-in-RDF-aware processor as the ruleset
> > R' obtained by amending R with the following import directives:
> > 
> >     Imports(R1)
> >     ...
> >     Imports(Rn)
> >     Imports(G' P1)
> >     ...
> >     Imports(G' Pn)
> > 
> > Where Ri and Pi are the subjects/objects respectively of triples of
> > form:
> >     Ri rif:usedWithProfile Pi     where Ri is a URI reference
> > 
> > Remark 1: Note that the fact that G' is treated as being imported with
> > all profiles P1 ... Pn enforces G' to be treated according to the
> > highest profiles among P1 ... Pn, cf. Section 5.2 of [RIF RDF+OWL].
> > 
> > Remark 2: Note that if G includes an encoded RIF rule set then the
> > triples that make up that encoding are visible in the (R', G')
> > combination.
> > 
> > Remark 3: If the graph G can be obtained from URI Ug then including the
> > triple  
> >      [] rif:usedWithProfile P .
> > is equivalent to including:
> >     <Ug> rif:usedWithProfile P .
> > I.e. self including the graph as a rule set has the same effect as
> > explicitly including the same graph as an external ruleset.
> > 
> > Remark 4: Note the discussion in section 6: the inversion of Tr is
> > not a deterministic function.
> > ]]]
> > 
> > Editor's Note: The support for self-inclusion by using a blank node may
> > be controversial. If so then drop that in order to get to publication.
> > In that case the suggested text is:
> > 
> > [[[
> > 7. Importing RIF into RDF
> > 
> > SWC [RIF RDF+OWL] defines the entailments of combinations (R, G) where R
> > (a RIF rule set) includes an import of G (an RDF graph).
> > 
> > We hereby define an RDF predicate <code>rif:usedWithProfile</code> which
> > enables an import to be specified from the graph G instead of from R.
> > 
> > Each subject of a <code>rif:usedWithProfile</code> assertion should be
> > the URI for an externally defined RIF rule set (which may be encoded in
> > RIF-XML or RIF-in-RDF) and the object should be an import profile as
> > defined in SWC [RIF RDF+OWL].
> > 
> > 8. Semantics of RIF in RDF
> > 
> > A RIF-in-RDF-aware processor shall treat any RDF graph G as a RIF-RDF or
> > RIF-OWL combination [RIF RDF+OWL] as follows:
> > 
> > Let G' be the graph obtained from G by removing all triples with
> > predicate rif:usedWithProfile.
> > 
> > Then G is to be treated by a RIF-in-RDF-aware processor as the ruleset
> > R:
> >    Document (
> >     Imports(R1)
> >     ...
> >     Imports(Rn)
> >     Imports(G' P1)
> >     ...
> >     Imports(G' Pn)
> >    )
> > 
> > Where Ri and Pi are the subjects/objects respectively of triples of
> > form:
> >     Ri rif:usedWithProfile Pi
> > 
> > Remark 1: Note that the fact that G' is treated as being imported with
> > all profiles P1 ... Pn enforces G' to be treated according to the
> > highest profiles among P1 ... Pn, cf. Section 5.2 of [RIF RDF+OWL].
> > 
> > Remark 2: Note that if G itself includes a RIF ruleset encoded as
> > RIF-in-RDF then no special additional processing is performed; those
> > encoded rules are not included in R. If the Graph G is available from
> > some URI Ug then it is possible for G to reference itself as a rule set
> > by including the triple:
> >     <Ug> rif:usedWithProfile P .
> > 
> > Remark 3: Note the discussion in section 6: the inversion of Tr is
> > not a deterministic function.
> > ]]]
> > 
> > 
> > 
> > 
> > 
> 

Received on Tuesday, 14 September 2010 16:58:44 UTC