Re: [ISSUE-37] New proposal on RIF interoperation with XML data and XML Schemas from Christian de Sainte Marie on 2009-03-17 (public-rif-wg@w3.org from March 2009)

From: Christian de Sainte Marie <csma@ilog.fr>
Date: Tue, 17 Mar 2009 13:47:55 +0100
To: kifer@cs.sunysb.edu
CC: RIF WG <public-rif-wg@w3.org>
Message-ID: <49BF9BFB.8040700@ilog.fr>
Michael Kifer wrote:
>>
>>I concluded that, if your schema is done like that, then those
>>elements/element names are unlikely to represent classes; and your rules are
>>unlikely to test objects for membership in such non-classes;
> 
> 
> I am not sure this is a good assumption. It feels wrong to me. The different
> instances of Name in my example are just different classes. I may choose
> to use the same name "because I can" (pardon my plagiarizing of Bill
> Clinton :-).

</chair>
I may not have been clear: of course, you can choose to use the same names in your schema. And even if all your elements have the same name, that is not a problem, and you can still use frames to "navigate" the data document along the child axis.

My point was that I had pondered whether it was worth adding new syntax, e.g. based on XPath absolute location paths, to permit a more discriminating selection of sets of elements using the class membership atom.

And that I decided to use only element or type names as class identifiers, based not on the assumption that nobody would use elements with non-unique names as classes, but on the assumption that it would fit 80% of the use cases, and that the remaining 20% would have to use frames to refine selections along the "child" axis.

My latest proposal seemed so much simpler that it seemed worth the trade-off...

> Well, then maybe a better way would be to just provide a builtin method, which
> would take an XPath expression and return? (We do allow function symbols in
> Core, if they are used as built-ins.)


Yes, that was my initial proposal [1]. But importing the schema that represents the data model that is assumed in the representation of the condition elements seems a simpler and more natural approach. That does not stop us from using XPath expressions, as fragments in IRIs: I have an earlier version where I tried just that, but it had other problems, see my reply to Dave [2] (maybe I should publish that version too; or would it just add to the entropy?).

What would be the benefit of using builtins, in your opinion?


> It might be, if it can be worked out elegantly.


You do not mean to suggest that what I propose is not elegant, do you?

:-)

> Gary's proposal, on the other
> hand, is a sure thing, but it is a brute-force approach. We have something very
> similar to that in FLORA-2. I find it a bit too low-level, but it is usable.


Yes, the benefit of Gary's proposal is that it is very straightforward, and probably easy to implement. We may want to consider changing the designation of schema elements, as suggested by Dave, but that would be a "detail".

On the other hand, I explored how we could deal uniformly with the schema and schema-less cases, by basing my proposal on the structure of the instance XML document rather than on the XML schema. The bottom line is that the structure may be given only by the instance document itself; or it can be interchanged as a DTD; or the structure and much more can be interchanged as a schema. But the representation and semantics of the condition elements are independent of which is the case.

Another issue is the relation with SWC and RDF XML: a RIF document that imports an RDF graph or OWL ontology should be the same, whether the graph is imported as an RDF XML document or otherwise. Or, at least, there should be a canonical form of the RDF XML document for which they are the same.

I do not say that it works with my proposal: I did not check it thoroughly (I would not be able to, anyway).

But a few simple tests made me believe that it might be made to work. E.g., assuming that the identifier of an object that is serialized, in RDF XML, as an instance element that contains an rdf:about attribute, is the value of that attribute, the couple of normalizing rules below already seem to work pretty well:
- normalizing the RDF XML document to group all the rdf:Description elements about the same resource into a single element;
- moving the value of rdf:resource attribute to the content of the element;
- dealing with rdf:type, and the fact that we may want the type both as the element name (instead of rdf:Description) and as the content of an rdf:type sub-element, is more tricky; but the following does it: normalizing RDF XML to use only rdf:Description elements with an rdf:type sub-element, rather than the shortcut of using the QName of the type as the head element for the description, and extending the semantics of o # QName to select the rdf:Description elements that contain an rdf:type sub-element, provided that the QName expands to the content of the rdf:type element;
- etc.
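
The three normalizing rules above can be sketched in a few lines of Python. This is only an illustration, not a worked-out algorithm: it assumes a flat RDF XML document (node elements directly under rdf:RDF, no nested descriptions, no rdf:ID or rdf:nodeID handling), and the helper names normalize and expand are made up for the example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DESCRIPTION = f"{{{RDF}}}Description"
ABOUT = f"{{{RDF}}}about"
RESOURCE = f"{{{RDF}}}resource"
TYPE = f"{{{RDF}}}type"


def expand(tag):
    """Turn ElementTree's '{namespace}local' tag into a plain IRI string."""
    return tag[1:].replace("}", "", 1) if tag.startswith("{") else tag


def normalize(rdf_xml):
    """Return a normalized copy of a flat RDF XML document (hypothetical helper)."""
    root = ET.fromstring(rdf_xml)
    out = ET.Element(root.tag)
    by_subject = {}  # rdf:about value -> merged rdf:Description element
    for node in root:
        about = node.get(ABOUT)
        if about is None:
            out.append(node)  # nodes without rdf:about are left alone here
            continue
        desc = by_subject.get(about)
        if desc is None:
            # Rule 1: a single rdf:Description element per resource.
            desc = ET.SubElement(out, DESCRIPTION, {ABOUT: about})
            by_subject[about] = desc
        if node.tag != DESCRIPTION:
            # Rule 3: a typed head element becomes an rdf:type sub-element.
            ET.SubElement(desc, TYPE).text = expand(node.tag)
        for prop in node:
            # Rule 2: move the rdf:resource value to the element content.
            resource = prop.attrib.pop(RESOURCE, None)
            if resource is not None:
                prop.text = resource
            desc.append(prop)
    return out
```

Calling ET.tostring(normalize(doc), encoding="unicode") on an RDF XML string doc gives the normalized serialization; namespace prefixes in the output may differ from the input, since ElementTree generates its own unless they are registered.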

For instance, the test case RDF_Combination_Member_1 [1] works, assuming that the normalized RDF XML representation of the imported RDF graph is as follows:

<rdf:RDF xmlns:ex="http://example.org/example#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Description rdf:about="http://example.org/example#i">
      <rdf:type>
          http://example.org/example#A
      </rdf:type>
   </rdf:Description>
</rdf:RDF>
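
To make the extended reading of o # QName concrete, here is a minimal Python sketch that selects, from the normalized document above, the rdf:Description elements whose rdf:type content matches the expanded QName. The function name members is hypothetical, and comparing the whitespace-stripped element content with the IRI is an assumption of the sketch, not something the proposal specifies.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The normalized document quoted above, unchanged.
NORMALIZED = """<rdf:RDF xmlns:ex="http://example.org/example#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Description rdf:about="http://example.org/example#i">
      <rdf:type>
          http://example.org/example#A
      </rdf:type>
   </rdf:Description>
</rdf:RDF>"""


def members(rdf_xml, class_iri):
    """List the rdf:about identifiers of the objects o such that o # class_iri."""
    root = ET.fromstring(rdf_xml)
    found = []
    for desc in root.findall(f"{{{RDF}}}Description"):
        for type_el in desc.findall(f"{{{RDF}}}type"):
            # Assumption: surrounding whitespace in the content is not significant.
            if (type_el.text or "").strip() == class_iri:
                found.append(desc.get(f"{{{RDF}}}about"))
                break
    return found
```

With this reading, members(NORMALIZED, "http://example.org/example#A") selects the single description about http://example.org/example#i, while any other class IRI selects nothing.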
 
I tried a few other test cases, such as http://www.w3.org/2005/rules/wiki/UCR_4.7a, and they work as well.

Cheers,

Christian
<chair> 

[1] http://www.w3.org/2005/rules/wiki/index.php?title=RIF%2BXML_data-schema&oldid=7389
[2] http://lists.w3.org/Archives/Public/public-rif-wg/2009Mar/0082.html
Received on Tuesday, 17 March 2009 12:48:58 UTC