Re: [RIF-XMLdata] Updated draft from Gary Hallmark on 2009-09-14 (public-rif-wg@w3.org from September 2009)

From: Gary Hallmark <gary.hallmark@oracle.com>
Date: Mon, 14 Sep 2009 12:14:18 -0700
To: Dave Reynolds <der@hplb.hpl.hp.com>
CC: Christian De Sainte Marie <csma@fr.ibm.com>, RIF <public-rif-wg@w3.org>
Message-ID: <4AAE960A.1070100@oracle.com>
Here's my review, which complements Dave's, I think.

Overall, I am bothered by the difference in style as compared to SWC. 
Both attempt to define the semantics of combinations. SWC does so by 
embedding RDF/OWL in RIF. XMLdata, on the other hand, attempts to 
introduce notions into RIF such as static and dynamic expression 
evaluation contexts, nodes and node sequences, available document, 
available collection, and so forth. This causes several problems:
1. the procedural xpath/xquery semantics doesn't work when combined with 
the model theory of BLD/Core
2. it forces the reader to try to comprehend both RIF and xpath/xquery 
concepts
I think it would be clearer to simply define how an xml document, 
optionally with a schema, is to be mapped to frame, membership, 
subclass, etc. formulas. Then the semantics of the combination is given 
by RIF, and one does not have to understand the semantics of 
xpath/xquery, only the mapping to RIF and the RIF semantics.
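
To make the suggestion concrete, here is a rough Python sketch of such a 
mapping for the schema-less case. The function name xml_to_frames, the 
_n1 style node names, and the exact formula syntax it prints are my own 
assumptions for illustration, not something the draft specifies. Note that 
without a schema every leaf value comes out as a string.

```python
import xml.etree.ElementTree as ET


def xml_to_frames(xml_text):
    """Map an XML document to RIF-like membership and frame formulas.

    Hypothetical mapping: each element node gets a fresh local name,
    a membership formula named after its tag, and one frame per
    attribute or child element.
    """
    root = ET.fromstring(xml_text)
    formulas = []
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += 1
        obj = f"_n{counter}"                      # fresh name for this node
        formulas.append(f"{obj}#{elem.tag}")      # membership formula
        for name, value in elem.attrib.items():
            formulas.append(f'{obj}[{name}->"{value}"]')
        for child in elem:
            if len(child) == 0 and not child.attrib:
                # Leaf element: the slot value is its text (a string).
                text = (child.text or "").strip()
                formulas.append(f'{obj}[{child.tag}->"{text}"]')
            else:
                # Complex element: the slot value is another node.
                child_obj = visit(child)
                formulas.append(f"{obj}[{child.tag}->{child_obj}]")
        return obj

    visit(root)
    return formulas


if __name__ == "__main__":
    doc = "<Customer><Name>John</Name><Account>111</Account></Customer>"
    for formula in xml_to_frames(doc):
        print(formula)
```

With a mapping like this, the meaning of the combination is just the RIF 
semantics of the generated formulas.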

I support the notion of a dynamic context if it gives us the ability to 
support a get-current-date() builtin. I asked about this at the last 
f-2-f but it didn't fly. The reason was that it doesn't make sense in 
our model theory. The theory has no notion of a time-varying context. So 
a ruleset with get-current-date() has no semantics. That's not a huge 
problem -- one simply treats such things as current-date as a Const that 
must be supplied "at the last minute" to give the ruleset meaning. I 
think we can also treat dynamically imported datasources the same way. 
I.e. there is no such thing; one can only import an xml document, with or 
without a schema. If you want a dynamic datasource then you have to 
serialize it in xml "at the last minute" and reference it from an Import.
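
A minimal Python sketch of the "at the last minute" idea follows. The 
ruleset text, the $current_date placeholder, and the helper name 
close_ruleset are assumptions made up for illustration; the point is only 
that the ruleset has no meaning until the constant is supplied, and is an 
ordinary ruleset afterwards.

```python
from datetime import date
from string import Template

# Hypothetical ruleset with a hole where the current date must go.
RULESET = Template(
    'Forall ?c (?c[status->"expired"] :- '
    "And(?c[expires->?d] External(pred:date-less-than(?d $current_date))))"
)


def close_ruleset(template, today):
    """Supply the current date as an ordinary typed constant."""
    const = f'"{today.isoformat()}"^^xs:date'
    return template.substitute(current_date=const)


if __name__ == "__main__":
    # The caller decides what "now" is, just before the rules are used.
    print(close_ruleset(RULESET, date.today()))
```

Calling RULESET.substitute() with no binding raises KeyError, which mirrors 
the observation that the open ruleset has no semantics on its own.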

I do not believe "dynamically imported datasources" are well-enough 
specified to be interoperable. E.g. in example 3.2 there is not enough 
information to automatically and unambiguously convert the customer 
table to an xml document. Where did the xml:lang tags come from? Not the 
database. What if the database has FIRSTNAME and LASTNAME columns? How 
are they to be marshalled into the Name element?
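
To show the ambiguity, here is a Python sketch with two equally plausible 
ways to marshal the same row. The row contents, the First and Last element 
names, and both function names are hypothetical; nothing in the draft says 
which of these (if either) is intended.

```python
import xml.etree.ElementTree as ET

# Hypothetical database row with split name columns.
ROW = {"FIRSTNAME": "John", "LASTNAME": "Doe", "ACCOUNT": 111}


def marshal_concatenated(row):
    """Option A: join both columns into the text of a single Name element."""
    customer = ET.Element("Customer")
    ET.SubElement(customer, "Name").text = f'{row["FIRSTNAME"]} {row["LASTNAME"]}'
    ET.SubElement(customer, "Account").text = str(row["ACCOUNT"])
    return ET.tostring(customer, encoding="unicode")


def marshal_nested(row):
    """Option B: give Name one child element per column."""
    customer = ET.Element("Customer")
    name = ET.SubElement(customer, "Name")
    ET.SubElement(name, "First").text = row["FIRSTNAME"]
    ET.SubElement(name, "Last").text = row["LASTNAME"]
    ET.SubElement(customer, "Account").text = str(row["ACCOUNT"])
    return ET.tostring(customer, encoding="unicode")


if __name__ == "__main__":
    print(marshal_concatenated(ROW))
    print(marshal_nested(ROW))
```

Two implementations that pick different options would see different frames 
for the same table, so rules written against one would not match the other.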

The notion of dynamic context should be removed, or elevated to a level 
where we can add get-current-date() to DTB.

The notion of sequence is fundamental to XDM. These are ordered and 
flat. Sometimes we may want to map them to frame slots (but then order 
is lost), other times we may want to map them to Lists (but Lists aren't 
flat). I would like to see some way to control the mapping, for cases 
when order matters. Also, for some use cases, it may be desirable to 
embed simple xml documents as nested positional terms, e.g. 
Customer(name("john") account(111))
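
The trade-off can be sketched in Python. The three helper names below are 
assumptions for illustration: flatten models the flat, ordered nature of 
sequences, as_slot_values models mapping to frame slots (a set of values, 
so order is lost), and as_list models mapping to Lists (order kept, and 
flattened first because Lists could otherwise nest).

```python
def flatten(seq):
    """Sequences are flat: nested sequences are spliced into the parent."""
    out = []
    for item in seq:
        if isinstance(item, (list, tuple)):
            out.extend(flatten(item))
        else:
            out.append(item)
    return out


def as_slot_values(seq):
    """Frame-slot view: an unordered set of values for one slot."""
    return frozenset(flatten(seq))


def as_list(seq):
    """List view: order is preserved."""
    return tuple(flatten(seq))


if __name__ == "__main__":
    first = ["b", "a"]
    second = ["a", "b"]
    print(as_slot_values(first) == as_slot_values(second))  # slots cannot tell them apart
    print(as_list(first) == as_list(second))                # lists can
```

Some switch in the mapping (per element, or per import) would let the rule 
author choose the list view where order matters.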

In example 3.2 it seems that rules can act upon the xml document to set 
John's ID to 111. That seems too strong, especially for Core/BLD. I 
think the entailment holds, but I don't think entailment necessarily 
generates new XML nodes. Probably we need some PRD-specific action 
(export-xml or something?) for this.
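
A small Python sketch of the distinction, under my own assumptions (the 
triple representation of frames, the _john name, and the helper names are 
all hypothetical): the rule adds Id facts to the set of entailed frames, 
while the source document is left untouched.

```python
import xml.etree.ElementTree as ET

DOC = "<Customer><Name>John</Name><Account>111</Account></Customer>"


def facts_from(xml_text, subject):
    """Read the children of the root element as (subject, slot, value) frames."""
    root = ET.fromstring(xml_text)
    return {(subject, child.tag, child.text) for child in root}


def entail(facts):
    """Sketch of the rule ?c[Id->?a] :- ?c[Account->?a]."""
    derived = {(s, "Id", v) for (s, slot, v) in facts if slot == "Account"}
    return facts | derived


if __name__ == "__main__":
    closure = entail(facts_from(DOC, "_john"))
    print(sorted(closure))
    # The document itself still has no Id element.
    print(ET.fromstring(DOC).find("Id"))
```

The same sketch also shows that a customer with Account 333 and an existing 
Id 222 simply ends up with two Id values; nothing equates 222 and 333.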

Why overload Import profile by allowing it to contain a schema location? 
Why not have 2 profiles: xml-data and xml-schema-data, where the second 
takes an additional 2 locations, one for the data document, and one for 
the schema document?

Can a schema-less xml document convey type information, e.g. <Account 
xsi:type="xs:integer">111</Account>?
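
For what it is worth, a parser can see the xsi:type attribute without any 
schema, as this Python sketch shows. The helper name typed_value is an 
assumption, and comparing against the literal string "xs:integer" is a 
simplification: a real implementation would have to resolve the xs prefix 
against the in-scope namespace declarations.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attributes under their expanded names.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

DOC = (
    '<Account xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'xsi:type="xs:integer">111</Account>'
)


def typed_value(elem):
    """Return an int when the element declares xs:integer, else the raw text."""
    declared = elem.get(XSI_TYPE)
    if declared == "xs:integer":   # simplification: prefix is not resolved
        return int(elem.text)
    return elem.text


if __name__ == "__main__":
    print(repr(typed_value(ET.fromstring(DOC))))
    print(repr(typed_value(ET.fromstring("<Account>111</Account>"))))
```

Whether the mapping should honor such annotations in the schema-less case 
is exactly the question for the draft.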

In 4.1, XDM-Name is defined as a mapping from rif:iri Consts, but the 
next to last bullet shows a mapping from an xs:string Const.



Dave Reynolds wrote:
> Here's a review.  I'm just off a transatlantic flight and somewhat jet 
> lagged so apologies for any typos or incoherence ...
>
> Dave
>
> ** Substantial issues
>
> 1. The notion of updating the XML tree of the available document in 
> the dynamic environment seems flawed and should be removed. It is only 
> specified, and only specifiable, for PRD in any case. Surely rules 
> such as the rule given in example 3.2 simply entail additional frame 
> formulae (i.e. they then go in working memory in PRD). The underlying 
> document, and so its node tree in the dynamic context, should not change.
>
> 1.b Given this can the whole notion of static and dynamic contexts be 
> dropped?
>
> 2. The notion of URIs without namespaces in the XDM-name definitions 
> in section 4.1 doesn't work as far as I can tell. In RIF a rif:iri 
> always corresponds to an absolute URI. It might sometimes be expressed 
> relative to a base URI and so require resolution but that is 
> irrelevant here - it is still a full URI not a relative URI and so 
> will never have an empty namespace fragment.
>
> 3. The notion that other datatypes like strings can be coerced into 
> rif:uris is a new one on me and doesn't seem to be specified here. 
> It seems unnecessary and I suggest dropping it.
>
> ** more minor issues
>
> o You mention in an editor's note that importing a RDF/OWL graph will 
> be considered in a future draft. That sounds worrying. What do you 
> mean? This should be removed entirely for now. I don't want to see a 
> working draft suggest there is a different way to combine RIF and RDF 
> documents and so confuse things even further.
>
> o In Section 3, "Definition (Available collections)" you reference 
> "that string" without explaining what it is. Is that the URI of an 
> imported document? If so then say so.
>
> o The addition of an RDB to XML schema mapping complicates the 
> document. Is it necessary? Could this document not simply focus on the 
> XML mapping and leave RDB->XML mapping to others?
>
> o In Example 3.2 the XML serialization of the RDB contains "en" and 
> "fr" xml:lang codes but there is no indication of where these codes 
> come from. This serialization is then used later on. I prefer to drop 
> the RDB altogether in which case this serialization becomes 
> standalone. If you keep the RDB notion then explain where the lang 
> tags come from.
>
> o Also in Example 3.2 the first sentence following the rule example is 
> wrong. The rule does not say what you say it says. It says that 
> Customer instances will have *an* Id matching its Account number. In 
> the case where a Customer had an Account number 333 and an Id 222 the 
> rule would add Id 333 leaving it with two Ids, it would not equate 222 
> and 333.
>
> o In section 4.1 you could have prefixes defined in the cases where 
> the namespace URI matches a prefix binding declared in the RIF 
> document. I don't know that this would have any value.
>
> o Example 4.6, bullet 2. I don't think this holds in the schemaless 
> case because the 111 is not an xsd:integer in that case. I don't agree 
> with the associated editor's note. How can you possibly change the 
> definition of frames to allow strings to match integers?
>
> o Example 3.6, bullet 4. The type of "en" is not xml:lang, that is not 
> a type. Perhaps you mean xs:language but in any case you might just as 
> well use xs:string.
>
> o Example 4.7. The variable names don't match up. Perhaps you mean:
> ... ?v["letterBody"->?y] will be true if and only if ?y is ...
>
> ** editorial
>
> * Section 1
>
> s/Followingly, this/This/
> s/possible, the corresponding XML schemas/where available the 
> corresponding XML schemas,/
>
> * Section 3
>
> Last bullet of example 3.1 is odd, what is "A model that is intended 
> for a RIF-BLD document"? Suggest just deleting that bullet.
>
> First sentence of first para after definition of available documents 
> in a dynamic context ("If a dynamically associated ...") repeats 
> information already given a couple times - drop it.
>
> I can't parse the second sentence "If the RIF document does not 
> contains an Import directive". Rephrase.
>
> s/with a the location/with the location/
> s/There are no constraint/There are no constraints/
> s/formulascontained/formulas contained/
>
> In example 3 why is one version of the import in presentation syntax 
> and one in XML syntax? Change first to XML syntax for consistency.
>
> * Section 4.1.1
>
> s/attribut nodes/attribute nodes/
>
> * Section 4.1.2
>
> s/URI in slot's XDM-Name/URI in class's XDM-Name/
>
> * Section 4.1.3
>
> s/available documents (that is/available documents, that is/
>
> s/interpretation of class identifiers/interpretation of slot identifiers/
>
> s/on attrbute/on attribute/
>
> * Section 4.2.4
>
> s/false in th/false in the/
>
> Example 4.6, bullet 4 has a broken <tt/> tag.
>
> Example 4.7,  s/see [Section 6.6 Element Nodes/see [Section 6.2 
> Element Nodes/
>
> Christian De Sainte Marie wrote:
>>
>> Gary, Dave,
>>
>> I updated the draft again: I added examples covering the schema-less 
>> case, frames, and general XML (mixed content). I think that I am done 
>> re examples.
>>
>> I also corrected an error in the interpretation of frames.
>>
>> The main point that remains to be done is the handling of ID and 
>> IDREFs. I also need to add some explanation in the intro about why 
>> this does not require an implementation of the XDM.
>>
>> And take all your nasty comments into account, of course :-)
>>
>> Cheers,
>>
>> Christian
>>
>> ILOG, an IBM Company
>> 9 rue de Verdun
>> 94253 - Gentilly cedex - FRANCE
>> Tel. +33 1 49 08 35 00
>> Fax +33 1 49 08 35 10
>>
>
>
Received on Monday, 14 September 2009 19:16:28 UTC