Re: Review of XML-DATA from Dave Reynolds on 2009-09-28 (public-rif-wg@w3.org from September 2009)

From: Dave Reynolds <der@hplb.hpl.hp.com>
Date: Mon, 28 Sep 2009 21:19:03 +0100
To: Christian De Sainte Marie <csma@fr.ibm.com>
CC: RIF WG <public-rif-wg@w3.org>
Message-ID: <4AC11A37.9090403@hplb.hpl.hp.com>
Christian De Sainte Marie wrote:
> 
> Dave,
> 
> 
> Dave Reynolds wrote on 26/09/2009 14:11:29:
>  >
>  > Overall this version is much better than the previous and, with two
>  > exceptions and some editorial tweaks, is OK to publish as a working
>  > draft.
> 
> Thanks for the review. Comments and discussion/proposed resolution on
> your issues are inlined below.
> 
>  > The first exception is Section 4.1.1 on DM-Name. I don't think we should
>  > define a DM-Name for all rif constants via casting to xs:string. Only
>  > rif:iri constants should have a DM-Name, using the same algorithm you
>  > give here.  I see no advantage in being able to treat, e.g., a string as
>  > corresponding to an XML information item.
> 
> In principle, I agree. The problem is that rif:iri must be absolute 
> IRIs, as you pointed out in your initial review.
> 
> Thus, rif:iri constants cannot be used for the case where an
> element/attribute does not have a namespace.

True.

> You suggested to use the base URI, in that case, but I have an issue 
> with that (besides the problem of mixing namespace URIs and base URIs in 
> the treatment): the base URI is specific to a document. What if my RIF 
> document imports several data sources, and I want to refer to an 
> information item across all these documents, and that information item's 
> name is defined without a namespace? 

I'd argue that without a namespace there is no reason to believe that 
those elements in those different documents correspond to each other and 
so you should not be able to refer to them uniformly.

For example I import two documents, one from http://www.foo.com/doc1.xml
   <root><child>42</child></root>

and one from http://www.bar.com/doc2.xml
   <root><child>0</child></root>

I'd argue that those two "root" elements are different, it is not
that there is one "root" with two values for <child> but two different 
roots.
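
To make that concrete, here is a minimal Python sketch (illustration only, not part of the draft). It parses the two example documents and keys each root by a (base URI, local name) pair. The pairing scheme and the `qualified_root` helper are my own assumptions, not anything XML-DATA defines.

```python
import xml.etree.ElementTree as ET

# The two example documents, keyed by the URL each was imported from.
DOCS = {
    "http://www.foo.com/doc1.xml": "<root><child>42</child></root>",
    "http://www.bar.com/doc2.xml": "<root><child>0</child></root>",
}

def qualified_root(base_uri, xml_text):
    """Hypothetical helper: pair the root's name with its document's base URI."""
    root = ET.fromstring(xml_text)
    return (base_uri, root.tag), [c.text for c in root.findall("child")]

roots = dict(qualified_root(uri, text) for uri, text in DOCS.items())

# Same local name in both documents ...
local_names = {name for (_, name) in roots}
# ... but two distinct roots, each with a single <child> value.
for key, children in roots.items():
    print(key, children)
```

Keyed this way, "root" never turns into one object with two <child> values; each document keeps its own root.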

I don't see how allowing non-namespace-qualified names in RIF, without
having a means to relate them to the imported document you are
referring to, makes sense.

Though I agree using the base URI instead of having a proper module system
is a hack.

> So, I agree that considering only rif:iri constants would be better, but 
> that would require a change in the definition of rif:iri, which would 
> raise a whole new batch of difficulties, making the solution potentially 
> worse than the problem it solves...

If we really do want to support non-namespace-qualified names I suppose 
you could use local names, though that goes against the spirit of local 
names rather.

> The definition of DM-Names, in the current draft, essentially allows 
> rif:iri constants and xs:NCName constants to have a DM-Name: would it be 
> better if the definition was restricted to say exactly that?

I would find that a little preferable, thus forcing people to use 
rif:iri for things with namespaces.  Should at least add an explanation 
of the issue.

>  > The other non-editorial issue is section 4.2. The notion of the "domain
>  > of all variables" is not something that is meaningful in BLD, only (I
>  > assume) in PRD.
> 
> That section needs more work, indeed. But I thought that it was exactly 
> for the reverse reasons : it seems to me that the requirement about the 
> domain of variables makes better sense in a model-theoretic 
> specification of the semantics than in PRD's operational one.
> 
> Maybe it is only the wording? The point is that, in any interpretation, 
> Dind (that is, the domain that is used to interpret the variables) must 
> contain "the data model instances etc".

Sure but that is not specific to variables, for example IFrame will need 
to map the rif:iri constants representing XML elements to the associated 
child elements as sets of bags.  It's the phrasing "domain of variables" 
which seems PRD-like.

>  > I think you either need to make the embedding (when it
>  > is done) normative or provide a model-theoretic semantics for the
>  > RIF/XML combination in the same way that Jos did for SWC.
> 
> I do not want to make the embedding normative, because I believe that it 
> would be a burden on adoption if we required that conformant 
> implementation be actually able to produce the RIF facts from a data 
> source.

In principle you can pull any of the data out as RIF facts via rule 
conditions so you have to do that anyway - whether you materialize all 
those facts before being asked for them or lazily evaluate them is a 
small implementation detail :-)
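
A rough Python sketch of what I mean (illustration only; the (parent path, child name, text) triple shape and the `iter_facts` name are assumptions, not the embedding in the draft). The same generator serves both strategies: wrap it in list() to materialize everything up front, or pull from it on demand.

```python
import xml.etree.ElementTree as ET

def iter_facts(element, path="/"):
    """Lazily yield (parent path, child name, text) triples for an XML tree.

    Hypothetical fact shape, chosen only for illustration.
    """
    here = f"{path}{element.tag}"
    for child in element:
        yield (here, child.tag, (child.text or "").strip())
        yield from iter_facts(child, here + "/")

doc = ET.fromstring("<root><child>42</child><child>7</child></root>")

# Eager: materialize every fact before any rule condition asks for one.
all_facts = list(iter_facts(doc))

# Lazy: pull only the facts a rule condition actually needs.
first_fact = next(iter_facts(doc))
```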

But OK, I don't really think making the embedding normative is the way 
to go.

> Re providing a model-theoretic semantics for the RIF/XML combination,
> what do you mean exactly (I mean, in addition to the definitions provided
> in sections 4.3-4.6)? Something like: a RIF+XML interpretation satisfies
> the RIF+XML combination if it is a RIF model and it satisfies the
> additional constraints set in sections 4.3 to 4.6?

I mean something like section 3.2 in SWC. You would have an explicit 
section on the semantic structures which would, for example, say that 
ITruth(IFrame(o)(s,v)) would have to be t whenever I(o) is an 
Information Item in an imported document, and I(v) is in o/s-sequence or 
Ilist(v) o/s-sequence. [That may not be exactly right, but hopefully 
that makes sense as an approach.]

>  > However, I'd
>  > be happy for the document to be published in this state with just an
>  > Editor's Note explaining that that section will be reworked.
> 
> I agree with that solution: the semantics of combination also needs to be
> defined more precisely wrt the operational semantics of PRD, as it
> essentially requires embedding, in the current form.

OK

>  > For the record, I don't agree with the Editor's note after example 4.6,
>  > equating a string to an integer is something I think we should avoid.
>  > However, that need not hold up publication as a working draft since it
>  > is already clearly marked as to be discussed.
> 
> The point is that, when the type of the (atomic) content of an 
> information item is unknown, its literal representation is returned as 
> an xs:string. So, the proposal is not to equate a string with an 
> integer, but to accept the equality of an integer and one of its literal
> representations, provided the latter is untyped.

But in RIF there is no notion of a "literal representation" with no 
type. You either say that in the untyped case you get a string, in which case
leave it as a string and don't equate it to an integer, or you have to
change RIF's type system to allow for typed literals with an unknown 
type.  My can-of-worms-avoidance-alarm eliminates the latter.
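
To illustrate the first option (a sketch only; the `Literal` class and the datatype strings are stand-ins I made up, not RIF machinery): if every literal carries a datatype, content that comes back as xs:string simply stays a string, and comparing it with an integer is false unless the rule author casts explicitly.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    """Hypothetical typed literal: a lexical form plus a datatype name."""
    lexical: str
    datatype: str

def cast_to_integer(lit):
    """Explicit cast, standing in for an xs:integer cast in a rule condition."""
    return Literal(str(int(lit.lexical)), "xs:integer")

untyped_content = Literal("42", "xs:string")  # what an untyped item yields
forty_two = Literal("42", "xs:integer")

# No implicit equality across datatypes ...
implicit_equal = untyped_content == forty_two
# ... the rule author has to ask for the conversion.
explicit_equal = cast_to_integer(untyped_content) == forty_two
```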

> So, the solution might be to leave the [typed representation] property 
> empty when there is no typing information, and introduce a [literal 
> representation] property for that purpose?

I don't see how that will help. You still have to say how that literal 
representation is accessed from RIF conditions.

Dave
Received on Monday, 28 September 2009 20:19:53 UTC