Re: POWDER formal semantics 4.3 comments from Jonathan Rees on 2008-12-01 (public-powderwg@w3.org from December 2008)

From: Jonathan Rees <jar@creativecommons.org>
Date: Sun, 30 Nov 2008 20:44:33 -0500
To: Stasinos Konstantopoulos <konstant@iit.demokritos.gr>
Cc: public-powderwg@w3.org, Alan Ruttenberg <alanruttenberg@gmail.com>, Michael Schneider <schneid@fzi.de>
Message-Id: <A8EB46CA-9A5A-4C54-B39C-88C72CF4AF3B@creativecommons.org>
On Nov 26, 2008, at 1:33 PM, Stasinos Konstantopoulos wrote:

> Dear Jonathan,
>
> thank you for reviewing the POWDER formal doc and for your helpful
> comments. Responses inlined, please read on.
>
> On Wed Nov 19 13:46:19 2008 Jonathan Rees said:
>
>> 1. I couldn't find a specification of the particular regular
>> expression syntax used (e.g. perl5). (Obviously this would be done by
>> reference, not inclusion.) A citation in section 4.3 would be
>> especially helpful.
>
> This shall be taken care of, thank you for your comment.
>
>> 2. Regarding the use of XMLLiteral in section 4.3, an OWL WG chair
>> tells me: "There's an issue in that it is not certain that XMLLiteral
>> will be a supported datatype in OWL 2 DL. It will be in OWL 2 with RDF
>> Semantics.
>> http://lists.w3.org/Archives/Public/public-owl-wg/2008Nov/0110.html
>> If you think that will be a mistake, please send mail to
>> public-owl-comments. Soon."
>
> This does not influence the formal doc, which only uses XMLLiteral
> at the RDF level before OWL vocabulary is interpreted. It does,
> however, influence POWDER as a transport layer for XML meta-data
> (DR doc sect. 2.8.2) because if OWL abolishes XMLLiteral,
> <owl:hasValue> will not permit XML as a value (as in, for example,
> Ex 2-15, lines 36--41).
>
> We shall contact OWL-WG, thank you for bringing this to our
> attention.
>
>> 3. Regarding "wdrs:matchesregex rdfs:domain rdfs:Resource" -- this
>> won't work in OWL DL. If you care about DL you should replace
>> rdfs:Resource with owl:Thing. I don't think anything will suffer much
>> for doing so.
>
> This does, in fact, work because
>  wdrs:matchesregex rdf:type owl:DatatypeProperty
>  wdrs:notmatchesregex rdf:type owl:DatatypeProperty
> implicitly restrict the domain to owl:Thing.
>
>> 4. IEXT is not defined; I think you should cite the document that
>> defines it. It appears you mean the 2004 RDF Semantics recommendation
>> http://www.w3.org/TR/2004/REC-rdf-mt-20040210/ . I see that you cite
>> it in 4.6 but it is needed in 4.3.
>
> This shall be taken care of, thank you for your comment.
>
>> 5. If you mean for POWDER to work with OWL-DL, have you had anyone
>> review the POWDER semantics with an eye to interaction with OWL-DL
>> model theoretic semantics? Would you consider saying something about
>> this in your document, since OWL-DL semantics differs from RDF
>> semantics?
>
> Several people have reviewed the document, so far not raising any
> objections on this issue. Is there some particular problem you have in
> mind?

I am most concerned about (a) whether quantifying over the domain of  
IR is good mathematics (b) whether the formal semantics will  
faithfully respect the informal semantics that you wish to capture. I  
don't think I am up to the task of checking these, so this is just  
vague uneasiness (speaking as a paranoid engineer who knows just  
enough mathematics to cause trouble), not a problem report.

>> 6. "equivalence relation" has a technical meaning in mathematics and
>> I don't think it's what you mean here. I think that if you just say
>> "relation" you will convey the right thing.
>
> We didn't expect to get this reading from anybody, especially since it
> is immediately obvious that this relation is not an "equivalence
> relation" in the theoretical-algebra sense. It is used in the looser
> sense of the final note of Section 5.1 of "RDF Concepts" [3].

Yes, it is pretty obvious, but still it took time for me to check that  
the term was being used in a way that was unfamiliar to me, as opposed  
to there being some kind of typo or other mistake. Changing the  
meaning of a widely used mathematical term in what is essentially a  
piece of mathematics seems unfortunate.

> On Thu Nov 20 12:06:14 2008 Jonathan Rees said:
>
>>
>> I'm diving a bit deeper into the relation between RDF formal
>> semantics [1] and POWDER formal semantics [2], and have found another
>> glitch.
>>
>> POWDER says:
>>
>>          o uuu is in the domain of I, with I(uuu)=x
>>
>> Clearly uuu is meant to be an IRI. But RDF semantics says that the
>> "domain" of an interpretation I is a set of resources (somewhat
>> confusingly, since I is also used as a function that has a domain
>> that is syntactic). I think you mean for uuu to belong to V, the
>> "vocabulary" of I, which is the domain not of I but of IS:
>
> In this case, I is used as a function, as shown by its being
> applied to uuu in "I(uuu)".

Right, I should have read further into the RDF semantics rec. My  
mistake.

> This function has as its domain or universe a non-empty set IR,
> which includes as a subset the literal vocabulary LV.
>
>> "A set of names is referred to as a vocabulary [V]." ... "4. A
>> mapping IS from URI references in V into (IR union IP)" ... "if E is
>> a URI reference in V then I(E) = IS(E)"
>>
>> I couldn't find any particular restrictions on what V might be; it
>> could be empty, or the set of IRIs, or the set of IRIs occurring in
>> the graph, or anything else. I would guess that in applying an
>> interpretation to a graph, the name (IRI) set is meant to at least
>> contain the vocabulary of the graph (the IRIs occurring in it), but it
>> could be limited to it.
>
> There is also no restriction in the RDF doc (as far as I could tell)
> that the set of URIrefs be disjoint from LV. (NOTE that OWL-DL
> requires that owl:ThingS and LV are disjoint, but the domain of IS is
> the references, not the logical entities themselves). The POWDER
> extension implies that URIrefs are in LV, so that:
>
>  if E is a URI reference in V then I(E) = IS(E)
>
> will take us from an rdf:Literal to the resource, if the literal
> happens to be a well-formed URI reference; applying IL(E) otherwise.
>
> The semantic extension itself is needed to also be able to apply IL to
> the XMLLiteral so that a regexp can be applied to the actual string.
>
> Please comment.

I was commenting not so much on whether the domain of V contains  
things that aren't URIs, but rather on *which* set of URIs was the  
domain of IS. As for the first I trust that Pat dealt with it  
properly, but to me the interaction between entailment and URI  
matching is nonobvious. Saying nothing about the interaction is  
probably correct. It is left open whether the domain of IS contains  
only a few URIs or infinitely many of them. Perhaps explaining this  
would be more confusing than being silent.

The issue is that interpretations will lead to different conclusions  
depending solely on the choice of domain of IS. E.g. suppose that as  
axioms you have xa, xb, xc belonging to some class C. Can you conclude  
that all things having URIs of the form x* belong to C? This depends  
on whether there is another URI xd in the domain of IS interpreted to  
be a non-C. This could be true in some interpretations and not others.  
Now a prover will correctly conclude that the class of resources  
possessing URIs matching x* is not *necessarily* a subset of C, since  
the prover is testing entailment (all models). This seems subtle  
enough that it makes me worry, but I cannot point you to any flaw.
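
The xa/xb/xc example above can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions that are not part of RDF or POWDER semantics: every URI is taken to denote a distinct resource, an "interpretation" is reduced to the set of URIs whose referents are in C, and the names models, pattern_within_c and holds_in_all_models are hypothetical.

```python
from itertools import product
import re

AXIOMS = {"xa", "xb", "xc"}   # URIs asserted to denote members of class C
PATTERN = re.compile(r"x.*")  # stands in for the URI pattern x*

def models(vocabulary):
    """Yield each set of C-members over this vocabulary that satisfies the axioms."""
    free = sorted(set(vocabulary) - AXIOMS)  # URIs the axioms say nothing about
    for choice in product([False, True], repeat=len(free)):
        yield set(AXIOMS) | {u for u, keep in zip(free, choice) if keep}

def pattern_within_c(vocabulary, in_c):
    """True if every URI in the vocabulary that matches x* denotes a member of C."""
    return all(u in in_c for u in vocabulary if PATTERN.fullmatch(u))

def holds_in_all_models(vocabularies):
    """Entailment-style check: the claim must hold in every model considered."""
    return all(pattern_within_c(v, m) for v in vocabularies for m in models(v))

small = {"xa", "xb", "xc"}   # domain of IS limited to the URIs in the axioms
large = small | {"xd"}       # domain of IS with one more URI, no new axioms

print(holds_in_all_models([small]))         # True: looks like induction applies
print(holds_in_all_models([small, large]))  # False: a model with xd outside C exists
```

With only the small vocabulary the claim holds in every model, which is the accidental property I am uneasy about; once the larger vocabulary is admitted, a single model with xd outside C is enough to block the entailment.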

>> Now there are two problems with this. First, you want to talk about
>> IRIs, not URIrefs, right? That is, if the RDF graph contains the
>> relative URIref "a/b", you would prefer to match against the fully
>> resolved IRI, not the URIref, since otherwise the truth of a POWDER
>> graph would depend on choice of base IRI, which makes no sense. So
>> you need to have a story that accounts for the base URI (or other
>> resolution mechanism), or else arranges for all IRIs to be fully
>> resolved by the time they get to this point. Perhaps this is already
>> taken care of, and I'm just missing it.
>
> It is our understanding that URIrefs are by definition absolute and
> unique identifiers across any RDF document. (cf. RDF Semantics,
> Sect 1.2 [4]). Since RDF semantics does not restrict how this can
> be achieved, neither does POWDER at the abstract, formal level.

I agree. Please disregard.

> At the operational level, one can easily imagine that certain
> normalization and canonicalization steps are required. These are
> specified in the DR and Grouping docs.
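
As a rough illustration of why resolution has to happen before matching, the sketch below resolves the relative reference "a/b" against two different bases and applies the same pattern to each result. It uses Python's standard urljoin and is not the normalization procedure of the DR or Grouping docs; the example.org URIs, the pattern and the function name matches_after_resolution are assumptions made up for the example.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern standing in for an IRI group definition.
PATTERN = re.compile(r"http://example\.org/a/.*")

def matches_after_resolution(base, reference):
    """Resolve the reference against the base, then match the absolute result."""
    absolute = urljoin(base, reference)
    return PATTERN.fullmatch(absolute) is not None

# The same relative reference gives different answers under different bases.
print(matches_after_resolution("http://example.org/", "a/b"))
print(matches_after_resolution("http://example.org/other/", "a/b"))
```

The first call matches and the second does not, so a match against unresolved references would depend on the choice of base.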
>
>> Second, the restriction of POWDER formal semantics to the IRIs that
>> are in the vocabulary of the interpretation (= domain of IS = V) will
>> only sometimes agree with the informal semantics that you are trying
>> to capture. Suppose a graph contains assertions that the resources
>> named by a/x, a/y, and a/z (imagine now these are IRIs) are green. If
>> these are the only IRIs of the form a/* occurring in V, one could
>> conclude, according to your semantics, that all the resources in the
>> group defined by IRI pattern a/* are green. But there might be another
>> resource, a/w, that is not green, but just happens to not be mentioned
>> in this particular graph. The formal semantics would agree with the
>> informal semantics only in the case that all resources with IRIs
>> matching a/* occur in the graph.
>
> I might be misunderstanding you here, but you seem to be introducing
> an inductive step which is not present in either the operational or the
> formal semantics. If a POWDER document asserts that "a/*" are green,
> that is because its author makes this assertion based on their
> understanding and/or inspection of the real-world domain; the
> assertion holds regardless of which entities of "a/*" happen to be
> represented in any given RDF document.

I'm not suggesting induction, I'm just saying that, even without it,  
there will be models in which induction will *seem* to apply just  
because the domain of IS is so small. But to repeat the above, this  
may not matter since we are checking for entailment and a few rogue  
models that have accidental properties don't necessarily spoil the  
batch.

> Once again, for absolute clarity, POWDER documents DO NOT describe
> generalization or explanation processes; POWDER documents are
> purely deductive models, just as OWL is. If you have gotten a
> different impression, please point out the part of the text where
> you feel this impression is given so that we can clarify the text.
>
>> On the other hand, model theoretic semantics (for either RDF or
>> OWL-DL) might handle this well, since entailment is quantified over
>> all possible models, and at least one of these will include
>> information about a/w. But the fact that some of your interpretations
>> are wrong means that you will get fewer entailments than you might
>> otherwise like. It might be worth the effort some time to restrict
>> interpretations further.
>
> In the light of this paragraph, it seems possible that I
> misunderstood your previous paragraph, and what you are saying above
> is that we are asserting the *existence* of resources that match
> regexps, thus losing perfectly valid interpretations of POWDER-RDF
> graphs. This is not the case since there is no POWDER/XML document
> that (using the prescribed transform) will generate OWL/RDF with
> existentially quantified assertions about owl:Thing instances; POWDER
> docs assert universally quantified implications (subsumptions) linking
> existentially-quantified nodes representing intermediate classes
> (irisets and descriptorsets); such classes must exist but may very
> well be empty. Furthermore the POWDER Processor semantics make it
> clear that one can only query a POWDER doc about the description of a
> named resource.
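
The deductive reading described in the quoted paragraph can be sketched as a lookup keyed on a named resource. This is a toy illustration, not the POWDER Processor interface: the rule table, the example.org pattern, the "colour" descriptor and the function name describe are all hypothetical.

```python
import re

# Hypothetical rule set: each entry pairs an IRI pattern (an "iriset")
# with the descriptors (a "descriptorset") asserted for matching IRIs.
RULES = [
    (re.compile(r"http://example\.org/a/.*"), {"colour": "green"}),
]

def describe(iri):
    """Collect the descriptors of every rule whose pattern matches the IRI."""
    result = {}
    for pattern, descriptors in RULES:
        if pattern.fullmatch(iri):
            result.update(descriptors)
    return result

# The answer depends only on the IRI asked about, not on which other
# resources happen to be mentioned anywhere else.
print(describe("http://example.org/a/w"))
print(describe("http://example.org/b/w"))
```

Nothing in the rule table asserts that any matching resource exists; an iriset with no members simply never produces an answer.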


Hope the above explains the point better. An awful lot hinges on the  
choice of domain of IS. But I guess this is no different from the  
situation without URI matching - the addition of a new URI to the  
domain of IS, without the addition of new axioms, could easily lead to  
a significantly different theory, even without matching.

I guess I was hoping to save myself the trouble of working this  
through in detail, and that's why I wondered if the formal semantics  
had had independent review from someone familiar with OWL semantics.

A test of the correctness and tractability of the extension would be  
agreement from one of the groups providing DL reasoners to implement  
it. Has this been pursued?

Jonathan
Received on Monday, 1 December 2008 01:45:14 UTC