Re: POWDER: thoughts

Jeremy Carroll wrote:
> - resource descriptions and monotonicity
>   I got a bad non-monotonic feeling while reading the powder-grouping 
> WD; interestingly it was while reading bits that had clearly been 
> written with the issue in mind :(
> 

This is getting harder ...

At first blush, the POWDER grouping document seems to have been written 
with a view to the additive nature of RDF, and hence to respect 
monotonicity.

The formal semantic definition I gave on Friday for includeHosts follows 
that pattern.

But .... it doesn't do what is wanted.

So here goes: I'll try to describe what's written, and why it isn't what is 
required. I'm also trying to get together in my head a positive solution, 
which hides some of the complexity ... but that'll have to be in a later 
message. So while this is a negative message, please don't take it 
pessimistically.

So, as well as, say, includeHosts, POWDER also allows matching on other 
aspects of the URI, e.g. includeSchemes.

<wdr:ResourceSet rdf:ID="A">
   <wdr:includeSchemes>http https</wdr:includeSchemes>
   <wdr:includeHosts>example.org</wdr:includeHosts>
</wdr:ResourceSet>

<wdr:ResourceSet rdf:ID="B">
   <wdr:includeSchemes>http https</wdr:includeSchemes>
</wdr:ResourceSet>


<wdr:ResourceSet rdf:ID="C">
   <wdr:includeHosts>example.org</wdr:includeHosts>
</wdr:ResourceSet>

This gives three resource sets: #A, #B and #C.

The formal semantics of includeHosts that I gave on Friday suggests that the 
subject is a class all of whose members relate to the relevant host.

Thus the interpretation I(#A) will have a class extension whose members all 
come from example.org, and so it could also be the same as the class 
extension of I(#C).
This reflects monotonicity: the addition of the first 
includeSchemes triple prohibits certain interpretations, but doesn't 
license any interpretation that was not licensed in the first place.

But this contrasts directly with the explicit objective from the 
grouping WD:

http://www.w3.org/TR/2007/WD-powder-grouping-20071031/#design

2 It must be possible to determine with certainty whether a given 
resource is or is not an element of the Resource Set

Thus a resource identified by
   ftp://www.example.org/pub/foo.txt
is necessarily in #C, but not in #A or #B.
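
To make the "with certainty" reading concrete, here is a minimal Python sketch that treats each resource set as fully defined by the constraints listed for it. The dictionary encoding, the function names and the subdomain rule in host_matches are assumptions made for illustration; they are not taken from the grouping WD.

```python
from urllib.parse import urlsplit

# Hypothetical encoding of the three resource sets above: each maps a
# property name to the whitespace-separated list given in the RDF/XML.
RESOURCE_SETS = {
    "A": {"includeSchemes": "http https", "includeHosts": "example.org"},
    "B": {"includeSchemes": "http https"},
    "C": {"includeHosts": "example.org"},
}

def host_matches(host, pattern):
    # Assumption: a host matches if it equals the pattern or is a subdomain of it.
    return host == pattern or host.endswith("." + pattern)

def in_resource_set(uri, constraints):
    """Closed reading: a URI is in the set iff it meets every listed constraint."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if "includeSchemes" in constraints:
        if scheme not in constraints["includeSchemes"].lower().split():
            return False
    if "includeHosts" in constraints:
        patterns = constraints["includeHosts"].lower().split()
        if not any(host_matches(host, p) for p in patterns):
            return False
    return True

def membership(uri):
    """Report, for each resource set, whether the URI is a member."""
    return {name: in_resource_set(uri, c) for name, c in RESOURCE_SETS.items()}

print(membership("ftp://www.example.org/pub/foo.txt"))
```

Under this reading every URI gets a definite yes or no for each set, which is exactly what the monotonic, subclass-style semantics cannot deliver.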

In terms of OWL 1.1:

  we could imagine a magic property hasURI which is given the obvious 
semantics via a semantic extension (this would be slightly easier to 
specify than the includeHosts property).

Then each of the properties in the groupings document can be seen as 
restrictions on the hasURI property, with an appropriate user-defined 
datatype to define the match.

e.g.

#A wdr:includeSchemes "http https" .

<==>

#A rdfs:subClassOf _:r .
_:r rdf:type owl:Restriction .
_:r owl:onProperty wdr:hasURI .
_:r owl:someValuesFrom _:d .
_:d rdf:type owl:DataRange .
_:d owl11:derivedFrom xsd:anyURI .
_:d owl11:onFacet  xsd:pattern .
_:d owl11:constraint "^(http|https):" .


i.e. we consider the class of all things that have a hasURI property 
with a value which conforms with the (anonymous) datatype derived from 
xsd:anyURI, matching the given pattern (which actually needs to be a bit 
more complicated, since schemes are case insensitive).
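
As a rough sketch of what "a bit more complicated" might look like, the Python below expands each scheme letter into a two-case character class, on the assumption that the pattern language offers no case-insensitivity flag. The function name scheme_pattern is hypothetical, and the check uses Python's re module as a stand-in, not the XML Schema pattern facet itself.

```python
import re

def scheme_pattern(schemes):
    """Build a flag-free regex that matches any listed scheme, ignoring case."""
    def any_case(word):
        # "http" becomes "[hH][tT][tT][pP]"; non-letters are escaped as-is.
        return "".join(
            f"[{ch.lower()}{ch.upper()}]" if ch.isalpha() else re.escape(ch)
            for ch in word
        )
    alternatives = "|".join(any_case(s) for s in schemes.split())
    return f"^({alternatives}):"

pattern = scheme_pattern("http https")
print(pattern)
print(bool(re.match(pattern, "HTTP://example.org/")))
print(bool(re.match(pattern, "ftp://example.org/")))
```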

The key thing to note is that this is a subClassOf triple, whereas the 
quoted goal #2 actually wants a fixed class definition, as the 
intersection of all the restrictions given.
So for #B, which is defined using only includeSchemes, we would get:
#B owl:equivalentClass _:r .
_:r rdf:type owl:Restriction .
_:r owl:onProperty wdr:hasURI .
_:r owl:someValuesFrom _:d .
_:d rdf:type owl:DataRange .
_:d owl11:derivedFrom xsd:anyURI .
_:d owl11:onFacet  xsd:pattern .
_:d owl11:constraint "^(http|https):" .

(only the first triple is different)

And for #A we would define it as being the intersection of the two 
restrictions, using an owl:intersectionOf corresponding to the equation 
given in
http://www.w3.org/TR/2007/WD-powder-grouping-20071031/#methOutline

[roughly:
RS = D_RS^I = D_1^I ∩ D_2^I ∩ … ∩ D_n^I = (D_1 ∧ D_2 ∧ … ∧ D_n)^I.
]



As a bullet-proof example of why the example in the grouping document 
does not respect the RDF semantics, look at the following pair of 
descriptions of #A:

<wdr:ResourceSet rdf:ID="A">
   <wdr:includeSchemes>http https</wdr:includeSchemes>
   <wdr:includeHosts>example.org</wdr:includeHosts>
</wdr:ResourceSet>

<wdr:ResourceSet rdf:ID="A">
   <wdr:includeSchemes>http https</wdr:includeSchemes>
</wdr:ResourceSet>

By RDF Concepts and RDF Semantics the repeated includeSchemes triple 
counts only once, so that this is equivalent to:

<wdr:ResourceSet rdf:ID="A">
   <wdr:includeSchemes>http https</wdr:includeSchemes>
   <wdr:includeHosts>example.org</wdr:includeHosts>
</wdr:ResourceSet>


i.e. http://example.com/ is not part of the resource set, since its host 
is not example.org.

But ... by looking at

<wdr:ResourceSet rdf:ID="A">
   <wdr:includeSchemes>http https</wdr:includeSchemes>
</wdr:ResourceSet>

we see that it is ...

So in some way the monotonic discipline of RDF appears to be too severe.
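
The clash can be sketched in Python by treating each description as a set of triples and applying the closed reading that goal 2 asks for. The triple encoding, the names DOC1, DOC2, satisfies and closed_member, and the subdomain rule for hosts are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical triple encoding of the two descriptions of #A above.
DOC1 = {("A", "includeSchemes", "http https"), ("A", "includeHosts", "example.org")}
DOC2 = {("A", "includeSchemes", "http https")}

def satisfies(uri, prop, value):
    """Check one constraint triple against a URI."""
    parts = urlsplit(uri)
    if prop == "includeSchemes":
        return parts.scheme.lower() in value.lower().split()
    if prop == "includeHosts":
        host = (parts.hostname or "").lower()
        # Assumption: subdomains of a listed host also match.
        return any(host == h or host.endswith("." + h) for h in value.lower().split())
    raise ValueError(f"unknown property: {prop}")

def closed_member(uri, graph, subject):
    """Closed reading: a member iff every constraint stated in the graph holds."""
    return all(satisfies(uri, p, v) for s, p, v in graph if s == subject)

merged = DOC1 | DOC2          # RDF merge: the repeated triple counts once
uri = "http://example.com/"

print(merged == DOC1)                    # True: merging adds nothing to DOC1
print(closed_member(uri, DOC2, "A"))     # True: concluded from DOC2 alone
print(closed_member(uri, merged, "A"))   # False: the conclusion is retracted
```

A conclusion that holds for DOC2 but fails for a graph containing DOC2 is precisely what monotonic entailment rules out.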

Jeremy

Received on Monday, 17 December 2007 12:02:29 UTC