Re: Possible New Model (was Re: status report - formal layer) from Phil Archer on 2008-01-22 (public-powderwg@w3.org from January 2008)

From: Phil Archer <parcher@icra.org>
Date: Tue, 22 Jan 2008 09:34:52 +0000
To: public-powderwg@w3.org
CC: Jeremy Carroll <jjc@hpl.hp.com>
Message-ID: <4795B8BC.3050107@icra.org>
There are some errata in this e-mail.

Example 1 - the ex:shape property should have the value "square"

Example 4, DR_2 - the <includePathStartsWith> property should be 
deleted, i.e. the URI sets of both DRs in Example 4 should be identical 
(but the descriptions are different).

The sample files and graphs have been corrected.

Phil Archer wrote:
> 
> As promised on Friday afternoon, here are some musings on a possible new 
> structure for Operational POWDER (POWDER-O), taking into account the 
> recent discussion.
> 
> 
> Example 1
> =========
> RDF/XML: http://www.fosi.org/projects/powder/dr-o1.rdf
> Graph: http://www.fosi.org/projects/powder/servlet_63142.png
> 
> Let's begin with a simple case, that all resources dereferenced from a 
> URI with a host component ending with example.org are red and square (I 
> got these line numbers from the RDF validator but I've knocked off the 
> namespace declarations).
> 
> 11:   <rdf:Description rdf:about="">
> 12:     <foaf:maker 
> rdf:resource="http://authority.example.org/foaf.rdf#me" />
> 13:     <dcterms:issued>2007-12-14</dcterms:issued>
> 14:     <wdr:validFrom>2008-01-01</wdr:validFrom>
> 15:     <wdr:validUntil>2008-12-31</wdr:validUntil>
> 16:   </rdf:Description>
> 17:
> 18:   <wdr:DR rdf:ID="DR_1">
> 19:     <wdr:hasScope rdf:parseType="Resource">
> 20:       <wdr:includeHosts>example.org</wdr:includeHosts>
> 21:       <wdr:hasDescriptors rdf:parseType="Resource">
> 22:         <ex:colour>red</ex:colour>
> 23:         <ex:shape>square</ex:shape>
> 24:       </wdr:hasDescriptors>
> 25:     </wdr:hasScope>
> 26:   </wdr:DR>
> 
> The document was made by "...#me" on 14th December 2007 and is valid for 
> 2008. If you trust "...#me" and today's date is in 2008, then the DR in 
> the document can be transformed into its semantic encoding (DR-S) and 
> the RDF merged into your triple store for processing.
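
A minimal sketch of that trust-and-validity gate, in Python. The namespace URIs are assumptions (the listing above omits the declarations), and is_acceptable and trusted_makers are hypothetical names, not part of any POWDER draft:

```python
import xml.etree.ElementTree as ET
from datetime import date

# Assumed namespace URIs; the quoted listing does not show the declarations.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "wdr": "http://www.w3.org/2007/05/powder#",
}

# The attribution/validity block from Example 1, wrapped in an rdf:RDF root.
DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:wdr="http://www.w3.org/2007/05/powder#">
  <rdf:Description rdf:about="">
    <foaf:maker rdf:resource="http://authority.example.org/foaf.rdf#me" />
    <dcterms:issued>2007-12-14</dcterms:issued>
    <wdr:validFrom>2008-01-01</wdr:validFrom>
    <wdr:validUntil>2008-12-31</wdr:validUntil>
  </rdf:Description>
</rdf:RDF>"""


def is_acceptable(rdf_xml, trusted_makers, today):
    """True if the document's maker is trusted and today is within its validity range."""
    root = ET.fromstring(rdf_xml)
    header = root.find("rdf:Description", NS)
    maker = header.find("foaf:maker", NS).get("{%s}resource" % NS["rdf"])
    valid_from = date.fromisoformat(header.findtext("wdr:validFrom", namespaces=NS))
    valid_until = date.fromisoformat(header.findtext("wdr:validUntil", namespaces=NS))
    return maker in trusted_makers and valid_from <= today <= valid_until
```

Only when this returns True would a client go on to transform the DR and merge the result.
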
> 
> The single DR in the document has a URI set defined solely in terms of 
> its host (example.org) and two descriptors from the ex namespace.
> 
> Now let's make it a little more complicated and have one DR but two URI 
> sets.
> 
> Example 2
> =========
> RDF/XML: http://www.fosi.org/projects/powder/dr-o2.rdf
> Graph: http://www.fosi.org/projects/powder/servlet_63180.png
> 
> 11:   <rdf:Description rdf:about="">
> 12:     <foaf:maker 
> rdf:resource="http://authority.example.org/foaf.rdf#me" />
> 13:     <dcterms:issued>2007-12-14</dcterms:issued>
> 14:     <wdr:validFrom>2008-01-01</wdr:validFrom>
> 15:     <wdr:validUntil>2008-12-31</wdr:validUntil>
> 16:   </rdf:Description>
> 17:
> 18:   <wdr:DR rdf:ID="DR_1">
> 19:
> 20:     <wdr:hasScope rdf:parseType="Collection">
> 21:
> 22:       <wdr:URIset rdf:ID="URIset_1">
> 23:         <wdr:includeHosts>example.org</wdr:includeHosts>
> 24:         <wdr:includePathStartsWith>/foo</wdr:includePathStartsWith>
> 25:         <wdr:hasDescriptors rdf:parseType="Resource">
> 26:           <ex:colour>red</ex:colour>
> 27:         </wdr:hasDescriptors>
> 28:       </wdr:URIset>
> 29:
> 30:       <wdr:URIset rdf:about="#URIset_2" />
> 31:
> 32:     </wdr:hasScope>
> 33:
> 34:   </wdr:DR>
> 35:
> 36:   <wdr:URIset rdf:ID="URIset_2">
> 37:     <wdr:includeHosts>example.org</wdr:includeHosts>
> 38:     <wdr:hasDescriptors rdf:parseType="Resource">
> 39:       <ex:colour>blue</ex:colour>
> 40:     </wdr:hasDescriptors>
> 41:   </wdr:URIset>
> 
> The attribution and validity information (lines 11 - 16) remains 
> unchanged. Now, though, our DR contains two URIsets and two attendant 
> descriptions. All resources identified by URIs that have host 
> components ending with example.org are blue except those where the path 
> starts with /foo, which are red.
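
A rough Python sketch of how a client might process the Collection in order, first match wins. The dict layout and the function names are hypothetical, and reading "host ending with example.org" as a match on whole domain labels is an assumption:

```python
from urllib.parse import urlsplit

# Ordered as in the DR's Collection: the exception first, the default last.
URI_SETS = [
    {"hosts": ["example.org"], "path_starts_with": ["/foo"],
     "descriptors": {"colour": "red"}},
    {"hosts": ["example.org"], "path_starts_with": [],
     "descriptors": {"colour": "blue"}},
]


def host_matches(host, suffix):
    """Match the host itself or any subdomain of it (assumed interpretation)."""
    return host == suffix or host.endswith("." + suffix)


def in_uri_set(uri, uri_set):
    """True if the URI satisfies the host constraint and any path-prefix constraint."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    if not any(host_matches(host, h) for h in uri_set["hosts"]):
        return False
    prefixes = uri_set["path_starts_with"]
    return not prefixes or any(parts.path.startswith(p) for p in prefixes)


def describe(uri, uri_sets):
    """Return the descriptors of the first URI set that contains the URI, else None."""
    for uri_set in uri_sets:
        if in_uri_set(uri, uri_set):
            return uri_set["descriptors"]
    return None
```

For instance, describe("http://www.example.org/foo/page", URI_SETS) picks the first set, while a path such as /bar falls through to the default.
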
> 
> If the trust and validity information is to your satisfaction, then you 
> can transform this operational data into two DR-S instances and merge 
> the RDF. N.B. The transformation must contain data from both URI sets thus:
> 
> <wdr:URISet rdf:ID="URISet_1">
>   <owl:intersectionOf rdf:parseType="Collection">
>     <owl:Restriction>
>       <owl:onProperty rdf:resource="&wdr;includeHosts" />
>       <owl:hasValue>example.org</owl:hasValue>
>     </owl:Restriction>
>     <owl:Restriction>
>       *<owl:onProperty rdf:resource="&wdr;includePathStartsWith" />*
>       <owl:hasValue>/foo</owl:hasValue>
>     </owl:Restriction>
>   </owl:intersectionOf>
> </wdr:URISet>
> 
> And
> 
> <wdr:URISet rdf:ID="URISet_2">
>   <owl:intersectionOf rdf:parseType="Collection">
>     <owl:Restriction>
>       <owl:onProperty rdf:resource="&wdr;includeHosts" />
>       <owl:hasValue>example.org</owl:hasValue>
>     </owl:Restriction>
>     <owl:Restriction>
>       *<owl:onProperty rdf:resource="&wdr;excludePathStartsWith" />*
>       <owl:hasValue>/foo</owl:hasValue>
>     </owl:Restriction>
>   </owl:intersectionOf>
> </wdr:URISet>
> 
> The excludePathStartsWith property in URISet 2 is generated by:
> 
> 1. Noting the defining features of URISet 1 (example.org and /foo)
> 2. Noting the defining features of URISet 2 (example.org)
> 3. Taking the inverse of those present in 1 but absent in 2. 
> (example.org and NOT /foo)
> 
> This is going to get complicated when there are n URISets in the 
> sequence but hey, let's be optimistic...
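
For the two-set case only, the three steps could be sketched like this in Python. derive_exclusions is a hypothetical helper, and the includeX to excludeX renaming is an assumption based on the one property pair shown above; it does not attempt the n-set case:

```python
def derive_exclusions(exception_set, default_set):
    """Copy default_set and add an exclude* constraint for every include*
    feature that appears in exception_set but not in default_set."""
    derived = dict(default_set)
    for prop, value in exception_set.items():
        if prop in default_set:
            continue  # shared feature, e.g. includeHosts: nothing to negate
        if not prop.startswith("include"):
            raise ValueError("don't know how to negate " + prop)
        # includePathStartsWith -> excludePathStartsWith
        derived["exclude" + prop[len("include"):]] = value
    return derived


uri_set_1 = {"includeHosts": "example.org", "includePathStartsWith": "/foo"}
uri_set_2 = {"includeHosts": "example.org"}
semantic_set_2 = derive_exclusions(uri_set_1, uri_set_2)
```

Here semantic_set_2 carries includeHosts plus the generated excludePathStartsWith, matching the second OWL block.
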
> 
> We can add further DRs into the document iff the validity information is 
> the same. I won't copy it out here but such an example is available:
> 
> Example 3
> =========
> RDF/XML: http://www.fosi.org/projects/powder/dr-o3.rdf
> Graph: http://www.fosi.org/projects/powder/servlet_63202.png
> 
> This says that everything on example.org is blue except things with a 
> URI path starting with /foo which are red. Separately, everything on 
> example.org is circular except things with a URI path starting with /bar 
> which are square. All assertions are subject to the same validity/trust 
> conditions. And you process the Collections within each of the DRs to 
> get these results or, if you want to merge the RDF into your triple 
> store, you perform the transformation to get the full OWL-based version.
> 
> Two people providing different descriptions with different validity 
> dates will need to publish separate RDF/XML instances.
> 
> N.B.
> 
> I have written these examples with the exceptional case written within 
> the DR and the 'default' description (there's no other word for it) as a 
> separate block. There is no difference in structure but it does reflect 
> what I expect to be the workflow reality - we're describing everything 
> in a given URI set as being like _this_ *except* things that have 
> _these_ things in their URIs which are like _this_ instead. Less 
> abstract: there is no sex, drugs or rock and roll on fosi.org _except_ 
> where the URI path component begins with /associates where there might 
> be. Operationally, you write the general case first and then worry about 
> the exceptions in a separate thought process.
> 
> All of the RDF/XML instances given so far have a single URI so that a 
> Web site can include an identical link element pointing to such a file 
> irrespective of whether the resource is red, blue, square or circular - 
> the POWDER client will sort it out.
> 
> But, where the Web site has no (discernible or usable) URI structure, 
> this isn't good enough. We need to include a new conditional statement, 
> linkFrom, that says that if you trust "...#me" and today's date is 
> within range *and* the resource includes a link to a specific DR, _then_ 
> it is valid.
> 
> The next example shows this.
> 
> Example 4
> =========
> RDF/XML: http://www.fosi.org/projects/powder/dr-o4.rdf
> Graph: http://www.fosi.org/projects/powder/servlet_63230.png
> 
> 11:   <rdf:Description rdf:about="">
> 12:     <foaf:maker 
> rdf:resource="http://authority.example.org/foaf.rdf#me" />
> 13:     <dcterms:issued>2007-12-14</dcterms:issued>
> 14:     <wdr:validFrom>2008-01-01</wdr:validFrom>
> 15:     <wdr:validUntil>2008-12-31</wdr:validUntil>
> 16:     <wdr:linkFrom>true</wdr:linkFrom>
> 17:   </rdf:Description>
> 18:
> 19:   <wdr:DR rdf:ID="DR_1">
> 20:     <wdr:hasScope rdf:parseType="Resource">
> 21:       <wdr:includeHosts>example.org</wdr:includeHosts>
> 22:       <wdr:hasDescriptors rdf:parseType="Resource">
> 23:         <ex:texture>smooth</ex:texture>
> 24:       </wdr:hasDescriptors>
> 25:     </wdr:hasScope>
> 26:   </wdr:DR>
> 27:
> 28:
> 29:   <wdr:DR rdf:ID="DR_2">
> 30:     <wdr:hasScope rdf:parseType="Resource">
> 31:       <wdr:includeHosts>example.org</wdr:includeHosts>
> 32:       <wdr:hasDescriptors rdf:parseType="Resource">
> 33:         <ex:texture>rough</ex:texture>
> 34:       </wdr:hasDescriptors>
> 35:     </wdr:hasScope>
> 36:   </wdr:DR>
> 
> Notice line 16 which introduces the linkFrom element.
> 
> Then the URI sets for each DR are identical - everything on example.org 
> - you need to refer to the linkFrom element to decide which is 
> applicable (actually, you'd probably define the URI set once in a 
> separate block and refer to it from both DRs). I know this is a pain - 
> deliberately publishing two sets of triples that say different things 
> about the same subjects - but, as they say in Vladivostok, c'est la guerre.
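
One way to read the linkFrom condition, sketched in Python. The dict fields and the dr_applies name are hypothetical, and treating "a link to a specific DR" as the document URI plus "#" plus the DR's rdf:ID is an assumption:

```python
from datetime import date

# Hypothetical in-memory view of the Example 4 document header.
DOC_4 = {
    "uri": "http://www.fosi.org/projects/powder/dr-o4.rdf",
    "maker": "http://authority.example.org/foaf.rdf#me",
    "valid_from": date(2008, 1, 1),
    "valid_until": date(2008, 12, 31),
    "link_from": True,
}


def dr_applies(doc, dr_id, resource_links, today, trusted_makers):
    """True if the DR may be applied to a resource that carries resource_links."""
    if doc["maker"] not in trusted_makers:
        return False
    if not (doc["valid_from"] <= today <= doc["valid_until"]):
        return False
    if doc["link_from"]:
        # The resource itself must point at this specific DR.
        return doc["uri"] + "#" + dr_id in resource_links
    return True
```

A resource linking to ...dr-o4.rdf#DR_2 would then pick up the "rough" description and not the "smooth" one, even though both URI sets cover it.
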
> 
> What about storing lots of DRs in a single RDF/XML instance? Well, it's 
> clear that we can't. If you're a content provider and you need to have 
> several DRs covering different domains of interest then you simply 
> create multiple RDF/XML files and link as you need to. If you're a 
> labelling authority you store your DRs in a database with a front end 
> like, oh I dunno, http://repository.icra.org/label?id=1 :-) which calls, 
> I mean, would call a script that would return a single RDF/XML instance.
> 
> This might mean we re-visit the issue of whether we want to put a hint 
> in the link element as to what vocabularies are used in a given DR.
> 
> If this is along the right lines then it seems to me we need to revisit 
> the question of whether a DR-O is written in RDF/XML, just XML or 
> something in between. All the examples above are valid RDF/XML but we 
> have not got rid of the problem of generic RDF tools sucking in the 
> triples and trying to make sense of them out of context. Personally I'm 
> tending towards DR-Os being written in XML with only DR-Ss in RDF. It 
> seems that GRDDL cannot transform RDF/XML and, I think I'm right in 
> saying, that XSLT will have difficulty too.
> 
> I'll do some more playing around with that now...
> 
> Phil.
> 
> 
> 
> Phil Archer wrote:
>>
>> Good, this feels as if we're making progress (or rather, you're making 
>> progress in a promising direction :-)).
>>
>> I'll do some more playing around on Monday morning and see if I come 
>> up against anything we're missing.
>>
>> Have a good weekend and thank you.
>>
>> Phil.
>>
>> Jeremy Carroll wrote:
>>>
>>> Phil Archer wrote:
>>>>
>>>>
>>>> Jeremy Carroll wrote:
>>>> [snip]
>>>>>
>>>>> If we choose to make the GRDDL transform make the DR-S include the 
>>>>> subClassOf relationship as above, then we have the issue that in a 
>>>>> package (or any collection of DRs) some of the DRs may be valid and 
>>>>> some may be invalid, and all the subClassOf relationships are in 
>>>>> the same file, and it is unclear how to distinguish the ones we 
>>>>> want to claim (the valid ones), from the ones we don't (the invalid 
>>>>> ones).
>>>>
>>>> I take this point. It may be that we can do something about it 
>>>> though. We have so far taken the view that a DR should be 
>>>> self-contained and that a package is therefore a group of 
>>>> self-contained units. Doing this means that the validity information 
>>>> (and attribution) is NOT inherited by DRs in the package. However... 
>>>> we then had to introduce the idea of using dcterms:isPartOf to force 
>>>> the processing of these "discrete DRs" in a particular order [1]. In 
>>>> such a scenario, yes, each DR would have its own validity and 
>>>> attribution.
>>>>
>>>> But it doesn't have to be this way...
>>>>
>>>> It would be possible I think to work with the package carrying the 
>>>> validity information that was then inherited by the DRs within that 
>>>> package - which I think from what you say would make life easier?
>>>>
>>>>
>>>
>>>
>>> Yes - I was thinking along these lines.
>>>
>>> I was discussing this with Stuart - a possible view is then:
>>>
>>> The unit of a POWDER description is a document, which may contain a 
>>> single wdr:DR or a single wdr:Package.
>>>
>>> Either way the document has information pertinent to the relevance of 
>>> the document:
>>> e.g. validity and who vouches for it.
>>>
>>> Operationally the process of trust is as follows:
>>>
>>> for each possible document that you might be considering, you read 
>>> that document, then understanding what that document says about 
>>> itself, if you are satisfied that you want to act on that document 
>>> (e.g. it is valid, and is vouched for by an appropriate authority), 
>>> then you load it into your knowledge base (formally corresponding to 
>>> an RDF merge using the POWDER-S GRDDL result)
>>>
>>> The resulting RDF graph consists of only valid POWDER DRs, which have 
>>> been vouched for by appropriate authorities.
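
The per-document process described above might be sketched as follows in Python. Each document is modelled as a hypothetical dict whose "triples" stand in for the POWDER-S GRDDL result, and the RDF merge is approximated by a plain set union, which ignores blank-node renaming:

```python
from datetime import date


def build_knowledge_base(documents, trusted_authorities, today):
    """Merge the triples of every document that passes the pragmatic checks."""
    knowledge_base = set()
    for doc in documents:
        vouched = doc["vouched_by"] in trusted_authorities
        valid = doc["valid_from"] <= today <= doc["valid_until"]
        if vouched and valid:
            knowledge_base |= doc["triples"]  # simplified stand-in for an RDF merge
    return knowledge_base


documents = [
    {"vouched_by": "http://authority.example.org/foaf.rdf#me",
     "valid_from": date(2008, 1, 1), "valid_until": date(2008, 12, 31),
     "triples": {("ex:URISet_1", "rdfs:subClassOf", "ex:Red")}},
    {"vouched_by": "http://authority.example.org/foaf.rdf#me",
     "valid_from": date(2007, 1, 1), "valid_until": date(2007, 12, 31),
     "triples": {("ex:URISet_1", "rdfs:subClassOf", "ex:Green")}},
]
kb = build_knowledge_base(
    documents, {"http://authority.example.org/foaf.rdf#me"}, date(2008, 6, 1))
```

The expired 2007 document never reaches the graph, so time and trust are settled before any logical treatment begins.
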
>>>
>>> From the formal side, the motivations for doing it this way are:
>>> - it is known that temporal logic (i.e. dealing with time in a 
>>> logical way) is a hard problem
>>> - it is known that dealing with trust in logic is a hard problem
>>> - it is clear that POWDER deals with both time and trust, but in 
>>> simple ways
>>> - hence it feels inappropriate to do the time and trust parts in the 
>>> formal logical layer, and better to deal with them in a pragmatic layer 
>>> prior to, but informed by, the logical treatment
>>>
>>> It is a limitation of the current RDF technology that it is hard to 
>>> talk about part of an RDF graph, and its validity, or who vouches for 
>>> that part - hence the desire to talk about documents containing 
>>> RDF/XML that expresses those parts of the graph.
>>> I think it is possible to design documents of the 'right' size so that:
>>>
>>> - validity and vouching are pertinent on a document by document level 
>>> (and not on a finer grain)
>>> - documents are large enough that the small scale powder user need 
>>> only write one document for their site, or maybe two.
>>>
>>> - that the expectations on large publishers who may need to make 
>>> declarations that fit into complex workflows are intelligible and not 
>>> too burdensome.
>>>
>>> Jeremy
>>>
>
Received on Tuesday, 22 January 2008 09:35:18 UTC