- From: Phil Archer <parcher@icra.org>
- Date: Mon, 21 Jan 2008 15:18:35 +0000
- To: Jeremy Carroll <jjc@hpl.hp.com>
- CC: public-powderwg@w3.org
As promised on Friday afternoon, here are some musings on a possible new structure for Operational POWDER (POWDER-O), taking into account the recent discussion.

Example 1
=========

RDF/XML: http://www.fosi.org/projects/powder/dr-o1.rdf
Graph: http://www.fosi.org/projects/powder/servlet_63142.png

Let's begin with a simple case: all resources dereferenced from a URI with a host component ending with example.org are red and square. (I got these line numbers from the RDF validator but I've knocked off the namespace declarations.)

11: <rdf:Description rdf:about="">
12:   <foaf:maker rdf:resource="http://authority.example.org/foaf.rdf#me" />
13:   <dcterms:issued>2007-12-14</dcterms:issued>
14:   <wdr:validFrom>2008-01-01</wdr:validFrom>
15:   <wdr:validUntil>2008-12-31</wdr:validUntil>
16: </rdf:Description>
17:
18: <wdr:DR rdf:ID="DR_1">
19:   <wdr:hasScope rdf:parseType="Resource">
20:     <wdr:includeHosts>example.org</wdr:includeHosts>
21:     <wdr:hasDescriptors rdf:parseType="Resource">
22:       <ex:colour>red</ex:colour>
23:       <ex:shape>square</ex:shape>
24:     </wdr:hasDescriptors>
25:   </wdr:hasScope>
26: </wdr:DR>

The document was made by "...#me" on 14th December 2007 and is valid for 2008. If you trust "...#me" and today's date is in 2008, then the DR in the document can be transformed into its semantic encoding (DR-S) and the RDF merged into your triple store for processing.

The single DR in the document has a URI set defined solely in terms of its host (example.org) and two descriptors from the ex namespace.

Now let's make it a little more complicated and have one DR but two URI sets.
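As an aside before Example 2, the trust-and-date gate described for Example 1 can be sketched in a few lines of Python. This is only an illustration: the function name, the local list of trusted makers and the choice to treat validUntil as inclusive are all assumptions, not anything POWDER defines.

```python
from datetime import date

# Assumption: the consumer keeps its own set of makers it is willing to trust.
TRUSTED_MAKERS = {"http://authority.example.org/foaf.rdf#me"}


def document_is_usable(maker, valid_from, valid_until, today,
                       trusted=TRUSTED_MAKERS):
    """Return True if the document may be transformed to DR-S and merged.

    Both conditions from the text must hold: the maker is trusted and
    today's date falls in the validity range (treated as inclusive here,
    which is an assumption).
    """
    return maker in trusted and valid_from <= today <= valid_until


# Values taken from lines 12, 14 and 15 of Example 1.
usable = document_is_usable(
    "http://authority.example.org/foaf.rdf#me",
    date(2008, 1, 1),
    date(2008, 12, 31),
    today=date(2008, 1, 21),
)
```

Only when this gate passes would the DR be transformed and its triples merged; the check deliberately sits outside the RDF itself.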
Example 2
=========

RDF/XML: http://www.fosi.org/projects/powder/dr-o2.rdf
Graph: http://www.fosi.org/projects/powder/servlet_63180.png

11: <rdf:Description rdf:about="">
12:   <foaf:maker rdf:resource="http://authority.example.org/foaf.rdf#me" />
13:   <dcterms:issued>2007-12-14</dcterms:issued>
14:   <wdr:validFrom>2008-01-01</wdr:validFrom>
15:   <wdr:validUntil>2008-12-31</wdr:validUntil>
16: </rdf:Description>
17:
18: <wdr:DR rdf:ID="DR_1">
19:
20:   <wdr:hasScope rdf:parseType="Collection">
21:
22:     <wdr:URIset rdf:ID="URIset_1">
23:       <wdr:includeHosts>example.org</wdr:includeHosts>
24:       <wdr:includePathStartsWith>/foo</wdr:includePathStartsWith>
25:       <wdr:hasDescriptors rdf:parseType="Resource">
26:         <ex:colour>red</ex:colour>
27:       </wdr:hasDescriptors>
28:     </wdr:URIset>
29:
30:     <wdr:URIset rdf:about="#URIset_2" />
31:
32:   </wdr:hasScope>
33:
34: </wdr:DR>
35:
36: <wdr:URIset rdf:ID="URIset_2">
37:   <wdr:includeHosts>example.org</wdr:includeHosts>
38:   <wdr:hasDescriptors rdf:parseType="Resource">
39:     <ex:colour>blue</ex:colour>
40:   </wdr:hasDescriptors>
41: </wdr:URIset>

The attribution and validity information (lines 11 - 16) remains unchanged. Now, though, our DR contains two URIsets and two attendant descriptions. All resources identified by URIs that have host components ending with example.org are blue, except those where the path starts with /foo, which are red. If the trust and validity information is to your satisfaction, then you can transform this operational data into two DR-S instances and merge the RDF.

N.B.
The transformation must contain data from both URI sets thus:

<wdr:URISet rdf:ID="URISet_1">
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Restriction>
      <owl:onProperty rdf:resource="&wdr;includeHosts" />
      <owl:hasValue>example.org</owl:hasValue>
    </owl:Restriction>
    <owl:Restriction>
      *<owl:onProperty rdf:resource="&wdr;includePathStartsWith" />*
      <owl:hasValue>/foo</owl:hasValue>
    </owl:Restriction>
  </owl:intersectionOf>
</wdr:URISet>

And

<wdr:URISet rdf:ID="URISet_2">
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Restriction>
      <owl:onProperty rdf:resource="&wdr;includeHosts" />
      <owl:hasValue>example.org</owl:hasValue>
    </owl:Restriction>
    <owl:Restriction>
      *<owl:onProperty rdf:resource="&wdr;excludePathStartsWith" />*
      <owl:hasValue>/foo</owl:hasValue>
    </owl:Restriction>
  </owl:intersectionOf>
</wdr:URISet>

The excludePathStartsWith property in URISet 2 is generated by:

1. Noting the defining features of URISet 1 (example.org and /foo)
2. Noting the defining features of URISet 2 (example.org)
3. Taking the inverse of those present in 1 but absent in 2 (example.org and NOT /foo)

This is going to get complicated when there are n URISets in the sequence but hey, let's be optimistic...

We can add further DRs into the document iff the validity information is the same. I won't copy it out here but such an example is available:

Example 3
=========

RDF/XML: http://www.fosi.org/projects/powder/dr-o3.rdf
Graph: http://www.fosi.org/projects/powder/servlet_63202.png

This says that everything on example.org is blue except things with a URI path starting with /foo, which are red. Separately, everything on example.org is circular except things with a URI path starting with /bar, which are square. All assertions are subject to the same validity/trust conditions. You process the Collections within each of the DRs to get these results or, if you want to merge the RDF into your triple store, you perform the transformation to get the full OWL-based version.
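The three-step derivation of the exclude property, and the operational reading of an ordered Collection, can be sketched in Python as follows. This is a minimal illustration for the two-set case only. The dictionary layout, the function names and the dot-aware reading of "host ending with example.org" are assumptions made for the sketch, and it only handles the two properties used in Example 2.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory form of the two URI sets from Example 2,
# kept in the order they appear in the Collection.
URI_SETS = [
    {"id": "URIset_1", "includeHosts": "example.org",
     "includePathStartsWith": "/foo", "descriptors": {"colour": "red"}},
    {"id": "URIset_2", "includeHosts": "example.org",
     "descriptors": {"colour": "blue"}},
]

CONSTRAINTS = ("includeHosts", "includePathStartsWith")


def derive_exclusions(earlier, later):
    """Steps 1-3: take the constraints present in `earlier` but absent
    from `later`, and turn each include* into the matching exclude*."""
    return {
        "exclude" + name[len("include"):]: earlier[name]
        for name in CONSTRAINTS
        if name in earlier and name not in later
    }


def matches(uri_set, uri):
    """Check a URI against the include* constraints of one URI set."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    wanted = uri_set.get("includeHosts")
    # Assumption: "ending with" means the host itself or a subdomain of it.
    if wanted is not None and not (host == wanted or host.endswith("." + wanted)):
        return False
    prefix = uri_set.get("includePathStartsWith")
    if prefix is not None and not parts.path.startswith(prefix):
        return False
    return True


def describe(uri, uri_sets=URI_SETS):
    """Operational reading: the first matching URI set in the list wins."""
    for uri_set in uri_sets:
        if matches(uri_set, uri):
            return uri_set["descriptors"]
    return None


exclusions = derive_exclusions(URI_SETS[0], URI_SETS[1])
```

With n URI sets, each later set would have to exclude everything matched by all of the earlier ones, and the negation of a conjunction is a disjunction, which is where the complication comes in. The first-match loop in describe sidesteps that on the operational side.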
Two people providing different descriptions with different validity dates will need to publish separate RDF/XML instances.

N.B. I have written these examples with the exceptional case written within the DR and the 'default' description (there's no other word for it) as a separate block. There is no difference in structure but it does reflect what I expect to be the workflow reality - we're describing everything in a given URI set as being like _this_ *except* things that have _these_ things in their URIs which are like _this_ instead. Less abstract: there is no sex, drugs or rock and roll on fosi.org _except_ where the URI path component begins with /associates where there might be. Operationally, you write the general case first and then worry about the exceptions in a separate thought process.

All of the RDF/XML instances given so far have a single URI so that a Web site can include an identical link element pointing to such a file irrespective of whether the resource is red, blue, square or circular - the POWDER client will sort it out.

But, where the Web site has no (discernible or usable) URI structure, this isn't good enough. We need to include a new conditional statement, linkFrom, that says that if you trust "...#me" and today's date is within range *and* the resource includes a link to a specific DR, _then_ it is valid. The next example shows this.
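Before the RDF/XML, the extra condition can be written out as a small Python sketch. Everything here is illustrative: the function name, the idea of passing in the set of URIs a resource links to, and the DR URIs (built from the dr-o4.rdf document URI plus the DR's rdf:ID) are assumptions about how a client might model it.

```python
def dr_applies(document_ok, link_from, dr_uri, resource_links):
    """Decide whether one DR may be applied to one resource.

    document_ok    -- result of the trust and validity-date checks
    link_from      -- value of the document's linkFrom flag
    dr_uri         -- URI of the DR being considered
    resource_links -- set of URIs the resource itself links to (assumption)
    """
    if not document_ok:
        return False
    # Only when linkFrom is set must the resource point at this DR.
    return (not link_from) or (dr_uri in resource_links)


# Hypothetical DR URIs and a resource that links only to DR_2.
BASE = "http://www.fosi.org/projects/powder/dr-o4.rdf"
links = {BASE + "#DR_2"}

applies_dr1 = dr_applies(True, True, BASE + "#DR_1", links)
applies_dr2 = dr_applies(True, True, BASE + "#DR_2", links)
```

The link is an additional gate on top of trust and validity rather than a replacement for them, so a failed document check rules the DR out regardless of what the resource links to.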
Example 4
=========

RDF/XML: http://www.fosi.org/projects/powder/dr-o4.rdf
Graph: http://www.fosi.org/projects/powder/servlet_63230.png

11: <rdf:Description rdf:about="">
12:   <foaf:maker rdf:resource="http://authority.example.org/foaf.rdf#me" />
13:   <dcterms:issued>2007-12-14</dcterms:issued>
14:   <wdr:validFrom>2008-01-01</wdr:validFrom>
15:   <wdr:validUntil>2008-12-31</wdr:validUntil>
16:   <wdr:linkFrom>true</wdr:linkFrom>
17: </rdf:Description>
18:
19: <wdr:DR rdf:ID="DR_1">
20:   <wdr:hasScope rdf:parseType="Resource">
21:     <wdr:includeHosts>example.org</wdr:includeHosts>
22:     <wdr:hasDescriptors rdf:parseType="Resource">
23:       <ex:texture>smooth</ex:texture>
24:     </wdr:hasDescriptors>
25:   </wdr:hasScope>
26: </wdr:DR>
27:
28:
29: <wdr:DR rdf:ID="DR_2">
30:   <wdr:hasScope rdf:parseType="Resource">
31:     <wdr:includeHosts>example.org</wdr:includeHosts>
32:     <wdr:includePathStartsWith>/sawn</wdr:includePathStartsWith>
33:     <wdr:hasDescriptors rdf:parseType="Resource">
34:       <ex:texture>rough</ex:texture>
35:     </wdr:hasDescriptors>
36:   </wdr:hasScope>
37: </wdr:DR>

Notice line 16 which introduces the linkFrom element. Then the URI sets for each DR are identical - everything on example.org - you need to refer to the linkFrom element to decide which is applicable (actually, you'd probably define the URI set once in a separate block and refer to it from both DRs). I know this is a pain - deliberately publishing two sets of triples that say different things about the same subjects - but, as they say in Vladivostok, c'est la guerre.

What about storing lots of DRs in a single RDF/XML instance? Well, it's clear that we can't. If you're a content provider and you need to have several DRs covering different domains of interest then you simply create multiple RDF/XML files and link as you need to.
If you're a labelling authority you store your DRs in a database with a front end like, oh I dunno, http://repository.icra.org/label?id=1 :-) which calls, I mean, would call a script that would return a single RDF/XML instance.

This might mean we re-visit the issue of whether we want to put a hint in the link element as to what vocabularies are used in a given DR.

If this is along the right lines then it seems to me we need to revisit the question of whether a DR-O is written in RDF/XML, just XML or something in between. All the examples above are valid RDF/XML but we have not got rid of the problem of generic RDF tools sucking in the triples and trying to make sense of them out of context. Personally I'm tending towards DR-Os being written in XML with only DR-Ss in RDF. It seems that GRDDL cannot transform RDF/XML and, I think I'm right in saying, that XSLT will have difficulty too.

I'll do some more playing around with that now...

Phil.

Phil Archer wrote:
>
> Good, this feels as if we're making progress (or rather, you're making
> progress in a promising direction :-)).
>
> I'll do some more playing around on Monday morning and see if I come up
> against anything we're missing.
>
> Have a good weekend and thank you.
>
> Phil.
>
> Jeremy Carroll wrote:
>>
>> Phil Archer wrote:
>>>
>>>
>>> Jeremy Carroll wrote:
>>> [snip]
>>>>
>>>> If we choose to make the GRDDL transform make the DR-S include the
>>>> subClassOf relationship as above, then we have the issue that in a
>>>> package (or any collection of DRs) some of the DRs may be valid and
>>>> some may be invalid, and all the subClassOf relationships are in the
>>>> same file, and it is unclear how to distinguish the ones we want to
>>>> claim (the valid ones), from the ones we don't (the invalid ones).
>>>
>>> I take this point. It may be that we can do something about it
>>> though. We have so far taken the view that a DR should be
>>> self-contained and that a package is therefore a group of
>>> self-contained units. Doing this means that the validity information
>>> (and attribution) is NOT inherited by DRs in the package. However...
>>> we then had to introduce the idea of using dcterms:isPartOf to force
>>> the processing of these "discrete DRs" in a particular order [1]. In
>>> such a scenario, yes, each DR would have its own validity and
>>> attribution.
>>>
>>> But it doesn't have to be this way...
>>>
>>> It would be possible I think to work with the package carrying the
>>> validity information that was then inherited by the DRs within that
>>> package - which I think from what you say would make life easier?
>>>
>>>
>>
>>
>> Yes - I was thinking along these lines.
>>
>> I was discussing this with Stuart - a possible view is then:
>>
>> The unit of a POWDER description is a document, which may contain a
>> single wdr:DR or a single wdr:Package.
>>
>> Either way the document has information pertinent to the relevance of
>> the document:
>> e.g. validity and who vouches for it.
>>
>> Operationally the process of trust is as follows:
>>
>> for each possible document that you might be considering, you read
>> that document, then understanding what that document says about
>> itself, if you are satisfied that you want to act on that document
>> (e.g. it is valid, and is vouched for by an appropriate authority),
>> then you load it into your knowledge base (formally corresponding to
>> an RDF merge using the POWDER-S GRDDL result)
>>
>> The resulting RDF graph consists of only valid POWDER DRs, which have
>> been vouched for by appropriate authorities.
>>
>> From the formal side the motivations for doing this way are:
>> - it is known that temporal logics (i.e. dealing with time in a
>> logical way) is a hard problem
>> - it is known that dealing with trust in logic is a hard problem
>> - it is clear that POWDER deals with both time and trust, but in
>> simple ways
>> - hence it feels inappropriate to do the time and trust parts in the
>> formal logical layer, but to deal with them in a pragmatic layer prior
>> to, but informed by, the logical treatment
>>
>> It is a limitation of the current RDF technology that it is hard to
>> talk about part of an RDF graph, and its validity, or who vouches for
>> that part - hence the desire to talk about documents containing
>> RDF/XML that expresses those parts of the graph.
>> I think it is possible to design documents of the 'right' size so that:
>>
>> - validity and vouching are pertinent on a document by document level
>> (and not on a finer grain)
>> - documents are large enough that the small scale powder user need
>> only write one document for their site, or maybe two.
>>
>> - that the expectations on large publishers who may need to make
>> declarations that fit into complex workflows are intelligible and not
>> too burdensome.
>>
>> Jeremy
>>
Received on Monday, 21 January 2008 15:18:56 UTC