Re: Another attempt at 'cascading DRs' from Phil Archer on 2008-02-12 (public-powderwg@w3.org from February 2008)

From: Phil Archer <parcher@icra.org>
Date: Tue, 12 Feb 2008 14:13:52 +0000
To: Public POWDER <public-powderwg@w3.org>
Message-ID: <47B1A9A0.1060407@icra.org>

The problem is that your DR_2 directly contradicts DR_1 and both are 
created by the same person, meaning "I think 
http://www.t-online.de/c/01/02/03/01020304.html is text AND I think it's 
an image." Something has to change. Yes, we can have conflicting DRs 
but these would generally be created by different people, so it comes 
down to whom you trust. If the same person creates two conflicting DRs, 
you can't use trust as a means to separate them.

In my proposal today I've effectively removed the scope entirely from 
the DR, so it becomes impossible to go from the DR to the resource, i.e. 
it eliminates the first problem you identify.

On the upside, no machine can find the two descriptions and infer that 
they both apply to the same thing and therefore end up with a logical 
paradox. The downside is that you then can't use the DR as a discovery 
mechanism, only to provide information about what you already have.

That's perhaps a bit of an over-statement. If you publish a POWDER doc 
that has an aboutHosts attribute of t-online.de and you have a 
description that says 'text' and one that says 'image' then it would be 
reasonable to infer that t-online.de includes both text and images - you 
just wouldn't know exactly where they were. You would, however, be able 
to crawl the site looking for links to "...#text" and "...#image" and 
create a catalogue/site map with relative ease.
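
To make the crawling idea concrete, here is a minimal sketch in Python. It assumes the pages have already been fetched into a dict of URL to HTML, and that each page points at the POWDER doc with a link element whose href ends in a fragment such as "#text" or "#image". The names build_catalogue and LinkFragmentParser, and the powder.xml URL used below, are made up for illustration; none of this is taken from the POWDER drafts.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urldefrag


class LinkFragmentParser(HTMLParser):
    """Collects fragment identifiers from <link href="powder_doc#frag">."""

    def __init__(self, powder_doc):
        super().__init__()
        self.powder_doc = powder_doc  # URL of the POWDER doc, without fragment
        self.fragments = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        href = dict(attrs).get("href") or ""
        base, fragment = urldefrag(href)
        # Only keep links that point into the POWDER doc we care about.
        if base == self.powder_doc and fragment:
            self.fragments.append(fragment)


def build_catalogue(pages, powder_doc):
    """Map each description id (e.g. 'text') to the pages that link to it.

    `pages` is a dict of page URL -> HTML source (already crawled).
    """
    catalogue = defaultdict(list)
    for url, html in pages.items():
        parser = LinkFragmentParser(powder_doc)
        parser.feed(html)
        for fragment in parser.fragments:
            catalogue[fragment].append(url)
    return dict(catalogue)
```

Calling build_catalogue(pages, "http://www.t-online.de/powder.xml") would return a dict keyed by "text", "image" and so on, each listing the pages that claim that description. Pages with no link into the POWDER doc simply do not appear.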

Taking your last paragraph: I'm not sure how you would define 'locally 
based' - it sounds very much like the linkFrom attribute already posited 
in the published doc [1] - i.e. you make it local by linking to it, yes? 
So the operation here is "use the cached DR that covers all of 
t-online.de" and the image would then have an HTTP header that linked to 
DR_2 that says "I can override DR_1." You still need the link to create 
the property of being local. OK, I can see that working operationally 
but it seems to have a real danger of creating data that, when taken in 
isolation, can be contradictory.

Another option might be to make more use of the issue date with the idea 
that if you have two conflicting DRs then the most recently issued one 
wins. What worries me here is the caching. If I get a DR that says 
t-online.de is all text, I might cache that. If there is no validUntil 
date, in theory it's valid until the entropic heat death of the 
universe. If there is a validUntil date, OK, I should only cache it up 
until that date. Either way, the new, more recent DR can legitimately be 
ignored by an optimised system until the original expires. Again, this 
problem disappears if there is no scope.
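
The caching worry can be shown with a small Python sketch. Everything here is hypothetical: the DR class with its issued and valid_until fields is a simplified stand-in for a real DR, and CachingClient is just one plausible model of an "optimised system", not anything defined in the POWDER drafts.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass(frozen=True)
class DR:
    description: str
    issued: datetime
    valid_until: Optional[datetime] = None  # None models "no validUntil"

    def is_valid(self, now: datetime) -> bool:
        # Without a validUntil date the DR never expires.
        return self.valid_until is None or now <= self.valid_until


def most_recent_wins(a: DR, b: DR) -> DR:
    """The proposed tie-break: of two conflicting DRs, keep the newer one."""
    return a if a.issued >= b.issued else b


class CachingClient:
    """Only re-fetches once the cached DR has expired."""

    def __init__(self, fetch: Callable[[], DR]):
        self.fetch = fetch  # returns whatever DR is currently published
        self.cached: Optional[DR] = None

    def lookup(self, now: datetime) -> DR:
        if self.cached is None or not self.cached.is_valid(now):
            self.cached = self.fetch()
        return self.cached
```

If a client caches the "all text" DR and a newer "image" DR is published afterwards, lookup keeps returning the cached one until its valid_until passes (or for ever, if there is none). The most_recent_wins rule never gets a chance to run, because the client never sees the second DR.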

P

[1] http://www.w3.org/TR/2007/WD-powder-dr-20070925/#noPattern


Scheppe, Kai-Dietrich wrote:
> Hi,
> 
> I have a question:
> 
> The problem really doesn't exist when going from the resource to the DR.
> It does exist when going from the DR to the resource.
> 
> But since we say at some point that conflicting DRs can exist... after all,
> different CAs could have different opinions about content, it is up to
> the user to decide which DR he believes.
> 
> Can't this principle apply here as well?
> 
> Or better, if it doesn't apply here, why does it apply in general?
> And if it does and if this is a problem, how do we solve it...with the
> knowledge that solving that problem would also solve this problem?
> 
> 
> Either way, I think if we just say that 
> 
> DR_1 says all content on t-online.de is text based
> DR_2 says that http://www.t-online.de/c/01/02/03/01020304.html is an
> image
> 
> then it is up to the user to decide whether to download this
> resource.
> 
> We could defuse the problem somewhat by requiring the more locally based
> DR to refer to the more globally based DR.  This way an application
> could create its own set of exceptions.
> So in the example above DR_2 would contain a link to DR_1.
> 
> 
> However, the problem centers on dealing with unknown DRs.
> 
> 
> -- Kai
> 
> 
> 
> 
> 
> 
>> -----Original Message-----
>> From: public-powderwg-request@w3.org 
>> [mailto:public-powderwg-request@w3.org] On Behalf Of Phil Archer
>> Sent: Tuesday, February 12, 2008 1:17 PM
>> To: Public POWDER
>> Subject: Another attempt at 'cascading DRs'
>>
>>
>> The basic POWDER model has a resource that describes a lot of 
>> other resources. A processor may start at the descriptive 
>> resource (the POWDER document) and discover the resources it 
>> describes. To aid discovery of the description, a resource may 
>> link to the POWDER document that describes it.
>>
>> In some important circumstances however this doesn't work. 
>> POWDER's Grouping mechanism [1] (currently under revision 
>> with a new draft due for publication v. soon) assumes that by 
>> examining a URI, one can deduce which description of a 
>> collection applies to it. If URIs don't follow a particular 
>> pattern, such as numerical URIs generated by some content 
>> management systems, we need a different mechanism: we must 
>> rely on the link from the described resource to point to the 
>> correct description.
>>
>> We've discussed this a lot, most recently in Athens, and we 
>> know we need to solve it. We also know that it's impractical 
>> in a commercial workflow to need to edit the POWDER document 
>> continually, adding in lists of exceptions to rules. We need 
>> to work more along the CSS model where there is a central 
>> file that carries the defined styles. Which style (indeed, 
>> which stylesheet) applies is declared within the document 
>> being styled. HTTP and client caching ensure that stylesheets 
>> need only be fetched once until updated.
>>
>> A Package of DRs, as currently defined at [2], has an 
>> attribute 'aboutHosts'. The structure of packages is going to 
>> be modified a little in the near future but this feature is a 
>> very useful one for processing efficiency. The plan now is to 
>> make it so that, where present, the aboutHosts guarantees 
>> that the DRs in the package do not cover any resources on 
>> domains other than those listed (it doesn't guarantee that 
>> all resources on those domains are described by the way, just 
>> that if the aboutHosts property lists example.org then you 
>> can be sure that it does not describe anything on example.com).
>>
>> OK, hold on to that and look at this:
>>
>> 1  <?xml version="1.0"?>
>> 2   <POWDER xmlns="http://www.w3.org/2007/05/powder#"
>> 3           xmlns:ex="http://example.org/vocab#"
>> 4           xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>>
>> 5    <attribution>
>> 6      <maker>http://authority.example.org/foaf.rdf#me</maker>
>> 7      <aboutHosts>example.org</aboutHosts>
>> 8    </attribution>
>>
>> 9    <Descriptors xml:id="red">
>> 10     <ex:color>red</ex:color>
>> 11   </Descriptors>
>>
>> 12   <Descriptors xml:id="blue">
>> 13     <ex:color>blue</ex:color>
>> 14   </Descriptors>
>>
>> 15 </POWDER>
>>
>> This is a POWDER doc with its required attribution. Line 7 adds 
>> in an aboutHosts element.
>>
>> But there are no DRs here, just two descriptions (and if 
>> there are no DRs there is no requirement for a URISet). A 
>> 'red page' on example.org would include a link element with 
>> an href attribute of "...powder.xml#red", blue pages would 
>> have #blue as the fragment identifier. The aboutHosts element 
>> prevents other domains pointing to this POWDER doc and 
>> claiming (quite possibly falsely) that the entity described 
>> at http://authority.example.org/foaf.rdf#me described them - 
>> that is, the POWDER doc author has a mechanism for 
>> restricting the scope of the descriptions without actually 
>> having a URISet.
>>
>> But... what's the POWDER-S version of this, i.e. the output 
>> of the GRDDL transform with formal semantics? Well, I guess 
>> it ends up just being:
>>
>> <rdf:Description rdf:about="">
>>    <foaf:maker rdf:resource="http://authority.example.org/foaf.rdf#me" />
>> </rdf:Description>
>>
>> <owl:Class rdf:nodeID="red">
>>    <owl:intersectionOf rdf:parseType="Collection">
>>      <owl:Restriction>
>>        <owl:onProperty rdf:resource="http://example.org/vocab#color" />
>>        <owl:hasValue>red</owl:hasValue>
>>      </owl:Restriction>
>>    </owl:intersectionOf>
>> </owl:Class>
>>
>> <owl:Class rdf:nodeID="blue">
>>    <owl:intersectionOf rdf:parseType="Collection">
>>      <owl:Restriction>
>>        <owl:onProperty rdf:resource="http://example.org/vocab#color" />
>>        <owl:hasValue>blue</owl:hasValue>
>>      </owl:Restriction>
>>    </owl:intersectionOf>
>> </owl:Class>
>>
>> Notice that a) the aboutHosts element is not copied from the 
>> operational semantics - it's not needed here and, I think I'm 
>> right in saying, won't be of any value in the ordered list in 
>> a POWDER-S doc either. It could be included but I'm not sure 
>> that it will add a great deal.
>> b) there is no subClassOf relation asserted - which is good because
>> c) there is no URISet to be a sub class of the descriptors.
>>
>>
>> Three questions for people with the appropriate knowledge:
>>
>> So the XSLT here must only assert the sub class relationship 
>> if there is a URISet. Doable?
>>
>> I understand that, formally, creating a blank node in an RDF 
>> graph means that the universe is so arranged that there is at 
>> least one resource that has the properties given by those of 
>> the blank node. Does creating an OWL class in this way get us 
>> off this hook?
>>
>> How does this look, Kai?
>>
>> N.B. I'm trying to avoid having to create server-side 
>> software that returns triples with the described resource's 
>> URI as the subject - that's clearly the semantically pure 
>> way, but it's impractical.
>>
>> I'm asking all this because it obviously affects the rules on 
>> what MUST and SHOULD and MAY be in a POWDER doc - something 
>> Andrea's poised to encode in the schema and Kevin is poised 
>> to enshrine in the XSLT.
>>
>> Phil.
>>
>>
>>
>> [1] http://www.w3.org/TR/2007/WD-powder-grouping-20071031/
>> [2] http://www.w3.org/TR/2007/WD-powder-dr-20070925/#package-structure
>>
>>
Received on Tuesday, 12 February 2008 14:14:14 UTC