Re: status report - formal layer from Phil Archer on 2008-01-18 (public-powderwg@w3.org from January 2008)

From: Phil Archer <parcher@icra.org>
Date: Fri, 18 Jan 2008 11:15:25 +0000
To: Jeremy Carroll <jjc@hpl.hp.com>
CC: public-powderwg@w3.org
Message-ID: <47908A4D.1080601@icra.org>
Jeremy Carroll wrote:
[snip]
> 
> If we choose to make the GRDDL transform make the DR-S include the 
> subClassOf relationship as above, then we have the issue that in a 
> package (or any collection of DRs) some of the DRs may be valid and some 
> may be invalid, and all the subClassOf relationships are in the same 
> file, and it is unclear how to distinguish the ones we want to claim 
> (the valid ones), from the ones we don't (the invalid ones).

I take this point. It may be that we can do something about it though. 
We have so far taken the view that a DR should be self-contained and 
that a package is therefore a group of self-contained units. Doing this 
means that the validity information (and attribution) is NOT inherited 
by DRs in the package. However... we then had to introduce the idea of 
using dcterms:isPartOf to force the processing of these "discrete DRs" 
in a particular order [1]. In such a scenario, yes, each DR would have 
its own validity and attribution.

But it doesn't have to be this way...

It would be possible I think to work with the package carrying the 
validity information that was then inherited by the DRs within that 
package - which I think from what you say would make life easier?

The structure of a package then becomes something like this

<wdr:DR rdf:ID="DR_1">

  <!-- attribution and validity are stated once, for the whole DR -->
  <foaf:maker rdf:resource="http://authority.example.org/foaf.rdf#me" />
  <dcterms:issued>2007-12-14</dcterms:issued>
  <wdr:validFrom>2008-01-01</wdr:validFrom>
  <wdr:validUntil>2008-12-31</wdr:validUntil>

  <!-- ordered list of URI sets; the more specific one comes first -->
  <wdr:hasScope rdf:parseType="Collection">
    <wdr:URIset rdf:about="#URIset_1" />
    <wdr:URIset rdf:about="#URIset_2" />
  </wdr:hasScope>

</wdr:DR>

<wdr:URIset rdf:ID="URIset_1">
  <wdr:includeHosts>example.org</wdr:includeHosts>
  <wdr:includePathStartsWith>/foo</wdr:includePathStartsWith>
  <wdr:hasDescriptors rdf:parseType="Resource">
    <ex:colour>red</ex:colour>
    <ex:shape>circle</ex:shape>
  </wdr:hasDescriptors>
</wdr:URIset>

<wdr:URIset rdf:ID="URIset_2">
  <wdr:includeHosts>example.org</wdr:includeHosts>
  <wdr:hasDescriptors rdf:parseType="Resource">
    <ex:colour>blue</ex:colour>
    <ex:shape>square</ex:shape>
  </wdr:hasDescriptors>
</wdr:URIset>

So what this says is that everything on example.org is blue and square 
_except_ resources where the path starts with /foo which are red and 
circular. Both of these share the same validity information which is 
logical since any changes made will affect all the data. Notice also 
that the hasDescriptors property has now moved so that its domain is a 
URIset (which is something that Chaals has been arguing for since July 
and that has caused not a little discussion!)
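
To make the intended reading concrete, here is a rough sketch in Python of how a processor might resolve descriptors under this structure. It is only an illustration under stated assumptions, not anything the Working Group has specified: the class names (URISet, DR), the describe() method, the first-match-wins rule over the ordered scope list, and the treatment of includeHosts as matching subdomains are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class URISet:
    """Hypothetical stand-in for one wdr:URIset and its descriptors."""
    include_hosts: str
    descriptors: dict
    path_starts_with: Optional[str] = None

    def matches(self, url: str) -> bool:
        parts = urlsplit(url)
        host = parts.hostname or ""
        # Assumption: includeHosts covers the host itself and its subdomains.
        host_ok = host == self.include_hosts or host.endswith("." + self.include_hosts)
        # A URI set with no path constraint matches every path on the host.
        path_ok = self.path_starts_with is None or parts.path.startswith(self.path_starts_with)
        return host_ok and path_ok


@dataclass
class DR:
    """Hypothetical stand-in for a wdr:DR carrying shared validity dates."""
    valid_from: date
    valid_until: date
    scope: list  # ordered, mirroring the rdf:parseType="Collection" list

    def describe(self, url: str, on: date) -> Optional[dict]:
        # Validity is held once on the DR and applies to every URI set in it.
        if not (self.valid_from <= on <= self.valid_until):
            return None
        # Assumption: URI sets are tried in order and the first match wins.
        for uriset in self.scope:
            if uriset.matches(url):
                return uriset.descriptors
        return None


# The DR from the example above.
dr = DR(
    valid_from=date(2008, 1, 1),
    valid_until=date(2008, 12, 31),
    scope=[
        URISet("example.org", {"colour": "red", "shape": "circle"}, "/foo"),
        URISet("example.org", {"colour": "blue", "shape": "square"}),
    ],
)
```

With this sketch, dr.describe("http://example.org/foo/bar", date(2008, 6, 1)) yields the red/circle descriptors, any other path on example.org yields blue/square, and any date outside 2008 yields nothing at all, because the validity check happens once at the DR level.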

Does this take some of the sharp edges off the square peg?

And those of us used to working with it will recognise that we're close 
to reinventing RDF-CL here - i.e. the way some of us do it now. See, for 
example [3].

Phil.


[1] http://www.w3.org/TR/2007/WD-powder-dr-20070925/#partOf
[2] http://www.w3.org/TR/2007/WD-powder-dr-20070925/#noPattern
[3] http://www.icra.org/labels.rdf

> 
>  > Let me explore this a little. I might have an RDF/XML file that contains
>  > 2 operational DRs:
>  >
>  > <wdr:DR rdf:ID="DR_1" >
>  >   ...
>  > </wdr:DR>
>  >
>  > <wdr:DR rdf:ID="DR_2" >
>  >   ...
>  > </wdr:DR>
>  >
>  > So each of these has its own URI with a fragment identifier such as
>  > http://example.org/powder.rdf#DR_1 but you're saying this isn't going to
>  > be enough and that what we really need is http://example.org/DR_1 and
>  > for this to return a single RDF/XML instance that contains just 'DR_1'?
>  >
>  > So practically I'd create my 2 DRs in the file as shown (so I can keep
>  > them all in one place) but I'd publish their identifiers in the form
>  > http://example.org/DR_1 and then do some server-side processing to
>  > extract the relevant DR from the repository and return it as a single
>  > RDF/XML instance?
>  >
>  > I can see why this is important for the semantics (so that the metadata
>  > about the DR can be published in a block where rdf:about="") but we need
>  > to find a way to avoid server-side processing. How much can XSLT/GRDDL
>  > do for us here? Would it be able to do what's necessary? Requiring
>  > server-side processing would mean that it was really only labelling
>  > organisations that would deploy POWDER and it wouldn't be something Joe
>  > Lambda could easily add to his own website.
> 
> 
> If we are going down the line of including the subClassOf triple then it 
> is certainly helpful to have each subClassOf triple in a different file 
> from the other subClassOf triples, because we can then apply pragmatic 
> methods for choosing which files to believe (the ones containing a valid 
> DR), and which files not to believe (the ones containing invalid DRs).
> 
> We can of course do things differently; and your arguments seem to 
> suggest that we should.
> 
> Here are two different ways:
> 
> A: don't include any subClassOf triples, but include that in the 
> additional semantics we give to *valid* DRs.
> I think I will migrate the Wiki stuff to this solution.
> It makes the formal semantics more problematic in that a date is 
> required ....
> 
> B: use XSLT 2.0 as our GRDDL processor, and go somewhat outside the 
> scope of GRDDL rec to have multiple results from one transform and so do 
> the named graphs approach but on the client side not the server side.
> (It might be possible to do this within the current GRDDL spec., with a 
> lot of ingenuity).
> 
> C: the named graph approach discussed above which, for some packages, 
> requires too many files, and misses one of your requirements (hand 
> editing possible)
> 
> 
> 
>  > I notice your suggestion of specifying a new grammar that isn't quite
>  > RDF/XML or XML and perhaps giving it a new MIME type. Last November we
>  > did consider an approach similar to what we're looking at now with an
>  > operational DR and a semantic one. The operational one would be written
>  > in XML, only the semantic one would be in RDF/OWL (this all came out of
>  > a conversation at TPAC with David Booth btw). When we looked at this as
>  > a group, the feeling was that we didn't want people to have to set
>  > different MIME types for two versions of a DR - because inevitably a lot
>  > of DR publishers would get it wrong - and that we'd be better sticking
>  > with one (or the other).
>  >
>  > So for now I'd say let's stick to RDF/XML for the operational DR. Yes,
>  > it means keeping in some striping that we could perhaps do without but
>  > that's a small price to pay. Since we're working in RDF/XML (and not
>  > other serialisations of RDF) we might make greater use of
>  > parseType="Resource" (the range of hasScope is defined as a resource Set
>  > (soon to be URI set we know)) so this might be enough to knock out a
>  > couple of lines we don't need?
> 
> That's fine.
> 
> Jeremy
>
Received on Friday, 18 January 2008 11:15:53 UTC