Blank nodes in descriptor sets - a proposal to deal with this using Occam's Razor from Phil Archer on 2008-07-16 (public-powderwg@w3.org from July 2008)

From: Phil Archer <parcher@icra.org>
Date: Wed, 16 Jul 2008 13:06:32 +0100
To: Public POWDER <public-powderwg@w3.org>
CC: Ivan Herman <ivan@w3.org>, Dan Brickley <danbri@danbri.org>
Message-ID: <487DE448.1080402@icra.org>
Sorry this is long but it's kind of an argument made in public leading 
to a proposed resolution.

Ivan - I'm copying you in as this has been triggered by your comment and 
expertise - and it has a question for you which, I'm sorry, is quite a 
long way down.

Dan - I think this addresses the issue you've had with this from the 
start.

Picking up from Ivan's comments [1], clearly we need to do a bit of work 
on this 'no blank nodes' business. My own lack of understanding isn't 
much of a help here but given that we've been told explicitly that blank 
nodes are things we really can't have in a POWDER descriptor set, let's 
see what we have to do.

This is wrong:

<descriptorset>
  <ex:material>
    <ex:Wood>
      <ex:finish rdf:resource="http://example.org/vocab#shiny"/>
      <ex:madeof>cedar</ex:madeof>
    </ex:Wood>
  </ex:material>
</descriptorset>

because the ex:Wood node, although it is of a defined type (ex:Wood), is
still a blank node. If I understand Dan and JJC correctly, I believe
writing this means that the universe is arranged in such a way that there
is at least one resource that has the property ex:material with a value
of type ex:Wood. Since, in a POWDER environment, this may not be true,
the logic doesn't hold.

In consequence, Stasinos has suggested to the group that we need to have 
something like this

<descriptorset>
  <ex:material>
    <ex:Wood ref="http://example.org/vocab#polishedcedar">
      <ex:finish rdf:resource="http://example.org/vocab#shiny"/>
      <ex:madeof>cedar</ex:madeof>
    </ex:Wood>
  </ex:material>
</descriptorset>

My problem is the 'ref' attribute. A descriptor set is meant to carry
RDF/XML directly, so adding in non-RDF attributes strikes me as something
to avoid if we can. rdf:ID or rdf:about is what we really need, isn't it?

If the intention is to create or use a node within the 'ex' namespace 
then it needs to be rdf:about thus:

<descriptorset>
  <ex:material>
    <ex:Wood rdf:about="http://example.org/vocab#polishedcedar">
      <ex:finish rdf:resource="http://example.org/vocab#shiny"/>
      <ex:madeof>cedar</ex:madeof>
    </ex:Wood>
  </ex:material>
</descriptorset>

For our candidate resource <u>, writing this out long hand gives us 
these triples:

<u> ex:material <http://example.org/vocab#polishedcedar>

<http://example.org/vocab#polishedcedar> rdf:type
    <http://example.org/vocab#Wood>

<http://example.org/vocab#polishedcedar> ex:finish
    <http://example.org/vocab#shiny>

<http://example.org/vocab#polishedcedar> ex:madeof "cedar"
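
The expansion above can be reproduced mechanically. What follows is only a rough Python sketch using the standard library's ElementTree, not a conformant RDF/XML parser and not part of any POWDER tooling. The function names uri and triples are made up for illustration, the namespace declarations are added so that the fragment parses, and the candidate resource is simply passed in as the string "<u>".

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/vocab#"

# The rdf:about example, with namespace declarations added (an assumption).
DOC = f"""
<descriptorset xmlns:rdf="{RDF}" xmlns:ex="{EX}">
  <ex:material>
    <ex:Wood rdf:about="http://example.org/vocab#polishedcedar">
      <ex:finish rdf:resource="http://example.org/vocab#shiny"/>
      <ex:madeof>cedar</ex:madeof>
    </ex:Wood>
  </ex:material>
</descriptorset>
"""

_blank_ids = itertools.count(1)  # labels for nodes that carry no rdf:about


def uri(tag):
    # ElementTree spells qualified names as "{namespace}local".
    namespace, local = tag[1:].split("}")
    return f"<{namespace}{local}>"


def triples(subject, parent):
    """Yield (subject, predicate, object) for each property element in parent."""
    for prop in parent:
        resource = prop.get(f"{{{RDF}}}resource")
        if resource is not None:
            # Property pointing at a named resource.
            yield (subject, uri(prop.tag), f"<{resource}>")
        elif len(prop) == 0:
            # Property with a plain literal value.
            yield (subject, uri(prop.tag), f'"{prop.text}"')
        else:
            # Property wrapping a typed node element, e.g. ex:Wood.
            node = prop[0]
            about = node.get(f"{{{RDF}}}about")
            obj = f"<{about}>" if about else f"_:b{next(_blank_ids)}"
            yield (subject, uri(prop.tag), obj)
            yield (obj, f"<{RDF}type>", uri(node.tag))
            yield from triples(obj, node)


for triple in triples("<u>", ET.fromstring(DOC)):
    print(*triple)
```

Dropping the rdf:about attribute from the input makes the object come out as a "_:b..." label instead of a URI, which is the blank node case we've been told to avoid.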

Whilst this is OK, it's hardly efficient. It would be much more 
efficient to define polished cedar in the ex vocabulary (or some other 
vocabulary, it doesn't matter), and use 
rdf:resource="http://example.org/vocab#polishedcedar". So people should 
probably be warned off defining a description within a POWDER doc - but 
it is valid RDF and it does match the 'no blank nodes' rule.

Using rdf:ID gives a rather different (and almost certainly wrong)
result since, if I've got this right (and I'm far from sure), polished
cedar is now tied to <u> thus:

<u> ex:material <u#polishedcedar>

And u#polishedcedar becomes the subject of the other triples (and just
to add spice, u might include a fragment of its own, and then we're
really messed up). I guess this _could_ be the intention but my hunch is
that it will normally be an error... and this probably means we should
see if we can ban the use of rdf:ID within a descriptor set.
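
To see what rdf:ID would do to a URI, here is a small Python sketch. In RDF/XML, rdf:ID="x" behaves like the relative reference "#x" resolved against the in-scope base URI. Which base applies inside a descriptor set is exactly the uncertainty above, so the sketch just takes the base as a parameter and, following the reading in this mail, plugs in u. The helper name rdf_id_to_uri and the example.com URIs are invented for illustration.

```python
from urllib.parse import urljoin


def rdf_id_to_uri(base, rdf_id):
    # Resolve "#" + rdf_id against the base, as for any relative reference.
    return urljoin(base, "#" + rdf_id)


# Assumption: the base is the candidate resource u.
plain = rdf_id_to_uri("http://example.com/page", "polishedcedar")

# If u already carries a fragment, resolution replaces that fragment.
with_fragment = rdf_id_to_uri("http://example.com/page#top", "polishedcedar")

print(plain)          # http://example.com/page#polishedcedar
print(with_fragment)  # http://example.com/page#polishedcedar
```

Both calls give the same URI, so two candidate resources that differ only in their fragment would end up sharing one polished cedar node.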

Hmmm...

It seems to me that doing anything other than one of these two:

A) <ex:property1>literal value</ex:property1>
B) <ex:property2 rdf:resource="...#thing"/>

is likely to be one of:

1. Bad semantics
2. Bad practice
3. Plain wrong

And even there you need to be careful that the URI in (B) is an instance 
and not a class!

Which I guess is why Stasinos suggested adding the ref attribute ;-)

So right now I'm tending towards suggesting that we only allow literal 
values or rdf:resource and no child elements of child elements of 
descriptorset.

Let's cycle back to a discussion we had at the f2f on Monday.

If we allow people to write their foaf:Agent or dcterms:Agent class 
directly into a POWDER doc that means that to process POWDER you MUST 
understand RDF. We decided that this was OK because you MUST understand 
RDF to process a descriptorset.

If we were to restrict a descriptorset to only allowing literals or 
rdf:resources then you could probably go a long way in processing POWDER 
without understanding RDF at all if you didn't want to. Let's take 
Ivan's copyright example. Cutting out some of the detail to fit in an 
e-mail format he has this:

<descriptorset>
   <displaytext>Logos must...</displaytext>
   <rdf:type rdf:resource="http://creativecommons.org/ns#Work"/>
   <rdfs:seeAlso rdf:resource="http://www.w3.org..."/>
   <xhtml:license rdf:resource='http://www.w3.org...'/>
   <cc:morePermissions rdf:resource='...'/>
   <cc:attributionURL rdf:resource='http://www.w3.org/2001/sw/'/>
   <cc:attributionName>World Wide Web Consortium</cc:attributionName>
</descriptorset>

And the most complex thing there is rdf:type which we've discussed and 
decided what to do about. Do you need to understand RDF to process this? 
Well, yes _if_ you want to understand the semantics of 
http://creativecommons.org/ns#Work - but in a given application it may 
be enough to know it's there without doing any parsing?

So right now I'm thinking we should not only restrict descriptor sets to 
having child elements that have no child elements to avoid a lot of 
semantic hassle, but we could do the same for attribution and obviate 
the need to understand RDF at all to process POWDER at least at an 
operational level.
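
As a rough illustration of what "operational level" processing could look like, the Python sketch below reads a flat descriptor set with nothing but a generic XML parser. It is an assumption-laden toy: the function name read_flat is invented, the namespace declarations are added so the fragment parses, and only the properties from the example whose URIs were not cut short are included.

```python
import xml.etree.ElementTree as ET

RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# A trimmed copy of the copyright example; xmlns declarations are assumed.
DOC = """
<descriptorset xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:cc="http://creativecommons.org/ns#">
  <displaytext>Logos must...</displaytext>
  <rdf:type rdf:resource="http://creativecommons.org/ns#Work"/>
  <cc:attributionURL rdf:resource="http://www.w3.org/2001/sw/"/>
  <cc:attributionName>World Wide Web Consortium</cc:attributionName>
</descriptorset>
"""


def read_flat(descriptorset):
    """Return (property, value) pairs without interpreting any RDF semantics."""
    pairs = []
    for child in descriptorset:
        value = child.get(RDF_RESOURCE)
        if value is None:
            # No rdf:resource attribute, so treat the text as a literal.
            value = (child.text or "").strip()
        pairs.append((child.tag, value))
    return pairs


for name, value in read_flat(ET.fromstring(DOC)):
    print(name, "=", value)
```

The application sees that rdf:type points at http://creativecommons.org/ns#Work without having to know what that class means.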

What would we lose?

Well, we'd lose the ability to put anything other than simple RDF in the 
descriptor set and that must surely mean a reduction in flexibility. It 
also makes some things a little more awkward. We were happily expecting 
all xx:Agent classes to be defined externally until Ivan said that the 
W3C hasn't got a FOAF file so can he put it in the POWDER file itself. 
So we made it possible... but, well, if we said no, sorry Ivan, it's 
about time W3C had an RDF description of itself somewhere, would that be 
a show stopper?

Let me summarise with a proposed resolution.

PROPOSED RESOLUTION: That the descriptor set and attribution elements in 
a POWDER document may contain RDF properties that have either a literal 
value or that use the rdf:resource attribute to point to an instance of 
owl:Thing only. Arbitrary RDF may not be included (in XML terms the
child elements of descriptorset and attribution must not have any child 
elements).
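
The XML side of this rule is easy to check mechanically. The Python sketch below is one possible reading of it, not a normative test: the function name violations is invented, the namespace declarations are assumed, and it cannot tell whether an rdf:resource URI names an instance rather than a class, since that needs knowledge of the vocabulary.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_RESOURCE = f"{{{RDF_NS}}}resource"


def violations(container):
    """List complaints about the children of a descriptorset or attribution."""
    problems = []
    for child in container:
        if len(child) > 0:
            # Nested elements mean arbitrary RDF has been included.
            problems.append(f"{child.tag} has child elements")
        elif child.get(RDF_RESOURCE) is not None and (child.text or "").strip():
            # A property should carry a literal or rdf:resource, not both.
            problems.append(f"{child.tag} has both rdf:resource and a literal")
    return problems


FLAT = f"""<descriptorset xmlns:rdf="{RDF_NS}" xmlns:ex="http://example.org/vocab#">
  <ex:finish rdf:resource="http://example.org/vocab#shiny"/>
  <ex:madeof>cedar</ex:madeof>
</descriptorset>"""

NESTED = f"""<descriptorset xmlns:rdf="{RDF_NS}" xmlns:ex="http://example.org/vocab#">
  <ex:material>
    <ex:Wood rdf:about="http://example.org/vocab#polishedcedar"/>
  </ex:material>
</descriptorset>"""

print(violations(ET.fromstring(FLAT)))    # no complaints expected
print(violations(ET.fromstring(NESTED)))  # one complaint, about ex:material
```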

Arguments for:
  - The semantics are much safer
  - It means that at an operational level, it
    is possible to process POWDER without an RDF tool kit

Arguments against:
  - It reduces flexibility
  - It forces people to have or create a file that
    describes themselves (using FOAF or DC terms)

WDYT?

Phil.


[1] http://lists.w3.org/Archives/Public/public-powderwg/2008Jul/0058.html




-- 
Phil Archer
Chief Technical Officer,
Family Online Safety Institute
w. http://www.fosi.org/people/philarcher/
Received on Wednesday, 16 July 2008 12:07:13 UTC