RE: Revised draft of CBD from Karsten Otto on 2004-10-11 (www-rdf-interest@w3.org from October 2004)

From: Karsten Otto <otto@math.fu-berlin.de>
Date: Mon, 11 Oct 2004 16:11:38 +0200 (CEST)
To: Patrick.Stickler@nokia.com
cc: eric@w3.org, pfps@research.bell-labs.com, www-rdf-interest@w3.org
Message-ID: <Pine.LNX.4.61.0410111506220.27816@hobbes.mi.fu-berlin.de>
On Sun, 10 Oct 2004 Patrick.Stickler@nokia.com wrote:
>
> [...big cut...]
> Having CBDs as a default, but having both CBDs and SCBDs
> defined in a standardized way so that broad support for both can
> be encouraged and either can be easily requested, would be IMO
> a very good thing.
>
Agreed, but see below.

>
>> While this departs from the original CBD question of "tell me about
>> this resource", I believe it to be a common enough case to deserve its
>> own "optimal alternative form". This Concise Bounded Usage Description
>> (CBUD) can easily be derived from the original CBD definition by
>> exchanging "subject" and "object", plus a minor modification regarding
>> reifications:
>>
>> 1. Include in the subgraph all statements where the *object* of the
>>     statement is the particular node in question;
>> 2. Recursively, for all statements identified in the subgraph thus far
>>     having a blank node *subject*, include in the subgraph all
>>     statements where the *object* of the statement is the blank node in
>>     question and which are not already included in the subgraph.
>> 3. Recursively, for all statements included in the subgraph thus far,
>>     for all reifications of each statement, include the *four RDF
>>     reification statements* and the concise bounded *usage* description
>>     of the rdf:Statement node of each reification.
>>
>> (Note that one could also construct symmetric and inverse functional
>> variants in a similar way when needed.)
>
> If I understand this correctly, a CBUD would be a subset of a SCBD,
> right?
>
Not really. A SCBD is basically a CBD that additionally lists the inbound
arcs for nodes that have any. As you point out in the CBD document, this
is to a maximum depth of 1.

In contrast, a CBUD is a sort of reverse CBD: Where the CBD follows
outbound arcs, the CBUD does the same for inbound arcs (including
reifications of these). Naturally this will result in a depth greater
than 1.
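
To make the three steps of the CBUD definition concrete, here is a minimal
Python sketch. It is only an illustration under stated assumptions, not a
reference implementation: the graph is a plain set of (subject, predicate,
object) string tuples, blank nodes are written as "_:label", and the names
cbud, reifications and is_blank as well as the example.org URIs are made up
for this example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_TYPE = RDF + "type"
RDF_STATEMENT = RDF + "Statement"
RDF_SUBJECT = RDF + "subject"
RDF_PREDICATE = RDF + "predicate"
RDF_OBJECT = RDF + "object"


def is_blank(node):
    # Assumption of this sketch: blank nodes are strings like "_:label".
    return node.startswith("_:")


def reifications(graph, triple):
    """Return the nodes carrying all four reification statements of triple."""
    s, p, o = triple
    return {
        r for (r, pred, obj) in graph
        if pred == RDF_TYPE and obj == RDF_STATEMENT
        and (r, RDF_SUBJECT, s) in graph
        and (r, RDF_PREDICATE, p) in graph
        and (r, RDF_OBJECT, o) in graph
    }


def cbud(graph, node):
    """Collect the concise bounded usage description of node."""
    graph = set(graph)
    result = set()
    expanded = set()   # nodes whose inbound arcs were already collected
    nodes = [node]
    while nodes:
        current = nodes.pop()
        if current in expanded:
            continue
        expanded.add(current)
        # Steps 1 and 2: statements whose object is the current node.
        statements = [t for t in graph if t[2] == current]
        while statements:
            t = statements.pop()
            if t in result:
                continue
            result.add(t)
            s, p, o = t
            if is_blank(s):
                nodes.append(s)  # step 2: follow blank node subjects
            for r in reifications(graph, t):
                # Step 3: the four reification statements, plus the usage
                # description of the rdf:Statement node itself.
                statements += [
                    (r, RDF_TYPE, RDF_STATEMENT),
                    (r, RDF_SUBJECT, s),
                    (r, RDF_PREDICATE, p),
                    (r, RDF_OBJECT, o),
                ]
                nodes.append(r)
    return result


# Hypothetical graph: a user reaches the target only via a blank node.
graph = {
    ("http://example.org/user", "http://example.org/p", "_:b"),
    ("_:b", "http://example.org/q", "http://example.org/target"),
    ("_:b", "http://example.org/note", "outbound arc, not part of the CBUD"),
}
usage = cbud(graph, "http://example.org/target")
```

The outbound arc of the blank node is left out on purpose: only inbound
arcs are followed, and the walk stops at the first non-blank subject.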

> I don't suppose you could provide an example, per the node
> http://example.com/aReallyGreatBook in the source
> graph provided in the latest CBD document:
>
> http://swdev.nokia.com/uriqa/CBD.html#sourcegraph
>
> ???
>

This source graph is not a good example: for it, the result of the CBUD
algorithm is indeed only a subset of the SCBD. However, to illustrate the
difference, let's add the following to the source graph:

<!-- indirect reference via an anonymous node -->
<rdf:Description rdf:about="http://example.com/anotherBookCritic">
    <ex:rates>
       <rdf:Description rdf:nodeID="A0">
         <ex:thumbs>5</ex:thumbs>
         <rdf:value rdf:resource="http://example.com/aReallyGreatBook"/>
       </rdf:Description>
    </ex:rates>
</rdf:Description>

<!-- reification of an inbound arc -->
<rdf:Description rdf:about="http://example.com/aReviewMagazine">
    <ex:covers>
       <rdf:Statement>
         <rdf:subject rdf:resource="http://example.com/anotherBookCritic"/>
         <rdf:predicate rdf:resource="http://example.com/rates"/>
         <rdf:object rdf:nodeID="A0"/>
       </rdf:Statement>
    </ex:covers>
</rdf:Description>

Now the CBUD is clearly different, as it also discovers the
relationships introduced by the addition above:

<!-- found by CBUD and SCBD -->
<rdf:Description rdf:about="http://example.com/anotherGreatBook">
   <rdfs:seeAlso rdf:resource="http://example.com/aReallyGreatBook"/>
</rdf:Description>
<rdf:Description rdf:about="http://example.com/aBookCritic">
   <ex:likes rdf:resource="http://example.com/aReallyGreatBook"/>
</rdf:Description>

<!-- found by CBUD only -->
<rdf:Description rdf:about="http://example.com/anotherBookCritic">
   <ex:rates>
     <rdf:Description rdf:nodeID="A0">
       <rdf:value rdf:resource="http://example.com/aReallyGreatBook"/>
     </rdf:Description>
   </ex:rates>
</rdf:Description>
<rdf:Description rdf:about="http://example.com/aReviewMagazine">
   <ex:covers>
     <rdf:Statement>
       <rdf:subject rdf:resource="http://example.com/anotherBookCritic"/>
       <rdf:predicate rdf:resource="http://example.com/rates"/>
       <rdf:object rdf:nodeID="A0"/>
     </rdf:Statement>
   </ex:covers>
</rdf:Description>

Note that in contrast to the second part (the statements found by CBUD
only), the SCBD would only include this rather unhelpful fact:

<rdf:Description rdf:nodeID="A0">
   <rdf:value rdf:resource="http://example.com/aReallyGreatBook"/>
</rdf:Description>

While technically correct, this "dangling arc" is only part of the
real usage relationship.
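
The "dangling arc" can also be shown in a few lines of Python. This is a
rough sketch, not part of any CBD tooling: it models only the depth-1
inbound arcs of the node, assumes that the ex: prefix expands to
http://example.com/, writes blank nodes as "_:label" strings, and uses the
made-up label "_:S" for the anonymous rdf:Statement node. The function
names inbound and dangling are likewise just for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.com/"  # assumption: expansion of the ex: prefix
BOOK = EX + "aReallyGreatBook"

# The statements added to the source graph, as (subject, predicate, object).
added = {
    (EX + "anotherBookCritic", EX + "rates", "_:A0"),
    ("_:A0", EX + "thumbs", "5"),
    ("_:A0", RDF + "value", BOOK),
    (EX + "aReviewMagazine", EX + "covers", "_:S"),
    ("_:S", RDF + "type", RDF + "Statement"),
    ("_:S", RDF + "subject", EX + "anotherBookCritic"),
    ("_:S", RDF + "predicate", EX + "rates"),
    ("_:S", RDF + "object", "_:A0"),
}


def inbound(graph, node):
    """Depth-1 view: every statement whose object is node."""
    return {t for t in graph if t[2] == node}


def dangling(graph, node):
    """Inbound arcs whose subject is a blank node, hiding the real user."""
    return {t for t in inbound(graph, node) if t[0].startswith("_:")}


dangling_arcs = dangling(added, BOOK)
# Following each dangling arc one more step shows what depth 1 leaves out.
missed = {t for (s, _, _) in dangling_arcs for t in inbound(added, s)}
```

Here dangling_arcs holds just the rdf:value statement of _:A0, while missed
holds the ex:rates arc from anotherBookCritic and the rdf:object arc of the
reification, which is where the CBUD continues.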

>
>> I find this definition useful for a number of reasons:
>> [...because it finds things like the extra information above...]
>
> Fair enough. Though note that this would make a CBUD incompatible
> with the URIQA interface (the same for IFCBDs) and thus, while
> offering clear utility, impose greater implementational requirements
> and communication overhead than either CBDs or SCBDs, so if a CBUD
> is a subset of a SCBD, would a SCBD suffice?
>
Yes, support for CBUD and other "alternative forms" would require
yet another HTTP header or method. But as you pointed out,
"what is this resource?" and "who uses it?" are two distinct
questions; IMHO that should be reflected by the query infrastructure.
Btw, do you have anything planned for URIQA in this direction?

> And if a CBUD is not a subset of a SCBD, per the present definition
> of an SCBD, it may be reasonable to modify the definition of an SCBD
> accordingly to address your use case while still allowing effective
> use of the minimal URIQA interface.
>
I believe it should be possible to modify the SCBD definition to
avoid "dangling arcs". However, I feel this would blur the description
of the resource that was asked for: Much of the information related
to inbound arc chains actually describes other resources.
Also, the size of the returned graph could become unwieldy if the
source graph is very tangled and/or contains a large number of anonymous
nodes (FOAF?).

Regards,
Karsten Otto
Received on Monday, 11 October 2004 14:11:50 UTC