Re: Reification: was Re: ANNOUNCE: W3C Workshop on Semantic Web for Life Sciences from Eric Jain on 2004-08-20 (public-semweb-lifesci@w3.org from August 2004)

From: Eric Jain <Eric.Jain@isb-sib.ch>
Date: Fri, 20 Aug 2004 18:40:54 +0200
To: Dave Reynolds <der@hplb.hpl.hp.com>
CC: public-semweb-lifesci@w3.org
Message-ID: <41262996.9000800@isb-sib.ch>

Dave Reynolds wrote:
> You could represent this by explicitly reifying the observation rather 
> than the RDF statement. For example, have an ObservationEvent class, 
> instances of which indicate the protein, the organism, the observer, the 
> citation etc directly. This could sit alongside rather than replace a 
> direct link between protein and organism so that the presence or absence 
> of such information does not change the navigation structure.
> 
> This doesn't really save anything compared to attaching properties to a 
> reified statement but it is another option.

If I am not mistaken, when a statement is reified the original statement 
must be retained anyway if the exact semantics of the data are to be 
preserved.

The big difference, from my point of view, is between working with 
something like:

   P12345 organism taxonomy:9606
   S1 rdf:type rdf:Statement
   S1 rdf:subject P12345
   S1 rdf:predicate organism
   S1 rdf:object taxonomy:9606
   S1 created '2004-08-20'

and

   P12345 organism taxonomy:9606 [S1]
   S1 created '2004-08-20'

The former just doesn't seem practical, especially if the data is not 
only being read but also modified.
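
To make the modification problem concrete, here is a minimal sketch in 
plain Python (not any real RDF API; the helper names retarget_reified 
and retarget_labelled and the "labelled" layout are made up for 
illustration). It assumes the reified object matches the asserted 
statement, and shows what changing the object of an annotated statement 
involves in each form.

```python
# Form 1: plain triples, with the statement reified as four extra triples.
reified = {
    ("P12345", "organism", "taxonomy:9606"),
    ("S1", "rdf:type", "rdf:Statement"),
    ("S1", "rdf:subject", "P12345"),
    ("S1", "rdf:predicate", "organism"),
    ("S1", "rdf:object", "taxonomy:9606"),
    ("S1", "created", "2004-08-20"),
}

# Form 2 (hypothetical layout): each statement carries its own identifier,
# and annotations refer to that identifier.
labelled = {"S1": ("P12345", "organism", "taxonomy:9606")}
annotations = {("S1", "created", "2004-08-20")}


def retarget_reified(triples, stmt_id, new_object):
    """Change the object of a reified statement.

    Both the asserted triple and the rdf:object triple have to be
    replaced, otherwise the reification no longer describes the data.
    """
    parts = {p: o for (s, p, o) in triples if s == stmt_id}
    old = (parts["rdf:subject"], parts["rdf:predicate"], parts["rdf:object"])
    updated = set(triples)
    updated.discard(old)
    updated.discard((stmt_id, "rdf:object", old[2]))
    updated.add((old[0], old[1], new_object))
    updated.add((stmt_id, "rdf:object", new_object))
    return updated


def retarget_labelled(statements, stmt_id, new_object):
    """Change the object of a labelled statement: a single record changes."""
    s, p, _ = statements[stmt_id]
    updated = dict(statements)
    updated[stmt_id] = (s, p, new_object)
    return updated
```

In the first form every edit has to find and rewrite two triples that 
must agree; in the second the annotations keep pointing at S1 and 
nothing else needs to be touched.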

As you suggest, I sometimes use custom classes rather than 
rdf:Statement, though I always make them subclasses of rdf:Statement. 
Unfortunately rdf:Statement is not an owl:Class, which introduces some 
limitations (e.g. the custom class can't be defined as complete).

   <rdf:Description rdf:about="P12345">
     <name rdf:ID="S1">Foo</name>
   </rdf:Description>

   <rdf:Description rdf:about="#S1">
     <rdf:type rdf:resource="Statement_With_Evidence"/>
     <evidence rdf:resource="..."/>
   </rdf:Description>


> In Jena we tried to get some way towards the efficiency and simplicity 
> of the latter while still supporting the standard. The stores can (and 
> in the RDB case, do) store the quad of statements compactly. The API 
> allows you to optionally hide the reification quads so that your model 
> isn't cluttered up with the extra statements yet you can still use the 
> reification API to get from a Statement to the resource representing its 
> reification.

The approach you use for relational storage seems reasonable. I wonder 
why you don't use the same approach for the in-memory model? After all, 
disk space is cheaper than memory :-)
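
What I have in mind for an in-memory model is roughly the following 
sketch in plain Python. It is not Jena's implementation or API; the 
class CompactModel and its methods are hypothetical, and for simplicity 
it assumes a reified statement is always also asserted.

```python
class CompactModel:
    """Toy in-memory store: one record per statement, optional reification id."""

    def __init__(self):
        # Maps (subject, predicate, object) to a reification id, or None.
        self._by_statement = {}

    def add(self, s, p, o, stmt_id=None):
        """Assert a statement, optionally naming the resource that reifies it."""
        self._by_statement[(s, p, o)] = stmt_id

    def reification_of(self, s, p, o):
        """Return the reification id of a statement, or None if it has none."""
        return self._by_statement.get((s, p, o))

    def triples(self, hide_reification=True):
        """Yield asserted triples; expand the reification quads only on request."""
        for (s, p, o), stmt_id in self._by_statement.items():
            yield (s, p, o)
            if stmt_id is not None and not hide_reification:
                yield (stmt_id, "rdf:type", "rdf:Statement")
                yield (stmt_id, "rdf:subject", s)
                yield (stmt_id, "rdf:predicate", p)
                yield (stmt_id, "rdf:object", o)
```

The four reification triples are never stored, only generated when 
someone asks for them, so a reified statement costs one extra reference 
instead of four extra triples.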


> This is a complex part of the implementation to maintain and appears to 
> be so little used in practice that there is a proposal to deprecate it 
> [1]. Perhaps your experience suggests we should consider at least 
> postponing the deprecation a little longer in case someone starts to 
> need this functionality.

It would be a pity if you dropped this, though it would further confirm 
point 3 in my position paper :-(

Received on Friday, 20 August 2004 16:42:00 UTC