Re: Reification - whats best practice? from Eric Jain on 2004-08-30 (www-rdf-interest@w3.org from August 2004)

From: Eric Jain <Eric.Jain@isb-sib.ch>
Date: Mon, 30 Aug 2004 12:26:19 +0200
To: Leo Sauermann <leo@gnowsis.com>
CC: www-rdf-interest@w3.org
Message-ID: <413300CB.8080402@isb-sib.ch>

Leo Sauermann wrote:
> I assume that bloating means "too many triples, they look ugly"

Too many triples, yes, but more importantly, too much data and
duplication. These are not necessarily the same thing.


> EXAMPLE 1 with reification
> ...
> == 15 not easily readable statements 

You could even argue that it's 17, as reifying a statement doesn't 
replace it. But keep in mind that it is trivial to collapse reified 
statements into a more compact "quad" representation:

sta1 s1 p1 o1
sta1 example:backedBy a1
sta1 example:backedBy a2
sta2 s1 p2 o2
sta2 example:backedBy a1
sta2 example:backedBy a3
a1 rdf:type example:source
a2 rdf:type example:source
a3 rdf:type example:source
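
To illustrate the collapsing step, here is a minimal Python sketch. It assumes the triples are plain tuples of prefixed-name strings and that each reified statement carries the standard rdf:subject, rdf:predicate and rdf:object properties; the function name collapse_reified is made up for this example and is not taken from any toolkit.

```python
# Maps the three RDF reification properties to short role names.
ROLES = {"rdf:subject": "s", "rdf:predicate": "p", "rdf:object": "o"}


def collapse_reified(triples):
    """Split triples into (statement_id, s, p, o) quads and everything else."""
    parts = {}  # statement id -> {"s": ..., "p": ..., "o": ...}
    rest = []   # triples that are not part of the reification vocabulary
    for s, p, o in triples:
        if p in ROLES:
            parts.setdefault(s, {})[ROLES[p]] = o
        elif p == "rdf:type" and o == "rdf:Statement":
            parts.setdefault(s, {})
        else:
            rest.append((s, p, o))
    # Incomplete reifications (fewer than three roles) are skipped in this sketch.
    quads = [(sid, d["s"], d["p"], d["o"])
             for sid, d in parts.items() if len(d) == 3]
    return quads, rest
```

Four reification triples per statement shrink to one quad, while the annotations (example:backedBy and the type statements) pass through unchanged.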

Actually, I treat triples as "quintuplets". Each statement is part of a 
model, or named graph:

m1 sta1 s1 p1 o1
m1 sta1 example:backedBy a1
m1 sta1 example:backedBy a2
m1 sta2 s1 p2 o2
m1 sta2 example:backedBy a1
m1 sta2 example:backedBy a3
m1 a1 rdf:type example:source
m1 a2 rdf:type example:source
m1 a3 rdf:type example:source

This is important for inserting, retrieving and deleting data, which we 
do not manage at the level of individual statements or resources. Who does?
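
As a rough sketch of what managing data per model could look like, assuming an in-memory store (the class ModelStore and its method names are hypothetical, and the statement id is stored as None for rows that have none):

```python
from collections import defaultdict


class ModelStore:
    """Toy store that groups rows by model (named graph)."""

    def __init__(self):
        # model id -> set of (statement_id or None, s, p, o)
        self._by_model = defaultdict(set)

    def add(self, model, s, p, o, statement_id=None):
        self._by_model[model].add((statement_id, s, p, o))

    def rows(self, model):
        """Return a copy of all rows belonging to one model."""
        return set(self._by_model.get(model, ()))

    def drop_model(self, model):
        """Delete a whole model at once; return how many rows were removed."""
        return len(self._by_model.pop(model, ()))
```

Inserting, retrieving and deleting then operate on a model as a unit, without touching individual statements.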


> Example 2 with quads
>  ...
> sta1 s1 p1 o1
> sta2 s1 p1 o1
> sta1 s1 p2 o2
> sta3 s1 p2 o2
> ...
> == 10 easily readable triples

Having to reassert a statement just because two or more people happen to
agree on a fact doesn't seem very intuitive, unless you happen to manage
your data according to who-said-it, which we don't.

You could argue that your quads could be represented as:

s1 p1 o1 [sta1, sta2]
s1 p2 o2 [sta1, sta3]
sta1 example:backedBy a1
sta2 example:backedBy a2
sta3 example:backedBy a3

But I suspect this is less straightforward to implement than:

s1 p1 o1 [sta1]
s1 p2 o2 [sta2]
sta1 example:backedBy a1
sta1 example:backedBy a2
sta2 example:backedBy a1
sta2 example:backedBy a3

Which of course is nothing but reification in disguise.

In some kind of shorthand syntax suitable for editing, this could even 
be represented as:

s1 p1 o1 [sta1]
s1 p2 o2 [sta2]
sta1 example:backedBy a1, a2
sta2 example:backedBy a1, a3

Now it's shorter than your example :-)
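
Expanding such a shorthand line back into plain triples is straightforward. A minimal Python sketch, assuming whitespace-separated subject and predicate followed by a comma-separated object list (the function name expand_shorthand is invented for this example, and the bracketed statement ids are not handled):

```python
def expand_shorthand(line):
    """Expand 'subject predicate obj1, obj2' into one triple per object."""
    subject, predicate, objects = line.split(None, 2)
    return [(subject, predicate, obj.strip())
            for obj in objects.split(",") if obj.strip()]
```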

Also, this approach allows you to define :backedBy as the owl:inverseOf 
:says, and look for people who agree with you:

SELECT
   ?author
WHERE
   [:Me :says ?something] AND
   [?author :says ?something]

Unfortunately this query tends to return an empty set with real-life
data :-)
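
For illustration, the same lookup over plain Python tuples, without a query engine. This is only a sketch: the function names are made up, the inverse is materialized by hand rather than inferred by an OWL reasoner, and unlike the query above it leaves the asker out of the result.

```python
def invert(triples, predicate, inverse):
    """Materialize the inverse property: (s, predicate, o) -> (o, inverse, s)."""
    return {(o, inverse, s) for s, p, o in triples if p == predicate}


def who_agrees_with(me, says):
    """Return everyone else who says at least one thing that `me` says."""
    mine = {o for s, p, o in says if s == me}
    return {s for s, p, o in says if s != me and o in mine}
```

Usage, with "me" standing in for :Me:

```python
backed = {
    ("sta1", "example:backedBy", "me"),
    ("sta1", "example:backedBy", "a2"),
    ("sta2", "example:backedBy", "a3"),
}
says = invert(backed, "example:backedBy", "example:says")
agreeing = who_agrees_with("me", says)
```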


> don't tell me that "implementations hide these many triples away from me".
> No, they do not. When creating the triples, you have to use a 
> reification API to create the triples, and when querying, you have to 
> use the reification API again. and reification APIs demand you to code 
> somethings.

Admittedly many tools don't hide the reification triples (as far as I am 
concerned, they shouldn't even use them internally). But I'd rather have 
them fix this before they start implementing extensions not found in any 
standard...


> But my practical problems could be much easier solved with quads. :-)

That may be so, but please keep in mind that other people may have
different practical problems that are much more easily solved with reification.


> Query engines do not usually do "over-more-than-one-graph" queries

I have implemented this, so others can't be far behind :-)

As pointed out previously, I do not consider a "graph" a replacement for 
reification, but a mechanism for data management, about on the same 
level as files.
Received on Monday, 30 August 2004 10:26:20 UTC