- From: Leo Sauermann <leo@gnowsis.com>
- Date: Mon, 30 Aug 2004 10:33:28 +0200
- To: Eric Jain <Eric.Jain@isb-sib.ch>
- CC: www-rdf-interest@w3.org
- Message-ID: <4132E658.30908@gnowsis.com>
Hi all,

Ok, I will bring this from textual discussion to real triple discussion. I hope this works....

> Leo Sauermann wrote:
>
>> reification syntax is "not practical" as we see above in the thread.
>
> Please - as previously pointed out [1] quads are not always a suitable
> replacement for reification.

I do not think that the quote [1] points this out. [1] just contains a theoretical assumption about "bloating" that I will falsify below.

Ok, the original quote was:

[1]: http://lists.w3.org/Archives/Public/www-rdf-interest/2004Aug/0178.html

>Consider the following example:
>
>  s1 p1 o1 : backed by a1 and a2
>  s1 p2 o2 : backed by a1 and a3
>
>If we were to use contexts for expressing this, there would have to be
>three different contexts (for statements backed a1, a2 and a3), and both
>statements would have to be duplicated into two different contexts.
>Correct? I imagine this approach would bloat the data far more than
>normal reification would...

I assume that bloating means "too many triples, they look ugly".

Ok, I want to see if this assumption is true (theoretically it sounds good, but I want to move from theory to reality. This is not much work, actually).

EXAMPLE 1 with reification

a1 rdf:type example:source // just to have a triple
a2 rdf:type example:source // just to have a triple
a3 rdf:type example:source // just to have a triple
st1 rdf:type rdf:Statement // spec needs this (or?)
st1 rdf:subject s1
st1 rdf:predicate p1
st1 rdf:object o1
st2 rdf:type rdf:Statement // spec needs this (or?)
st2 rdf:subject s1
st2 rdf:predicate p2
st2 rdf:object o2
st1 example:backedBy a1
st1 example:backedBy a2
st2 example:backedBy a1
st2 example:backedBy a3

== 15 not easily readable statements

Example 2 with quads

a1 rdf:type example:source // just to have a triple
a2 rdf:type example:source // just to have a triple
a3 rdf:type example:source // just to have a triple
sta1 s1 p1 o1
sta2 s1 p1 o1
sta1 s1 p2 o2
sta3 s1 p2 o2
sta1 example:backedBy a1
sta2 example:backedBy a2
sta3 example:backedBy a3

== 10 easily readable triples

So let us look at the assumption again:

>I imagine this approach would bloat the data far more than
>normal reification would...

I think that this assumption is not true when seen from the real life triples above. In theory, it sounds like bloating. But when you write down the triples (or quads) of the example, I think we have a different view. I think the quads are much more readable.

Don't tell me that "implementations hide these many triples away from me". No, they do not. When creating the triples, you have to use a reification API to create them, and when querying, you have to use the reification API again. And reification APIs demand that you write some code. So the amount of triples above (10 vs 15) is *relative* to the amount of lines of code you have to write when using an RDF API.

Another BIG advantage is deleting:

== Deleting Problem ===

When I say: ok, I think s1 p1 o1 is not needed anymore, because context a1 falls away, I can run a "delete (all where reified by a1)" on my big graph. But this will also delete the triple that is still backed by a2.

I actually had exactly this problem in the last week. Using Jena. That's why I started the thread, anyhow. So if you want to deny this problem, show me your code. (I pasted mine here:)
http://lists.w3.org/Archives/Public/www-rdf-interest/2004Aug/0255.html

When I have quads I can say:
"delete (* where quad has context X, X example:backedBy a1)"
This will not delete triples from other quads. Reification APIs are not that strict.

I think that there may be a mistake in my assumptions above and my conclusions, but I am quite sure that the above examples will run in RDF-Gateway. And I experienced the problems using Jena.

I see this from a real life implementor's view: when I have to debug things, reification is not readable. When I code stuff, reification creates more triples and is harder to handle. So I would like to code using quads, but I miss the tools and APIs to do it (especially, I miss the Jena implementation of quads :-)

As a detail on the side: Jena has quads somewhere. But the Triple class does not have them, so they are not in the roots of Jena.
http://jena.sourceforge.net/javadoc/com/hp/hpl/jena/graph/Triple.html

I think reification was a clear approach. It is totally triple based and gets 100% thumbs up for theoretical coolness in the RDF triple world. But my practical problems could be solved much more easily with quads. :-)

cheers
Leo

A note about the deleting problem: I usually keep all my triples from all different sources in one big graph. This is more real life: you want to search for information that can be anywhere in your knowledge, you do not want to search different graphs. You want to have your query engine run on all data. Query engines do not usually do "over-more-than-one-graph" queries, and if they do, they may return other results than on the integrated data.

And: YES it is possible to build an "aggregated graph" that may contain "thousands of graphs", but NO, the indexing in database backed graphs does not work then, so please don't suggest this to me. My local database should be One-Big-Graph, containing triples with contexts/quads :-) That eases things.
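
To make the context-based delete from the "Deleting Problem" part concrete, here is a minimal sketch in plain Python. It assumes a quad is simply a (context, subject, predicate, object) tuple kept in one in-memory set, with None as the context of a plain triple. The function names and that representation are hypothetical and only for illustration; this is neither the Jena API nor RDF-Gateway syntax, and the three rdf:type triples from Example 2 are left out for brevity.

```python
# Quads are (context, subject, predicate, object) tuples.
# context None marks a plain triple outside any context.
# All names are illustrative assumptions, not Jena or RDF-Gateway calls.

BACKED_BY = "example:backedBy"


def build_store():
    """One big graph holding the quads of Example 2 (rdf:type triples omitted)."""
    return {
        ("sta1", "s1", "p1", "o1"),
        ("sta2", "s1", "p1", "o1"),
        ("sta1", "s1", "p2", "o2"),
        ("sta3", "s1", "p2", "o2"),
        (None, "sta1", BACKED_BY, "a1"),
        (None, "sta2", BACKED_BY, "a2"),
        (None, "sta3", BACKED_BY, "a3"),
    }


def delete_backed_by(store, source):
    """delete (* where quad has context X, X example:backedBy source)."""
    # Find every context X that is backed by the given source.
    contexts = {s for (c, s, p, o) in store if p == BACKED_BY and o == source}
    # Keep quads from other contexts. Dropping the backedBy bookkeeping
    # triple as well is an addition of this sketch, not part of the query.
    return {
        q for q in store
        if q[0] not in contexts
        and not (q[2] == BACKED_BY and q[3] == source)
    }


def asserted(store):
    """Distinct triples that are still asserted in at least one context."""
    return {(s, p, o) for (c, s, p, o) in store if c is not None}


store = build_store()
after = delete_backed_by(store, "a1")
```

After removing everything backed by a1, asserted(after) still contains both s1 p1 o1 (through sta2) and s1 p2 o2 (through sta3), because only the copies in context sta1 were deleted.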
Received on Monday, 30 August 2004 08:33:36 UTC