advantages of statement-level vs graph-level reification from _@whats-your.name on 2008-03-04 (semantic-web@w3.org from March 2008)

From: <_@whats-your.name>
Date: Tue, 4 Mar 2008 14:05:08 -0500
To: semantic-web@w3.org
Message-ID: <20080304190508.GA28261@m>
On Tue Mar 04, 2008 at 07:33:43AM -0500, cdr wrote:


* optimization - tradeoffs have to be made as to what can be O(1) vs what is O(log n) and so on. the general solution to efficient triple-graph contexts is to eliminate the instantiation overhead of triple-graph engines by having one quad-database engine. but triple-databases are simpler and lighter than quad-databases (triples are what we've begun to say we can reason about, and we've agreed they're the simplest flexible building block), and if mass adoption is desired, things have to be as simple as possible, not require beasts like Virtuoso or Sesame to even dip your toes in the water. stemming the explosion of triple-graph contexts isn't realistic either, since even timestamping when a statement was made begets another graph.

* indirection - related to optimization. with quads you have to go through another graph to talk about a statement (and if you want to be sure you're talking about just that statement, you have to give in and figure out how to give it a URI anyway, but the graph overhead isn't going away).

* clean naming, immutable values - a key tenet of the lambda calculus and pure functional programming. 3 is always 3. 3+4 is always 7. someFunc(3,OneOfThese(1,2,3)) always returns the same value.
relating this to URIs: a literal is immutable. the literal "asd" always SHA1's to f10e2821bbbea527ea02200352313bc059445190. you can even refer to it as <data:hash:sha160:f10..> (i.e., a URI) and know that the value is never going to change, so long as the function (to convert between literal and URI) doesn't. likewise, a triple can be referred to as <[uri1][uri2][uri3]> (another URI). all immutable, and undeniably what they are. you know exactly what you're referring to, even if you have to deconstruct a few things to be sure (voila, FP languages have pattern matching).

how do you come up with the name for your reification graph? do you just generate something random? maybe pick something like <urn:john/statements_From_2007-08-81@http//www.johnssite.com>? since graphs are so much more complex than statements, it's unlikely we'll ever agree on a way to hash them down to something exchangeable, which opens the floodgates to endless ad-hoc inventions, varying from app to app, site to site..
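
to make the naming idea in this point concrete, here is a minimal sketch in Python. it assumes literals are hashed as UTF-8 bytes, follows the <data:hash:sha160:..> and <[uri1][uri2][uri3]> forms used above, and assumes the component URIs never contain the sequence "][". the function names are made up for illustration, not part of any existing library or spec.

```python
import hashlib
from typing import Tuple


def literal_uri(value: str) -> str:
    """Name a literal by the SHA-1 of its bytes (UTF-8 encoding is an assumption)."""
    digest = hashlib.sha1(value.encode("utf-8")).hexdigest()
    return "data:hash:sha160:" + digest


def triple_uri(s: str, p: str, o: str) -> str:
    """Name a triple by bracketing its three URIs: [s][p][o]."""
    return "[{}][{}][{}]".format(s, p, o)


def parse_triple_uri(name: str) -> Tuple[str, str, str]:
    """Deconstruct a bracketed triple name back into (s, p, o).

    Assumes no component contains the sequence "][".
    """
    if not (name.startswith("[") and name.endswith("]")):
        raise ValueError("not a bracketed triple name")
    parts = name[1:-1].split("][")
    if len(parts) != 3:
        raise ValueError("expected exactly three components")
    return parts[0], parts[1], parts[2]
```

literal_uri("asd") gives the same name on every machine, and parse_triple_uri(triple_uri(s, p, o)) hands back (s, p, o), which is the "deconstruct a few things to be sure" step. nothing comparable falls out for a whole graph without first agreeing on a canonical form for it.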

* impervious to side-effects - when i'm making a statement about <[u1],[u2],[u3]>, i know what i'm making a statement about. if required to use a graph, there's a new requirement (on top of the additional overhead): that graph must never change, otherwise statements made about it may be rendered invalid (the alternative, re-checking all previous statements about the graph for continued validity on every update, is, i think we can agree, unreasonable to expect). with a statement's URI there's no risk of the rug being pulled out from under us (if sha1 is modified, it will likely get a new name, etc, since it's such a fundamental assumption). i think it's a fundamental assumption of modern web apps and the web at large that graphs are always changing, and can't be expected to be what they were when we said something about them. fighting a tide of inevitability does not look like a sound practice..
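
a toy illustration of the side-effect problem, again in Python. everything here (the urn:example names, the dictionaries, the note text) is hypothetical and stands in for a real store; it only shows that a note keyed by a graph name silently changes scope when the graph is updated, while a note keyed by the statement's own name does not.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


def triple_name(t: Triple) -> str:
    """Content-derived name for a statement, in the bracketed form [s][p][o]."""
    return "[{}][{}][{}]".format(*t)


tea: Triple = ("urn:example:john", "urn:example:likes", "urn:example:tea")
coffee: Triple = ("urn:example:john", "urn:example:likes", "urn:example:coffee")

# graph-level: the note hangs off a graph *name*; the contents behind it can change
graphs: Dict[str, Set[Triple]] = {"urn:example:g1": {tea}}
graph_notes: Dict[str, str] = {"urn:example:g1": "checked by hand"}

# statement-level: the note hangs off a name derived from the statement itself
stmt_notes: Dict[str, str] = {triple_name(tea): "checked by hand"}

# a later update adds a triple to the graph
graphs["urn:example:g1"].add(coffee)

# the graph note now covers a triple nobody checked
covered_by_graph_note = graphs["urn:example:g1"]
# the statement note still covers exactly the one statement it was made about
covered_by_stmt_note = {t for t in (tea, coffee) if triple_name(t) in stmt_notes}
```

after the update, covered_by_graph_note contains both triples while covered_by_stmt_note still contains only the tea statement. keeping the graph note honest would mean freezing the graph or re-validating every note on every write.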


so Richard, what are your arguments?
Received on Tuesday, 4 March 2008 19:05:28 UTC