
Re: Why do we name nodes and not edges?

From: Ivan Mikhailov <imikhailov@openlinksw.com>
Date: Tue, 31 Jul 2012 14:34:19 +0700
Message-ID: <1343720059.17455.40.camel@octo.iv.dev.null>
To: Melvin Carvalho <melvincarvalho@gmail.com>
Cc: Semantic Web <semantic-web@w3.org>
On Wed, 2012-07-25 at 17:07 +0200, Melvin Carvalho wrote:
> Why don't edges get the same treatment, i.e. encouragement to give it a
> (universal) name.  Is it even practical?

It is indeed practical in some special cases. In fact, this data model
is much older than RDF. When LISP systems kept statements as ( P S O )
lists, every list had an address and thus could be placed in the S or O
position of another statement (putting it in the P position would be
technically possible as well, but I don't know of any real example).
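
As a rough illustration of that older model (a sketch in Python rather
than LISP; the names "likes", "says", "alice", "bob" and "carol" are
made up for the example), a statement kept as a ( P S O ) list has an
identity of its own, so another statement can point at it:

```python
# Each statement is a (P S O) list. The list object itself plays the
# role of the "address" that names the edge.
likes = ["likes", "alice", "bob"]

# A second statement uses the first one in its O position.
claim = ["says", "carol", likes]

# A list with the same content is still a different statement, because
# identity (the address), not content, is what names the edge.
same_text = ["likes", "alice", "bob"]

print(claim[2] is likes)      # identity check against the original
print(same_text is likes)     # a separate object with equal content
```

Here the object identity stands in for the list address; nothing about
the content of the list has to be unique.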

It's outside the RDF mainstream due to the space and processing time
costs of the extra data field and the indexes on it. First, note two
heuristics:

N times more index trees means an N times slower run on a box that
costs N times more.

N times more statements in the same number of index trees means a log(N)
times slower run on a box that costs from log(N) to N times more.
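
The two heuristics can be written down as a tiny cost model. This is
only a sketch: the function names are hypothetical, and the base-2
logarithm and the floor of 1.0 are my assumptions, since the heuristic
itself does not fix a base and is only meant as an order of magnitude.

```python
import math

def slowdown_from_more_indexes(n: float) -> float:
    """Heuristic 1: N times more index trees -> N times slower."""
    return n

def slowdown_from_more_statements(n: float) -> float:
    """Heuristic 2: N times more statements in the same index trees
    -> about log(N) times slower.

    Assumption: base-2 logarithm, never reported as less than 1.0.
    """
    return max(1.0, math.log2(n))

def box_cost_range_for_more_statements(n: float) -> tuple[float, float]:
    """Heuristic 2, cost side: the box costs from log(N) to N times more."""
    return (slowdown_from_more_statements(n), n)
```

For example, slowdown_from_more_statements(8) is 3.0 under these
assumptions, while slowdown_from_more_indexes(8) is 8.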

Next, note that big applications need G as an additional field, as well
as ACID properties. For a database, reasonable coverage of a G,S,P,O
table with indexes requires at least 4 full or 3+2 "partial+full" index
trees, but adding a fifth field would multiply the number of trees by a
factor of 3 to 5 rather than simply adding one more index. According to
the heuristics above, that costs much more than multiplying the number
of stored statements by 6 by storing, for each statement, an extra
[] a Statement ; graph G ; subject S ; predicate P ; object O .
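
To make the "multiply by 6" alternative concrete, here is a minimal
sketch that stores one G,S,P,O statement together with the five extra
statements from the pattern above. The helper names, the blank node
labels and the choice to put the extra statements in the same graph G
are assumptions made for the example, not a description of any
particular store.

```python
from itertools import count

_blank_ids = count()  # hypothetical blank node label generator

def reify(g, s, p, o):
    """Return the five triples of
    [] a Statement ; graph G ; subject S ; predicate P ; object O .
    """
    node = f"_:b{next(_blank_ids)}"
    return [
        (node, "rdf:type", "Statement"),   # "a" is shorthand for rdf:type
        (node, "graph", g),
        (node, "subject", s),
        (node, "predicate", p),
        (node, "object", o),
    ]

def quads_for(g, s, p, o):
    """The original quad plus its five describing quads: six rows in
    an ordinary four-column table, with no fifth column needed."""
    return [(g, s, p, o)] + [(g, *triple) for triple in reify(g, s, p, o)]

rows = quads_for("ex:g1", "ex:alice", "ex:knows", "ex:bob")
print(len(rows))  # one original statement plus five extra ones
```

The blank node created by reify() is what other statements would refer
to when they want to say something about the edge.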

So there's no "scientific" or "philosophical" reason why edges are not
named "by default"; it's all about money. As a database vendor, we get
related questions from customers quite regularly, but no one has found
the fifth column practical enough to write a feature request and sign a
contract.

The workaround for small systems is to keep G unique. A "one triple per
graph" policy turns the graph IRI into a convenient edge IRI, and the
application developer can use the existing infrastructure for free.
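
A minimal sketch of that policy, using an in-memory dictionary in place
of a real quad store (the class name, method names and the "urn:edge:"
and "ex:" IRIs are all hypothetical):

```python
class OneTriplePerGraphStore:
    """Toy quad store that enforces one triple per graph, so that the
    graph IRI doubles as the IRI of the edge it contains."""

    def __init__(self):
        self._by_graph = {}

    def add(self, g, s, p, o):
        # Refuse a second triple in the same graph; G must stay unique.
        if g in self._by_graph:
            raise ValueError(f"graph {g} already holds a triple")
        self._by_graph[g] = (s, p, o)

    def edge(self, g):
        """Look up the edge named by graph IRI g."""
        return self._by_graph[g]

store = OneTriplePerGraphStore()
store.add("urn:edge:1", "ex:alice", "ex:knows", "ex:bob")
# A statement about the first edge simply uses its graph IRI as subject.
store.add("urn:edge:2", "urn:edge:1", "ex:assertedBy", "ex:carol")

subject_of_second, _, _ = store.edge("urn:edge:2")
print(store.edge(subject_of_second))
```

In a real store the uniqueness check would be an application-level
convention on top of the ordinary G,S,P,O indexes, which is why no
extra column is needed.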

Best Regards,

Ivan Mikhailov
OpenLink Software
Received on Tuesday, 31 July 2012 07:34:56 UTC
