Re: Why do we name nodes and not edges? from adasal on 2012-08-01 (semantic-web@w3.org from August 2012)

From: adasal <adam.saltiel@gmail.com>
Date: Wed, 1 Aug 2012 16:38:03 +0100
To: Stephen Williams <sdw@lig.net>
Cc: semantic-web@w3.org
Message-ID: <CANJ1O4ohY1W=o=3WM-S+fGm+UheHBWzKKUspAgcowc+z_4EegA@mail.gmail.com>

Stephen Williams wrote:


> Please let me know if you are interested in exploring the idea and helping
> to implement this in one way or another.  In particular, I need (and may
> create) a SQLite-like licensed library (Apache 2, MIT, or a commercial
> license with few restrictions, etc.) that can be used widely without
> restriction.  Which may of course just be a layering on SQLite initially,
> although that likely won't be efficient and scalable enough for my purposes.


I am curious: would the following graph database help? Or Neo4J?

Or would it at least be a good place to start? Notice the restriction on
size when using the Berkeley DB backend, though I think there is no
necessary tie to it.

Thanks to all for the discussion, very helpful.
I am going to use it to help me explain the relationship between graphs and
topology in an unrelated, non-computer domain.

Adam

+++++++++++++++++++++++++++++++++++++++++++++

(Taken from http://www.hypergraphdb.org/index)

What Is It?
HyperGraphDB is a general purpose, open-source data storage mechanism based
on a powerful knowledge management formalism known as directed hypergraphs.
While a persistent memory model designed mostly for knowledge management,
AI and semantic web projects, it can also be used as an embedded
object-oriented database for Java projects of all sizes. Or a graph
database. Or a (non-SQL) relational database.
...
...
Feature Summary

Powerful data modeling and knowledge representation.
Graph-oriented storage.
N-ary, higher order relationships (edges) between graph nodes.
Graph traversals and relational-style queries.
Customizable indexing.
Customizable storage management.
Extensible, dynamic DB schema through custom typing.
Out of the box Java OO database.
Fully transactional and multi-threaded, MVCC/STM.
P2P framework for data distribution.

(and
http://www.hypergraphdb.org/blog?entry=http://www.blogger.com/feeds/1980461574999551012/posts/default/3388327883345778567
)

HyperGraphDB 1.2 Beta now available

(news, hypergraphdb published on June 11, 2012)

Kobrix Software is pleased to announce the release of HyperGraphDB version
1.2.

HyperGraphDB is a general purpose, free open-source data storage mechanism.
Geared toward modern applications with complex and evolving domain models,
it is suitable for semantic web, artificial intelligence, social networking
or regular object-oriented business applications.

This release contains numerous bug fixes and improvements over the previous
1.1 release. A fairly complete list of changes can be found at the Changes
for HyperGraphDB, Release 1.2 wiki page.

Introduction of a new HyperNode interface together with several
implementations, including subgraphs and access to remote database peers.
The ideas behind are documented in the blog post HyperNodes Are Contexts.
Introduction of a new interface HGTypeSchema and generalized mappings
between arbitrary URIs and HyperGraphDB types.
Implementation of storage based on the BerkeleyDB Java Edition (many thanks
to Alain Picard and Sebastian Graf!). This version of BerkeleyDB doesn't
require native libraries, which makes it easier to deploy and, in addition,
performs better for smaller datasets (under 2-3 million atoms).
Implementation of parameterized pre-compiled queries for improved query
performance. This is documented in the Variables in HyperGraphDB Queries
blog post.

HyperGraphDB is a Java based product built on top of the Berkeley DB
storage library.

In addition, the project includes several practical domain specific
components for semantic web, reasoning and natural language processing. For
more information, documentation and downloads, please visit the
HyperGraphDB Home Page.

On 30 July 2012 18:47, Stephen Williams <sdw@lig.net> wrote:

>  On 7/29/12 6:09 AM, Nathan wrote:
>
> David Booth wrote:
>
> Another approach (instead of reification, which I personally hate), is
> to use named graphs.  Named graphs have to be used differently, but can
> often solve the same use case.
>
> For RDF stores that store everything as quads anyway, my guess is that
> even if you have only one named graph per triple it would likely involve
> less overhead than reification, but perhaps one or more of the
> developers of such stores can comment on that more authoritatively.
>
>
> As I understand it, Melvin is looking for a well defined function that
> would allow one to canonicalize a triple (edge) into a unique URI. Such
> that f(subject, predicate, object) = edge:123234234 .
>
> Reification allows you to name a triple, but it's not in a canonical form
> with a unique name per triple.
>
>
> At a W3C plenary at MIT several years ago, I asked TBL why triples and
> not quads.  To which he replied, they are quads: the fourth element is just
> usually implied (or something close to that).
>
> I've long thought that we need unique identification of each triple and to
> be able to uniquely group arbitrary subsets of statements in a "triple
> store" so that the subset can be referred to easily.  My solution is to
> represent "triples" as pents: triple+ID+context, where context is very
> general purpose and semi-automatically maintained.  Going further, I am
> mostly convinced that it should be a "hex" with two kinds of context:
> provenance / certainty (time stamps, source, several types of trust) and
> statement subset association.  (There is one further level needed in my
> system, but I won't go into that here yet.)  I need to implement this soon
> and have a number of ideas about how this should work to be efficient and
> scalable.
>
> Please let me know if you are interested in exploring the idea and helping
> to implement this in one way or another.  In particular, I need (and may
> create) a SQLite-like licensed library (Apache 2, MIT, or a commercial
> license with few restrictions, etc.) that can be used widely without
> restriction.  Which may of course just be a layering on SQLite initially,
> although that likely won't be efficient and scalable enough for my purposes.
>
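
As a rough illustration of the SQLite layering mentioned above, the following sketch stores each statement as a "pent" (triple + ID + context). The table name `statement`, its column names, and the helper `add_statement` are hypothetical; the thread does not specify a schema. A "hex" along the lines described would presumably split `context` into separate provenance and subset columns.

```python
import sqlite3

# In-memory database for illustration; a real store would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE statement (
        id        INTEGER PRIMARY KEY,  -- per-statement identifier
        subject   TEXT NOT NULL,
        predicate TEXT NOT NULL,
        object    TEXT NOT NULL,
        context   TEXT NOT NULL,        -- e.g. source document or subset name
        UNIQUE (subject, predicate, object, context)
    )
""")

def add_statement(conn, s, p, o, context):
    """Insert the statement if it is not already present and return its ID."""
    conn.execute(
        "INSERT OR IGNORE INTO statement (subject, predicate, object, context) "
        "VALUES (?, ?, ?, ?)",
        (s, p, o, context),
    )
    row = conn.execute(
        "SELECT id FROM statement "
        "WHERE subject = ? AND predicate = ? AND object = ? AND context = ?",
        (s, p, o, context),
    ).fetchone()
    return row[0]
```

Adding the same triple twice under one context returns the same ID, while the same triple under a different context gets a new row and a new ID.
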
> With current standards, this would be externalized as reified RDF if
> "everything" were exported, or simple triples if the metadata is elided.
> Probably a new twist on external representation would be useful.
> Additionally, based on my work related to W3C EXI and my own binary XML
> work, I have had a number of ideas related to a binary RDF/pent/hex/ntuple
> interchange format.  This is also something I'm going to need soon.
>
> Named graphs are the beginnings of how to do this, and everything could be
> done through the fourth term in a quad.  However, this is likely to be
> cumbersome and I don't see current implementations actually solving the
> problem properly yet.
>
>
>
> In logic we assign symbols to statements all the time (~A & B), but not in
> a well defined way where each unique statement has exactly one canonical
> name.
>
> An interesting question is whether two identical triples (edges) from
> different documents would share the same canonicalized form, or whether the
> provenance / named graph would need to be part of the canonicalization.
> More of a f(subject, predicate, object, graph) = <edge:graph#123wer234d23>
> where 123wer234d23 is a hash(subject, predicate, object).
>
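
A minimal sketch of the canonicalization function discussed above, covering both f(subject, predicate, object) and the graph-scoped variant. The choice of SHA-256, the length-prefixed encoding of terms, and the function name `edge_uri` are assumptions made for illustration. The `edge:` prefix follows the notation in the thread and is not a registered URI scheme. Blank nodes and literal normalization are not handled.

```python
import hashlib
from typing import Optional

def _term(value: str) -> bytes:
    # Length-prefix each term so ("ab", "c") and ("a", "bc") hash differently.
    data = value.encode("utf-8")
    return str(len(data)).encode("ascii") + b":" + data

def edge_uri(subject: str, predicate: str, obj: str,
             graph: Optional[str] = None) -> str:
    """Return a deterministic name for a triple, optionally scoped to a graph."""
    digest = hashlib.sha256(
        _term(subject) + _term(predicate) + _term(obj)
    ).hexdigest()
    if graph is None:
        # Same triple from any document maps to the same name.
        return "edge:" + digest
    # Provenance is part of the name: same triple, different graph, different URI.
    return f"edge:{graph}#{digest}"
```

Without a `graph` argument, identical triples from different documents share one name. With it, the hash part stays the same but the full name differs per graph, which corresponds to the two options raised in the question above.
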
>
> This is one good solution.  Another, applicable sometimes, is to just have
> serial numbers relative to some database.  One semantic web idiom is that
> the only unambiguous reference to a triple or set of triples is a complete
> restatement of those triples.  It is basically the same however to define a
> temporary term in a local context like A = {set of triples}, then make
> statements about A.  An externalized set should be able to do that and even
> reference a subset in a database elsewhere.
>
>
>
> One use case for this (from Melvin) would be to apply weights to
> statements: { X :magnitude 10 } where X is a uri which identifies the
> statement { :Bob :trusts :Mary } .
>
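
The weighting use case can be sketched with a toy in-memory store that hands out a serial-number ID per statement, so that further statements can use that ID as their subject. The class `StatementStore`, its methods, and the `stmt:` prefix are hypothetical names for illustration only.

```python
from itertools import count

class StatementStore:
    """Toy store: every distinct triple gets one ID that can itself be a subject."""

    def __init__(self):
        self._ids = {}           # (subject, predicate, object) -> statement ID
        self._next = count(1)    # serial numbers local to this store

    def add(self, s, p, o):
        """Add a triple if new and return its statement ID."""
        key = (s, p, o)
        if key not in self._ids:
            self._ids[key] = f"stmt:{next(self._next)}"
        return self._ids[key]

    def objects(self, s, p):
        """Return all objects of triples with the given subject and predicate."""
        return [o for (s2, p2, o) in self._ids if s2 == s and p2 == p]

store = StatementStore()
x = store.add(":Bob", ":trusts", ":Mary")   # X names the statement
store.add(x, ":magnitude", 10)              # { X :magnitude 10 }
```

Here `store.objects(x, ":magnitude")` looks up the weight attached to the trust statement. The IDs are only meaningful relative to this one store, unlike a hash-based canonical name.
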
>
> There are many cases where you need to describe provenance,
> trust/probability, and make statements about groups of statements.  It
> shouldn't be so hard or confusing.
>
>
> Best,
>
> Nathan
>
>
> sdw
>
>
Received on Wednesday, 1 August 2012 15:38:31 UTC