- From: Michael Miller <Michael.Miller@systemsbiology.org>
- Date: Sat, 19 Jan 2013 11:23:07 -0800
- To: Andrea Splendiani <andrea.splendiani@deri.org>
- Cc: Kingsley Idehen <kidehen@openlinksw.com>, public-semweb-lifesci@w3.org
hi andrea,

thanks. here's the info i found for neo4j limits [1]:

"11.5.4. Data size

In Neo4j, data size is mainly limited by the address space of the primary
keys for Nodes, Relationships, Properties and RelationshipTypes. Currently,
the address space is as follows:

nodes: 2**35 (∼ 34 billion)
relationships: 2**35 (∼ 34 billion)
properties: 2**36 to 2**38 depending on property types (maximum ∼ 274
billion, always at least ∼ 68 billion)
relationship types: 2**15 (∼ 32 000)"

because the queries we make tend to go against different partitions of the
graph, we get a performance boost using sharding.

cheers,
michael

[1] http://docs.neo4j.org/chunked/snapshot/capabilities-capacity.html#capabilities-data

> -----Original Message-----
> From: Andrea Splendiani [mailto:andrea.splendiani@deri.org]
> Sent: Saturday, January 19, 2013 4:24 AM
> To: Michael Miller
> Cc: Kingsley Idehen; public-semweb-lifesci@w3.org
> Subject: Re: Facebook's new Graph Search: An endorsement of the RDF
> approach to healthcare data?
>
> Hi,
>
> RDF/Triplestores and Neo4J can both be used as technologies to represent
> graph structures (like p-p interactions). Neo4J may offer a slightly more
> natural representation of edge attributes for some, but otherwise they both
> can "hold graphs".
> Then they are different tools.
> If you go for queries, I think RDF/Triplestores have an edge. They are
> naturally the technology to use if you want to make queries that span
> different web-distributed resources. But even to query your own dataset,
> SPARQL is pretty rich, and I guess more likely optimized for triplestores
> than as a front-end to Neo4J (though that's a guess).
> However, if you are into graph analysis, you may want to do lots of simple
> calls to the graph (I'm thinking about some path analysis). Here SPARQL is
> too heavy. It can be that some triplestores offer some native interfaces to
> graphs, but I think Neo4J has an advantage in this case (it's more focused,
> less overhead).
>
> Another thing to consider: last time I had a look at Neo4J, I think it was
> limited to 4B nodes on a single instance machine.
>
> best,
> Andrea
>
>
> On 18 Jan 2013, at 18:14, Michael Miller
> <Michael.Miller@systemsbiology.org> wrote:
>
> > hi kingsley,
> >
> > neo4j is a nosql graph database with (my knowledge is limited so please
> > forgive if i misspeak) attributes for nodes, including type, and
> > attributes for edges.
> >
> > RDF is actually just triples, the syntax the RDF is expressed in is the
> > notation and the data model is implicit, if i understand right, but can be
> > captured by an ontology. you can only really express a 'subject ->
> > predicate -> (object|primitive)' as a single triple but triples can be
> > linked together by a common subject, which gives that subject multiple
> > 'attributes' or by a common object and subject which allows traversal.
> >
> > a general graph allows a subject to have multiple predicates specified for
> > it, which is the major difference from RDF. it also can represent a data
> > model, ours certainly does with proteins, genes and drugs being some of
> > the objects.
> >
> > in fact i believe there is a fairly straight-forward translation between
> > RDF and the more general graph. tinkerpop can go from RDF to neo4j
> > amongst other graph databases [1]. there's also a great thread on
> > performance tuning for loading triples [2] into neo4j.
> >
> > i didn't find much on general graphs to RDF but there is a fair amount of
> > information for conceptual graphs to RDF [3].
> >
> > i think what makes neo4j a better choice for us is that, for example, when
> > a search is performed, there will be a constraint on what type of node(s)
> > and what type of edge(s) should be traversed. neo4j is very good at
> > allowing us to make indices based on the type of edge or node.
> >
> > cheers,
> > michael
> >
> > [1] http://java.dzone.com/news/rdf-data-neo4j-tinkerpop-story
> > [2] https://groups.google.com/forum/?fromgroups#!searchin/neo4j/rdf/neo4j/g8bV8w3LH9E/WIgx5GP14KAJ
> > [3] http://www.google.com/url?sa=t&rct=j&q=&esrc=s&frm=1&source=web&cd=2&cad=rja&ved=0CEYQFjAB&url=http%3A%2F%2Fwww.lirmm.fr%2F~croitoru%2Frdfs.pdf&ei=LXr4UKmTPJDZigK22oDgDg&usg=AFQjCNGMzLXob8zCs0-j_85uFtR_a6Y26Q
> >
> >> -----Original Message-----
> >> From: Kingsley Idehen [mailto:kidehen@openlinksw.com]
> >> Sent: Thursday, January 17, 2013 1:38 PM
> >> To: public-semweb-lifesci@w3.org
> >> Subject: Re: Facebook's new Graph Search: An endorsement of the RDF
> >> approach to healthcare data?
> >>
> >> On 1/17/13 1:45 PM, Michael Miller wrote:
> >>> the developer who wrote the app looked at RDF but settled on neo4j
> >>> because it seemed to scale better.
> >> RDF is a framework comprised of:
> >>
> >> 1. Data Model
> >> 2. Syntax
> >> 3. Notations.
> >>
> >> How do you compare that with a DBMS product? The comparison isn't
> >> like for like.
> >>
> >> --
> >>
> >> Regards,
> >>
> >> Kingsley Idehen
> >> Founder & CEO
> >> OpenLink Software
> >> Company Web: http://www.openlinksw.com
> >> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
> >> Twitter/Identi.ca handle: @kidehen
> >> Google+ Profile: https://plus.google.com/112399767740508618350/about
> >> LinkedIn Profile: http://www.linkedin.com/in/kidehen
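
As a rough illustration of the RDF-to-property-graph translation mentioned in the thread, here is a minimal Python sketch. It is not how tinkerpop or neo4j do it: the Literal wrapper, the function name and the ex: identifiers are all made up for the example. The assumption is simply that a triple whose object is a primitive becomes an attribute on the subject node, and a triple whose object is another resource becomes a typed edge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical marker for a primitive (non-resource) object value."""
    value: object


def triples_to_property_graph(triples):
    """Split (subject, predicate, object) triples into nodes and edges.

    Returns (nodes, edges) where nodes maps a node id to its attribute
    dict and edges is a list of (source, edge_type, target) tuples.
    A repeated predicate with a literal object overwrites the earlier
    value; a real converter would need a policy for multi-valued
    attributes.
    """
    nodes = {}
    edges = []
    for subject, predicate, obj in triples:
        attributes = nodes.setdefault(subject, {})
        if isinstance(obj, Literal):
            # Primitive object: store it as an attribute of the subject node.
            attributes[predicate] = obj.value
        else:
            # Resource object: make sure the target node exists, add an edge.
            nodes.setdefault(obj, {})
            edges.append((subject, predicate, obj))
    return nodes, edges


# Made-up data in the spirit of the proteins, genes and drugs model.
example_triples = [
    ("ex:proteinA", "ex:name", Literal("protein A")),
    ("ex:geneA", "ex:encodes", "ex:proteinA"),
    ("ex:drugX", "ex:targets", "ex:proteinA"),
]

nodes, edges = triples_to_property_graph(example_triples)
for node_id, attributes in nodes.items():
    print(node_id, attributes)
for edge in edges:
    print(edge)
```

Keeping the predicate as the edge type is what would let a store index or filter traversals by edge type, which is the property of neo4j described above as the reason for choosing it.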
Received on Saturday, 19 January 2013 19:23:32 UTC