Re: deterministic naming of blank nodes

On 2015-05-13, ahogan@dcc.uchile.cl wrote:

> One drawback of the current approach is that at the moment you have to 
> load the RDF graph into memory, which may be prohibitive for very very 
> large graphs (of connected blank nodes), but I'm not sure how often 
> that happens in practice. (It wasn't a problem for any of the graphs 
> in the BTC-14 dataset for example.)

Once you factor in the named nodes, whose names also encode a good 
deal of connectivity information, you ought in theory to be able to 
apply buffer-tree-style thinking to the lexical sort of the named 
nodes, combined with a rough, incremental topological sort of the 
blank nodes hanging off the named ones. With a little thought, that 
ought to yield an algorithm which is both cache-local and 
distributable enough to handle practical graphs with sizes well into 
current big data territory.
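
To make the first half of that concrete, here is a rough sketch in 
Python. It is not what blabel does, and it is nowhere near a full 
canonical labelling. It assumes triples arrive as (subject, predicate, 
object) string tuples with blank nodes written "_:label"; the names 
is_blank and anchor_keys are made up for illustration. The in-memory 
sorted() call stands in for what would have to be an external or 
buffer-tree sort on a really large graph.

```python
def is_blank(term):
    # Assumption: blank nodes are serialised N-Triples style, as "_:label".
    return term.startswith("_:")


def anchor_keys(triples):
    """Map each blank node hanging directly off a named node to the
    lexically smallest (named subject, predicate) pair that reaches it.

    The key never looks at the blank node's own label, so renaming
    blank nodes in the input does not change it.
    """
    # Keep only edges from a named subject to a blank object, in
    # lexical order of (subject, predicate).
    edges = sorted(
        (s, p, o) for (s, p, o) in triples
        if not is_blank(s) and is_blank(o)
    )
    keys = {}
    for s, p, o in edges:
        # Edges arrive in sorted order, so the first key seen for a
        # blank node is its smallest one.
        keys.setdefault(o, (s, p))
    return keys


triples = [
    ("http://ex/b", "http://ex/p", "_:x"),
    ("http://ex/a", "http://ex/q", "_:y"),
    ("http://ex/a", "http://ex/p", "_:x"),
    ("_:x", "http://ex/p", "_:z"),
]
keys = anchor_keys(triples)
```

Blank nodes reachable only through other blank nodes ("_:z" above) get 
no key here; that is where the incremental topological pass would have 
to take over. Two blank nodes sharing the same key are also left 
unresolved, which is precisely the hard part of the problem.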

> https://github.com/aidhog/blabel/wiki

Nice stuff.
-- 
Sampo Syreeni, aka decoy - decoy@iki.fi, http://decoy.iki.fi/front
+358-40-3255353, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2

Received on Wednesday, 13 May 2015 19:02:59 UTC