Re: triple Indexing for Apps like Cimba from Brent Shambaugh on 2015-01-20 (public-lod@w3.org from January 2015)

From: Brent Shambaugh <brent.shambaugh@gmail.com>
Date: Mon, 19 Jan 2015 21:09:42 -0600
To: Timothy Holborn <timothy.holborn@gmail.com>, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <CACvcBVrZv35SvEPsrkCUD1+M6FhwmEOSC8M0shV0o4+N1Q_t7w@mail.gmail.com>

Tim,

I think I ran across a couple of papers about this.  Maybe this was it?
Basically, semantics actually help DHTs, as they are kind of naive. Hoping
to get back to this.

Loser, Alexander et al., Semantic Social Overlay Networks, IEEE Journal on
Selected Areas in Communications, Vol. 25, No. 1, January 2007,
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.72.7668&rep=rep1&type=pdf

-Brent Shambaugh

Website: bshambaugh.org

On Sun, Jan 18, 2015 at 3:20 PM, Timothy Holborn <timothy.holborn@gmail.com>
wrote:

> Interesting..
>
> I ponder the use of DHT perhaps, yet not sure about the likely size...
>
> Webizen is a service[0]/repo[1]
>
> Assuming RWW Clustering Accounts (i.e. provider / subdomains, et al.),
> perhaps the base install uses a look-up service, which it is pointed at,
> like a time server...? There's no point decentralising at the account level.
>
> Equally, one might consider that the server would index its own record,
> and perhaps a relationship graph out to an var. int.
>
> Melvin's been dealing with decentralised block-chain storage.  I imagine
> this is a similar challenge.
>
> [0] http://webizen.org/
> [1] https://github.com/linkeddata/webizen
>
> Tim.H.
>
> On 19 January 2015 at 04:18, Brent Shambaugh <brent.shambaugh@gmail.com>
> wrote:
>
>> Andrei (and others in the reply all?),
>>
>> Last year you gave a talk about cimba.co at MIT. During the Q&A there
>> was some discussion about what sort of index or triple retrieval mechanism
>> there would be. Sandro Hawke put up the talk, which I linked to here [0]. I
>> was wondering if you came up with something.
>>
>> Thanks for your time.
>>
>> My thoughts:
>>
>> From what I have read, it is difficult to index everything. The best you
>> can do is index triples that are "important" and that will eventually lead
>> you to less important triples that you might want.
>>
>> Perhaps this is accomplished by some form of semantic clustering? Perhaps
>> this clustering is accomplished by some sort of distributed RDF store, such
>> as Swarm Linda [1]. Or perhaps this clustering is accomplished by only
>> indexing the names of linked data containers with some sort of description
>> of what they are about. Or perhaps collections, which seem to have less
>> defined structure about what they contain and can exist (iirc) at
>> multiple network nodes with different ownership, are described in some way
>> and cleaned up to be more queryable using the swarm intelligence provided
>> by Swarm Linda, or something similar, like building a folksonomy with
>> Twitter tags [2]. I might need to compare these more, but it seems you are looking
>> at semantic and syntactic similarities where the semantic similarities need
>> some sort of global reference to make things more manageable/possible.
>>
>> For the index, you need either a centralized or a decentralized index. If
>> you want to be a purist about decentralization, even YaCy won't do, since
>> there are 4 nodes that are not decentralized [3]. I don't know much here,
>> but there may be times when you want a centralized index. Perhaps
>> P2P would introduce too much latency and use too much bandwidth in the
>> network. Perhaps sometimes you want P2P because you are constructing a mesh
>> network where you might even want local versions of some ontologies because
>> you are closed off for some reason.
>>
>> [0]
>> http://adistributedeconomy.blogspot.com/2014/12/links-to-building-social-applications.html?m=1
>> [1]
>> http://www.mi.fu-berlin.de/inf/publications/techreports/tr2009/B-09-04/TR-B-09-04.pdf?1346662692
>> [2]
>> http://people.kmi.open.ac.uk/motta/papers/SpeciaMotta_ESWC-2007_Final.pdf
>> [3] https://fedcsis.org/proceedings/2011/pliks/237.pdf
>>
>>
>>
>>
>

Received on Tuesday, 20 January 2015 03:10:10 UTC