Re: Available approaches for keyword based querying RDF federations from Paul Houle on 2014-08-13 (public-lod@w3.org from August 2014)

From: Paul Houle <ontology2@gmail.com>
Date: Wed, 13 Aug 2014 10:49:56 -0400
To: Thilini Cooray <thilinicooray.ucsc@gmail.com>
Cc: Linked Data community <public-lod@w3.org>
Message-ID: <CAE__kdQVBv3P_GE7tcs64hG1do6sHy4nt3_+YJ_nsLkWgLg1jA@mail.gmail.com>

I would tend to stick up for a non-federated approach, in the sense of
gathering data from the federated sources into a centralized knowledge base
and then querying that. This is akin to how Google and Bing do web search:
by crawling the web and building a distributed index.

I can point to a number of reasons for this,  but some major ones are

* many of the better IR algorithms depend on corpus-wide statistics,  topic
modeling,  and other methods that need a global view (or at least a good
sample of a global view)
* even distributed search systems such as Solr (which, in contrast to
federated search, are well controlled because the machines are in the same
data center, there is a deliberate approach to dealing with failures,
etc.) are not terribly scalable, for the following reason. If you run a
query against N shards, the time it takes to complete the query is at
least the maximum of the N shard response times. As N gets bigger, the
probability that some glitch happens on at least one shard gets bigger and
bigger. Specifically, when N>10 it is pretty hard to maintain an acceptable
response time for interactive use.
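
The fan-out argument in the second point can be illustrated with a minimal
sketch. It assumes that each shard is slow independently with the same
probability p_slow; the function names and the value p_slow = 0.01 are
illustrative assumptions, not measurements from Solr or any other system.

```python
def prob_any_slow(n_shards: int, p_slow: float) -> float:
    """Probability that at least one of n_shards is slow.

    Assumes shards are slow independently, each with probability p_slow
    (an illustrative simplification).
    """
    return 1.0 - (1.0 - p_slow) ** n_shards


def fan_out_latency(shard_latencies):
    """Lower bound on query time: the query cannot finish before its
    slowest shard has answered (merge overhead is ignored here)."""
    return max(shard_latencies)


if __name__ == "__main__":
    # p_slow = 0.01 is a made-up value chosen only to show the trend.
    for n in (1, 10, 100):
        print(f"N={n:3d}  P(at least one slow shard)={prob_any_slow(n, 0.01):.3f}")
```

Under these assumptions a glitch that affects 1 query in 100 on a single
shard affects roughly 1 query in 10 at N=10, and most queries at N=100.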

I'd say practically "centralized" search engines like Google and Bing have
won the internet search war.  For various reasons, meta-search,  deep web
search and similar services haven't really caught on.

On Wed, Aug 13, 2014 at 8:12 AM, Thilini Cooray <
thilinicooray.ucsc@gmail.com> wrote:

> Hi,
>
> I would like to know available approaches for  keyword based querying RDF
> federations.
>
> I found the following approach :
> FedSearch: Efficiently Combining Structured Queries and Full-Text Search
> in a SPARQL Federation by
> Andriy Nikolov
> <http://link.springer.com/search?facet-author=%22Andriy+Nikolov%22>,
> Andreas Schwarte
> <http://link.springer.com/search?facet-author=%22Andreas+Schwarte%22>,
> Christian Hütter
> <http://link.springer.com/search?facet-author=%22Christian+H%C3%BCtter%22>
>
> I would like to know whether there are any other approaches.
>
> Regards,
> Thilini Cooray
>

-- 
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254    paul.houle on Skype   ontology2@gmail.com

Received on Wednesday, 13 August 2014 14:50:28 UTC