Re: DBpedia hosting burden

Dan,

> Are there any scenarios around eg. BitTorrent that could be explored?
> What if each of the static files in http://dbpedia.org/sitemap.xml
> were available as torrents (or magnet: URIs)? I realise that would
> only address part of the problem/cost, but it's a widely used
> technology for distributing large files; can we bend it to our needs?

If I were the Emperor of LOD, I'd ask all the grand dukes of data
sources to publish fresh dumps as torrents, with control over the UL/DL
ratio :) For reasons I can't understand, this idea is proposed a few
times per year but never tried.

The other approach is to implement scalable and safe patch/diff on RDF
graphs, plus subscriptions to those patches. That's what I'm writing
ATM. Using this toolkit, it would be quite cheap to place a local copy
of LOD on any appropriate box in any workgroup. A local copy will not
require any high-end equipment, for two reasons: the database can be
much smaller than the public one (one may install only a subset of LOD),
and it will usually be less sensitive to the RAM/disk ratio (a small
number of clients results in better locality, because any given
individual tends to browse interrelated data, whereas a crowd produces a
chaotic sequence of requests). Crawlers and mobile apps will not migrate
to local copies, but some complicated queries will move away from the
bottleneck server, and that would be good enough.
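
To make the patch/diff idea concrete, here is a minimal sketch in
Python. It is not the toolkit mentioned above, only an illustration
under simplifying assumptions: a graph is a plain set of
(subject, predicate, object) string triples, blank nodes are ignored,
and the names Triple, GraphPatch, diff_graphs and apply_patch are
hypothetical. "Safe" here means only that a patch is refused when the
local copy does not contain the triples the patch expects to remove.

```python
from typing import AbstractSet, FrozenSet, NamedTuple, Set, Tuple

# Assumption: a triple is three plain strings (IRIs or literals);
# blank nodes are not handled in this sketch.
Triple = Tuple[str, str, str]


class GraphPatch(NamedTuple):
    """A diff between two graph versions: triples to remove and to add."""
    removed: FrozenSet[Triple]
    added: FrozenSet[Triple]


def diff_graphs(old: AbstractSet[Triple], new: AbstractSet[Triple]) -> GraphPatch:
    """Compute the patch that turns `old` into `new`."""
    return GraphPatch(removed=frozenset(old - new), added=frozenset(new - old))


def apply_patch(graph: AbstractSet[Triple], patch: GraphPatch) -> Set[Triple]:
    """Return a patched copy of `graph`.

    Raises ValueError if a triple scheduled for removal is absent,
    which suggests the local copy is not the version the patch expects.
    """
    missing = patch.removed - graph
    if missing:
        raise ValueError(f"{len(missing)} triple(s) to remove are not in the graph")
    return set((graph - patch.removed) | patch.added)
```

A subscriber holding the old dump would then fetch only the patch and
call apply_patch on its local copy, instead of downloading the whole
new dump.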

Best Regards,

Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com

Received on Wednesday, 14 April 2010 18:49:55 UTC