- From: Dan Brickley <danbri@danbri.org>
- Date: Wed, 14 Apr 2010 21:04:54 +0200
- To: public-lod <public-lod@w3.org>, dbpedia-discussion <dbpedia-discussion@lists.sourceforge.net>
On Wed, Apr 14, 2010 at 8:11 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
> Some have cleaned up their act for sure.
>
> Problem is, there are others doing the same thing, who then complain about
> the instance in very generic fashion.

They're lucky it exists at all. I'd refer them to this Louis CK sketch - http://videosift.com/video/Louie-CK-on-Conan-Oct-1st-2008?fromdupe=We-live-in-an-amazing-amazing-world-and-we-complain (if it stays online...).

>> While it is a
>> shame to say 'no' to people trying to use linked data, this would be
>> more saying 'yes, but not like that...'.
>>
>
> I think we have an outstanding blog post / technical note about the DBpedia
> instance that hasn't been published (possibly due to the 3.5 and
> DBpedia-Live work we are doing), said note will cover how to work with the
> instance etc..
[..]
> We do have a solution in mind, basically, we are going to have a different
> place for the descriptor resources and redirect crawlers there via 303's
> etc..
[...]
> We'll get the guide out.

That sounds useful

>> As you mention, DBpedia is an important and central resource, thanks
>> both to the work of the Wikipedia community, and those in the DBpedia
>> project who enrich and make available all that information. It's
>> therefore important that the SemWeb / Linked Data community takes care
>> to remember that these things don't come for free, that bills need
>> paying and that de-referencing is a privilege not a right.
>
> "Bills" the major operative word in a world where the "Bill Payer" and
> "Database Maintainer" is a footnote (at best) re. perception of what
> constitutes the DBpedia Project.

Yes, I'm sure some are thoughtless and take it for granted; but also that others are well aware of the burdens. (For that matter, I'm not myself so sure how Wikipedia cover their costs or what their longer-term plan is...).

> For us, the most important thing is perspective.
> DBpedia is another space on
> a public network, thus it can't magically rewrite the underlying physics of
> wide area networking where access is open to the world. Thus, we can make a
> note about proper behavior and explain how we protect the instance such that
> everyone has a chance of using it (rather than a select few resource
> guzzlers).

This I think is something others can help with, when presenting LOD and related concepts: to encourage good habits that spread the cost of keeping this great dataset globally available. So all those making slides, tutorials, blog posts or software tools have a role to play here.

>> Are there any scenarios around eg. BitTorrent that could be explored?
>> What if each of the static files in http://dbpedia.org/sitemap.xml
>> were available as torrents (or magnet: URIs)?
>
> When we set up the Descriptor Resource host, these would certainly be
> considered.

Ok, let's take care to explore that then; it would probably help others too. There must be dozens of companies and research organizations who could put some bandwidth resources into this, if only there was a short guide to setting up a GUI-less bittorrent tool and configuring it appropriately. Are there any bittorrent experts on these mailing lists who could suggest next practical steps here (not necessarily dbpedia-specific)?

(ah I see a reply from Ivan; copying it in here...)

> If I were The Emperor of LOD I'd ask all grand dukes of datasources to
> put fresh dumps at some torrent with control of UL/DL ratio :) For
> reason I can't understand this idea is proposed few times per year but
> never tried.

I suspect BitTorrent is in some ways a 'taboo' technology, since it is most famous for being used to distribute materials that copyright-owners often don't want distributed. I have no detailed idea how torrent files are made, how trackers work, etc. I started poking around magnet: a bit recently but haven't got a sense for how solid that work is yet.
Could a simple Wiki page be used for sharing torrents? (plus published hash of files elsewhere for integrity checks). What would it take to get started? Perhaps if http://wiki.dbpedia.org/Downloads35 had the sha1 for each download published (rdfa?), then others could experiment with torrents and downloaders could cross-check against an authoritative description of the file from dbpedia?

>> I realise that would
>> only address part of the problem/cost, but it's a widely used
>> technology for distributing large files; can we bend it to our needs?
>>
>
> Also, we encourage use of gzip over HTTP :-)

Are there any RDF toolkits in need of a patch to their default setup in this regard? Tutorials that need fixing, etc?

cheers,

Dan

ps. re big datasets, Library of Congress apparently are going to have complete twitter archive - see http://twitter.com/librarycongress/status/12172217971 -> http://blogs.loc.gov/loc/2010/04/how-tweet-it-is-library-acquires-entire-twitter-archive/
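
To make the sha1 cross-check idea above a little more concrete, here is a rough sketch of what a downloader could do after fetching a dump from a torrent or mirror. It assumes the authoritative sha1 has been published somewhere trustworthy (nothing like that exists on the downloads page as far as this thread says); the function names and the file name in the usage note are made up for illustration.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`.

    The file is read in chunks so that multi-gigabyte dumps
    do not need to fit in memory.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_sha1):
    """Compare a local file against a published hex SHA-1 string.

    `published_sha1` is assumed to come from an authoritative page;
    surrounding whitespace and letter case are ignored.
    """
    return sha1_of_file(path) == published_sha1.strip().lower()
```

Something like `matches_published("some_dump.nt.bz2", sha1_copied_from_the_wiki)` would then tell you whether the copy you got from a third party is the same file the project described, whichever way it was transferred.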
Received on Wednesday, 14 April 2010 19:05:31 UTC