- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Mon, 23 Mar 2009 23:06:43 -0400
- To: Steve Judkins <steve@wisdomnets.com>
- CC: 'Hugh Glaser' <hg@ecs.soton.ac.uk>, public-lod@w3.org
Steve Judkins wrote:
> I found Medline to have a pretty nice model for this. Every so often they
> ship a full DB dump in XML as chunked zip files (not more than 1 GB each, if
> I remember). Subscribers just synchronize the FTP directories between the
> Medline server and local server. After that you can process daily diff
> dumps. The downloads were just XML with a stream of record URIs with an
> Add/Modify/Delete attribute, and the data fields that changed. A well
> known graph where you can look for changes to the LOD datasources you care
> about, and get SIOC markup for this that describes the Items, Date, and
> Agents/People doing the modifications. This is a great use case for
> FOAF+SSL & OAuth because you may only automatically process updates from
> Agents you trust (e.g. Wikipedia might only take changes from DBpedia).
>

Steve,

You're very much on the ball here; this is very much the kind of thing
foaf+ssl [1] is about :-)

I was going to unveil similar capabilities re. the DBpedia endpoint down the
line, i.e. SPARQL endpoint behavior aligned to trusted identities etc.

Links:

1. http://esw.w3.org/topic/foaf+ssl - FOAF+SSL

Kingsley

> -Steve
>
> -----Original Message-----
> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On Behalf
> Of Kingsley Idehen
> Sent: Monday, March 23, 2009 3:34 PM
> To: Steve Judkins
> Cc: 'Hugh Glaser'; public-lod@w3.org
> Subject: Re: Potential Home for LOD Data Sets
>
> Steve Judkins wrote:
>
>> It seems like this has the potential to become a nice collaborative
>> production pipeline. It would be nice to have a feed for data updates, so we
>> can fire up our EC2 instance when the data has been processed and packaged
>> by the providers we are interested in. For example, if OpenLink wants to
>> fire up their AMI to process the raw dumps from
>> http://wiki.dbpedia.org/Downloads32 into this cloud storage, we can wait
>> until a Virtuoso-ready package has been produced before we update.
>> As more agents get involved in processing the data, this will allow for more
>> automated notifications of updated dumps or SPARQL endpoints.
>>
>
> Yes, certainly.
>
> Kingsley
>
>> -Steve
>>
>> -----Original Message-----
>> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On Behalf
>> Of Kingsley Idehen
>> Sent: Thursday, December 04, 2008 9:20 PM
>> To: Hugh Glaser
>> Cc: public-lod@w3.org
>> Subject: Re: Potential Home for LOD Data Sets
>>
>> Hugh Glaser wrote:
>>
>>> Thanks for the swift response!
>>> I'm still puzzled - sorry to be slow.
>>> http://aws.amazon.com/publicdatasets/#2
>>> Says:
>>> Amazon EC2 customers can access this data by creating their own personal
>>> Amazon EBS volumes, using the public data set snapshots as a starting
>>> point. They can then access, modify and perform computation on these
>>> volumes directly using their Amazon EC2 instances and just pay for the
>>> compute and storage resources that they use.
>>>
>>> Does this not mean it costs me money on my EC2 account? Or is there some
>>> other way of accessing the data? Or am I looking at the wrong bit?
>>>
>>
>> Okay, I see what I overlooked: the cost of paying for an AMI that mounts
>> these EBS volumes, even though Amazon is charging $0.00 for uploading
>> these huge amounts of data where it would usually charge.
>>
>> So to conclude, using the loaded data sets isn't free, but I think we
>> have to be somewhat appreciative of the value here, right? Amazon is
>> providing a service that is ultimately pegged to usage (utility model),
>> and the usage comes down to the value associated with that scarce resource
>> called time.
>>
>>> Ie Can you give me a clue how to get at the data without using my credit
>>> card please? :-)
>>>
>>
>> You can't; you will need someone to build an EC2 service for you and eat
>> the costs on your behalf.
>> Of course such a service isn't impossible in a
>> "Numerati" [1] economy, but we aren't quite there yet; we need the Linked
>> Data Web in place first :-)
>>
>> Links:
>>
>> 1. http://tinyurl.com/64gsan
>>
>> Kingsley
>>
>>> Best
>>> Hugh
>>>
>>> On 05/12/2008 02:28, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>>
>>> Hugh Glaser wrote:
>>>
>>>> Exciting stuff, Kingsley.
>>>> I'm not quite sure I have worked out how I might use it though.
>>>> The page says that hosting data is clearly free, but I can't see how to
>>>> get at it without paying for it as an EC2 customer.
>>>> Is this right?
>>>> Cheers
>>>>
>>>
>>> Hugh,
>>>
>>> No, shouldn't cost anything if the LOD data sets are hosted in this
>>> particular location :-)
>>>
>>> Kingsley
>>>
>>>> Hugh
>>>>
>>>> On 01/12/2008 15:30, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>>>
>>>> All,
>>>>
>>>> Please see: <http://aws.amazon.com/publicdatasets/> ; potentially the
>>>> final destination of all published RDF archives from the LOD cloud.
>>>>
>>>> I've already made a request on behalf of LOD, but additional requests
>>>> from the community will accelerate the general comprehension and
>>>> awareness at Amazon.
>>>>
>>>> Once the data sets are available from Amazon, database construction
>>>> costs will be significantly alleviated.
>>>>
>>>> We have DBpedia reconstruction down to 1.5 hrs (or less) based on
>>>> Virtuoso's in-built integration with Amazon S3 for backup and
>>>> restoration etc. We could get the reconstruction of the entire LOD
>>>> cloud down to some interesting numbers once all the data is situated in
>>>> an Amazon data center.
>>>>

--

Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software     Web: http://www.openlinksw.com
Received on Tuesday, 24 March 2009 03:07:20 UTC
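
The full-dump-plus-daily-diff model Steve describes in the thread above (a stream of record URIs, each with an Add/Modify/Delete attribute and the changed fields, processed only when it comes from a trusted agent) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Change` class, the `apply_diff` function, the example URIs, and the agent names are all hypothetical, the in-memory dict stands in for a real store, and the simple set-membership check stands in for whatever FOAF+SSL or OAuth verification would actually establish trust. It does not reproduce Medline's real file format.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """One record in a (hypothetical) daily diff dump."""
    uri: str                 # URI of the record being changed
    action: str              # "Add", "Modify", or "Delete"
    agent: str               # identifier of the agent that made the change
    fields: dict = field(default_factory=dict)  # only the fields that changed


def apply_diff(store, changes, trusted_agents):
    """Apply changes to store in order, skipping untrusted agents.

    Returns the list of changes that were skipped, so they can be
    reviewed manually instead of being applied automatically.
    """
    skipped = []
    for change in changes:
        # Stand-in for a real FOAF+SSL / OAuth trust decision (assumption).
        if change.agent not in trusted_agents:
            skipped.append(change)
            continue
        if change.action == "Add":
            store[change.uri] = dict(change.fields)
        elif change.action == "Modify":
            # Merge only the changed fields into the existing record.
            store.setdefault(change.uri, {}).update(change.fields)
        elif change.action == "Delete":
            store.pop(change.uri, None)
        else:
            raise ValueError(f"unknown action: {change.action!r}")
    return skipped


# Local copy, as if restored from the last full dump.
store = {"http://example.org/r/1": {"title": "old"}}

# One day's diff: two changes from a trusted agent, one from an unknown agent.
changes = [
    Change("http://example.org/r/1", "Modify", "dbpedia", {"title": "new"}),
    Change("http://example.org/r/2", "Add", "dbpedia", {"title": "second"}),
    Change("http://example.org/r/1", "Delete", "unknown-agent"),
]

skipped = apply_diff(store, changes, trusted_agents={"dbpedia"})
```

Returning the skipped changes rather than silently dropping them is one possible design choice; it leaves room for the "only automatically process updates from Agents you trust" policy while still letting a human look at everything else.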