Re: Potential Home for LOD Data Sets - is it Open?

Hugh Glaser wrote:
> Thanks Kingsley.
> Getting there, I think/hope.
> So exactly what is the URI?
> I run something like
> select *
> where
>  {
>    ?s ?p "ZNF492"
>  }
> and get back things like http://purl.org/commons/record/ncbi_gene/57615, but
> these are not URIs in the Amazon cloud, and so if that is where I was
> serving my Linked Data from, they are not right.
>   
Hugh,

You are experimenting with a work in progress.

DBpedia on EC2 is the showcase for now.

<http://kingsley.idehen.name/resource/Linked_Data> will de-reference 
locally and produce triples that connect to 
<http://dbpedia.org/resource/Linked_Data>.
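
As a rough sketch (assuming the instance exposes the usual /sparql
endpoint and that the linkage is expressed via owl:sameAs, as per the
projection described in my earlier mail below), a query like this against
the local endpoint should surface the back-link:

# the owl:sameAs linkage and the /sparql endpoint are assumptions here
prefix owl: <http://www.w3.org/2002/07/owl#>
select ?same
where
 {
   <http://kingsley.idehen.name/resource/Linked_Data> owl:sameAs ?same
 }

with <http://dbpedia.org/resource/Linked_Data> expected among the
bindings for ?same.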

But note that <http://kingsley.idehen.name> is currently serving up 
Neurocommons data while I finish what I am doing with that data set.
> Would it look something like
> http://ec2-67-202-37-125.compute-1.amazonaws.com/record/ncbi_gene/57615 or
> something else?
>   
Yes, and like the DBpedia example, it would link back to the public Neurocommons URIs.
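
As a sketch only (assuming the record URIs are re-based under the
instance host name exactly as above, and that the back-link is again
owl:sameAs), you could verify the linkage with something like:

# the re-based subject URI and the owl:sameAs back-link are assumptions
prefix owl: <http://www.w3.org/2002/07/owl#>
select ?public
where
 {
   <http://ec2-67-202-37-125.compute-1.amazonaws.com/record/ncbi_gene/57615> owl:sameAs ?public
 }

and expect <http://purl.org/commons/record/ncbi_gene/57615> among the
results.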
> Or is it just that neurocommons is not offering resolvable URIs on the EC2
> (if I understand the term), but they could switch on something (in
> Virtuoso?) that would give me back resolvable URIs on Amazon?
>   
The instance on EC2 will do what I stated above once we are done with 
the construction and verification of the de-referencing rules.

> And I am also now wondering who pays when I curl the Amazon URI?
> It can't be me, as I have no account.
>   
The person or organization deploying the Linked Data pays the bill for 
the computing resources from Amazon and for the Linked Data management 
and deployment services from Virtuoso.

> It isn't the person who put the data there, as you said it was being hosted
> for free.
>   
Be careful here: the hosting issue was simply about an additional home 
for the RDF data set archives, i.e. the place from which Quad / Triple 
stores load their data en route to Linked Data publishing (via EC2 or 
somewhere else). In the case of an EC2 AMI, the cost of loading from an 
S3 bucket into the AMI is minimal (if anything at all), since the data 
sits in the same data center as the EC2 instance.
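
As a minimal sketch (the bucket URL and graph name below are made up,
and this assumes the endpoint accepts SPARUL / SPARQL Update requests),
pulling an archive from S3 into a named graph is essentially a one-liner:

# hypothetical bucket URL and target graph
load <http://s3.amazonaws.com/example-bucket/neurocommons.rdf>
  into graph <http://example.org/neurocommons>

The transfer stays inside Amazon's network, which is why the cost is
minimal.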
> I assume that it means that it must be the EC2 owner, who is firing up the
> Virtuoso magic to deliver the RDF for the resolved URI?
>   
Yes, and in the case of Virtuoso you simply have a platform that offers 
Linked Data Deployment with a local de-referencing twist while retaining 
links to original URIs (as per my DBpedia example).

Once I am done with Neurocommons I'll temporarily put DBpedia and 
Neurocommons on different ports at http://kingsley.idehen.name for demo 
purposes :-)


Kingsley
> Best
> Hugh
>
> On 07/12/2008 03:34, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>
>   
>> Hugh Glaser wrote:
>>     
>>> Thanks Kingsley.
>>> In case I am still misunderstanding, a quick question:
>>>
>>> On 06/12/2008 23:53, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>> ...
>>>
>>>       
>>>> Linking Open Data Sets on the Web, is about publishing RDF archives with
>>>> the following characteristics:
>>>>
>>>> 1. De-referencable URIs
>>>>
>>>>         
>>> ...
>>>
>>> So if someone decides to follow this way and puts their Linked Data in the
>>> Amazon cloud using this method, can I de-reference a URI to it using my
>>> normal browser or curl it from my machine?
>>>
>>>       
>> Hugh,
>>
>> Absolutely!
>>
>> For instance, an EC2-based instance of DBpedia will do the following:
>>
>> 1. Localize the de-referencing task (i.e. not pass this on to the general
>> public instance of DBpedia)
>> 2. Project triples that connect back to <http://dbpedia.org> via
>> owl:sameAs (*this was basically what Dan was clarifying in our exchange
>> earlier this week*)
>>
>> The fundamental goal is to use Federation to propagate Linked Data
>> (meme, value prop., and business models) :-)
>>
>> btw - the Neurocommons data set is now live at the following locations:
>>
>> 1. http://kingsley.idehen.name (*temporary, as I simply used this to set
>> up the AMI and verify the entire DB construction process*)
>> 2. http://ec2-67-202-37-125.compute-1.amazonaws.com/ (*instance set up
>> by Hugh to double-verify what I did*)
>>
>> Neurocommons takes about 14hrs+ to construct under the best of
>> circumstances. The process now takes 1.15 hrs, and you have your own
>> personal or service-specific Neurocommons database.
>>
>> Next stop, Bio2Rdf :-)
>>
>>     
>>> Thanks.
>>> Hugh
>>>
>>>
>>>
>>>       
>> --
>>
>>
>> Regards,
>>
>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>> President & CEO
>> OpenLink Software     Web: http://www.openlinksw.com
>>
>>
>>
>>
>>
>>     
>
>
>   


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com

Received on Sunday, 7 December 2008 18:24:59 UTC