- From: Frederick Giasson <fred@fgiasson.com>
- Date: Wed, 18 Mar 2009 08:44:31 -0400
- To: marko@lanl.gov
- Cc: public-lod@w3.org
Hi Marko,

> Thank you for your thorough response. Yes--it looks like an LOD voID
> instance is the thing to have.
>
> I originally created my csv of the LOD cloud by painstakingly going from
> the OmniGraffle PNG to a csv file by hand (see a visualization of that csv
> file in [1]). If I had the original OmniGraffle source, I could ensure
> accuracy in this manual method by removing links in the OmniGraffle
> representation as I created them in the csv representation. If someone
> could provide me that OmniGraffle source I could create a clean LOD csv
> file and provide the group with it for their interests much like Frederick
> did with the UMBEL work.

Ok, good. Then you will have to ask Richard or Chris (or someone else?) for the raw OmniGraffle graph. What is important at that point is figuring out what criteria were used to create that graph: how it was decided which dataset links to which other dataset. I think this is important because multiple different graphs can be created out of the linkage between these datasets, and each of these different graphs can lead to different outcomes and values.

About your last email with Kingsley (the one that talks about voID), you said:

"Moreover, and to exacerbate the problem, pull-based HTTP is not the best mechanism for walking a graph. Again (as I've said in other work), LOD following the WWW paradigm will be limited in the sophistication of the type of algorithms (graph or otherwise) that can be practically executed on it."

I think we have to be careful here. As you know, RDF is basically used to describe things. The fact that a resource is accessible on the Web (by URI dereferencing, for example) is only a means of making the descriptions of those "things" easily accessible. However, I am not sure I would suggest building graph analysis methods by fetching all the data, piece by piece, over the Web using URI dereferencing. What I would do instead is try to get the whole dataset as a standalone archive file. If none is available then, in the worst case, I would dereference all the URIs on the Web (a big task that could take weeks; but well, that is the worst-case scenario, no? :) )

So the goal is to aggregate all this linkage data on a local server, and then query that data to produce the input for whatever software performs the graph analysis algorithms. This is what we did for Cytoscape: we took RDF data and queried the graph to create a CSV file ingestible by Cytoscape (a rough sketch of that workflow is below). For us, RDF is nothing more than the canonical form of data description, which can then be reused in any number of ways.

Take care,

Fred
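P.S. If it helps, here is a minimal sketch of the workflow I have in mind, assuming Python with rdflib and a voID description of the cloud. The file names, seed URIs, and the voID Linkset pattern are illustrative assumptions on my part, not the exact code we used for Cytoscape:

    # Rough sketch of the two steps above: aggregate RDF descriptions
    # locally, then query the local copy to build a CSV edge list that
    # a tool like Cytoscape can ingest. File names and URIs are made up.
    import csv
    from rdflib import Graph

    g = Graph()

    # Best case: load a standalone dump of the dataset.
    g.parse("lod-dump.nt", format="nt")  # hypothetical archive file

    # Worst case: dereference URIs one by one over HTTP (slow; this
    # could take weeks for the whole cloud).
    for uri in ["http://example.org/dataset/a", "http://example.org/dataset/b"]:
        g.parse(uri)  # rdflib asks for RDF via content negotiation

    # Query the aggregated graph; here, voID Linksets that name
    # dataset-to-dataset links.
    query = """
    PREFIX void: <http://rdfs.org/ns/void#>
    SELECT ?source ?target WHERE {
      ?ls a void:Linkset ;
          void:subjectsTarget ?source ;
          void:objectsTarget ?target .
    }
    """

    with open("lod-edges.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["source", "target"])  # edge-table header
        for row in g.query(query):
            writer.writerow([row.source, row.target])

The same pattern should work for any other analysis tool that wants a flat edge list: keep RDF as the canonical description, and derive whatever tabular form the tool needs.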
Received on Wednesday, 18 March 2009 12:43:30 UTC