Re: [HELP] Can you please update information about your dataset? from Bernard Vatant on 2009-08-14 (public-lod@w3.org from August 2009)

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Fri, 14 Aug 2009 16:08:38 +0200
To: Richard Cyganiak <richard@cyganiak.de>
CC: "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <4A856FE6.1020406@mondeca.com>

Richard, all

I've done my homework and added a voiD description of the lingvoj.org 
dataset at http://www.lingvoj.org/void
It's still minimal, but it does at least include statistics. Information 
about links will be added ASAP.
For those who might care, note that it links to a new FOAF profile at 
http://www.lingvoj.org/foaf.rdf
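
As an aside on Richard's point 1 quoted below: an "analyzer that reads a 
dump and spits out a skeleton voiD file" could look roughly like the 
sketch that follows. It is an illustration only and makes no claim about 
how any existing voiD file was produced. It assumes the dump is in 
N-Triples and uses a deliberately naive line pattern rather than a real 
parser. The function name skeleton_void and the example.org URIs are 
hypothetical. The statistics properties (void:triples, 
void:distinctSubjects, void:properties) come from the voiD vocabulary at 
http://rdfs.org/ns/void# and may differ from what a given voiD file uses.

```python
import re

# Naive N-Triples pattern (assumption: one triple per line, no exotic syntax).
# Captures subject, predicate, and everything up to the terminating dot.
TRIPLE = re.compile(r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def skeleton_void(ntriples_text, dataset_uri, homepage):
    """Return a skeleton voiD description (Turtle) with basic statistics."""
    subjects, predicates, triples = set(), set(), 0
    for line in ntriples_text.splitlines():
        # Skip blank lines and comments.
        if not line.strip() or line.lstrip().startswith('#'):
            continue
        match = TRIPLE.match(line)
        if match is None:
            continue  # skip lines the naive pattern cannot handle
        triples += 1
        subjects.add(match.group(1))
        predicates.add(match.group(2))
    return "\n".join([
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f"    foaf:homepage <{homepage}> ;",
        f"    void:triples {triples} ;",
        f"    void:distinctSubjects {len(subjects)} ;",
        f"    void:properties {len(predicates)} .",
    ])


if __name__ == "__main__":
    # Hypothetical three-triple dump, for illustration only.
    sample = (
        '<http://example.org/a> <http://example.org/p> "x" .\n'
        '<http://example.org/a> <http://example.org/q> <http://example.org/b> .\n'
        '<http://example.org/b> <http://example.org/p> "y" .\n'
    )
    print(skeleton_void(sample, "http://example.org/dataset", "http://example.org/"))
```

The output is only a starting point: labels, licensing, and link 
information would still have to be added by hand by the maintainer.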

Bernard

Richard Cyganiak wrote:
> The problem at hand is: How to get reasonably accurate and up-to-date 
> statistics about the LOD cloud?
>
> I see three workable methods for this.
>
> 1. Compile the statistics from voiD descriptions published by 
> individual dataset maintainers. This is what Hugh proposes below. 
> Enabling this is one of the main reasons why we created voiD. There 
> have to be better tools for creating voiD before this happens. The tools 
> could be, for example, manual entry forms that spit out voiD 
> (voiD-o-matic?), or analyzers that read a dump and spit out a skeleton 
> voiD file.
>
> 2. Hand-compile the statistics by watching public-lod, trawling 
> project home pages, emailing dataset maintainers, and fixing things 
> when dataset maintainers complain. This is how I created the original 
> LOD cloud diagram in Berlin, and after I left Berlin, Anja has done a 
> great job keeping it up to date despite its massive growth. We will 
> continue to update it on a best-effort basis for the foreseeable 
> future. A voiD version of the information underlying the diagram is in 
> the pipeline. Others can do as we did.
>
> 3. Anyone who has a copy of a big part of the cloud (e.g. OpenLink and 
> we at Sindice) can potentially calculate the statistics. This is 
> non-trivial: we just have triples and need to reverse-engineer 
> datasets and linksets from them, it involves computation over quite 
> serious amounts of data, and in the end you still won't have good 
> labels or homepages for the datasets. While this 
> approach is possible, it seems to me that there are better uses of 
> engineering and research resources.
>
> There is a fourth process that, IMO, does NOT work:
>
> 4. Send an email to public-lod asking "Everyone please enter your 
> dataset in this wikipage/GoogleSpreadsheet/fancyAppOfTheWeek."
>
> Best,
> Richard
>


-- 

Bernard Vatant
Senior Consultant
Vocabulary & Data Engineering
Tel:       +33 (0) 971 488 459
Mail:     bernard.vatant@mondeca.com <mailto:bernard.vatant@mondeca.com>
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    www.mondeca.com <http://www.mondeca.com>
Blog:    Leçons de Choses <http://mondeca.wordpress.com/>
----------------------------------------------------

Received on Friday, 14 August 2009 14:09:43 UTC