
Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices

From: Michael Hausenblas <michael.hausenblas@deri.org>
Date: Thu, 21 Oct 2010 13:45:42 +0100
To: Giovanni Tummarello <giovanni.tummarello@deri.org>, Chris Bizer <chris@bizer.de>
CC: Linked Data community <public-lod@w3.org>
Message-ID: <C8E5F486.1610E%michael.hausenblas@deri.org>

(cutting down lists as cross-posting is against W3C list policy)

In general I'm with Chris. Of course RDFa is/can be used to do Linked Data.
But rather than wasting our time ranting about how bad the world is, how about
just making it a better place?

1. Clearly, we need to motivate why interlinking is beneficial or at least
offer 3rd party services that do the job for the publishers (if they don't
see the benefit or have other priorities).

2. Again, rather than endlessly discussing what is fair and what is
not and who should be there and who not and so on ... hey, it's the Web. An
open, free ecosystem where you can just put up your own visualisation,
diagram, stats, etc. - the community will then decide how valuable and
useful it is.

/me back to work now; trying to help solve the issues rather than talking
about them in the first place ;)

Correcting one factual error in Gio's post, though:

> So danny ayers has fun linking to dbpedia so he is in there with his
> joke dataset, but you cant credibly bring that argument to large
> retailers so they're left out?

Denny Vrandecic.


Dr. Michael Hausenblas
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730

> From: Giovanni Tummarello <giovanni.tummarello@deri.org>
> Date: Thu, 21 Oct 2010 13:12:10 +0100
> To: Chris Bizer <chris@bizer.de>
> Cc: Martin Hepp <martin.hepp@ebusiness-unibw.org>, Thomas Steiner
> <tsteiner@google.com>, Semantic Web community <semantic-web@w3.org>, Linked
> Data community <public-lod@w3.org>, Anja Jentzsch <anja@anjeve.de>,
> semanticweb <semanticweb@yahoogroups.com>, Kingsley Idehen
> <kidehen@openlinksw.com>
> Subject: Re: AW: ANN: LOD Cloud - Statistics and compliance with best
> practices
> Resent-From: Linked Data community <public-lod@w3.org>
> Resent-Date: Thu, 21 Oct 2010 12:12:41 +0000
>> But again: I agree that crawling the Web of Data and then deriving a dataset
>> catalog as well as meta-data about the datasets directly from the crawled
>> data would be clearly preferable and would also scale way better.
>> Thus: Could please somebody start a crawler and build such a catalog?
>> As long as nobody does this, I will keep on using CKAN.
> Hi Chris, all
> I can only restate that within Sindice we're very open to anyone who
> wants to develop data analysis apps creating catalogs automatically.
> A map reduce job a couple of weeks ago gave in excess of
> 100k independent datasets. How many are interlinked etc.? To be analyzed.
> Our interest (and the interest of the Semantic Web vision i want to
> sponsor) is to make sure RDFa sites are fully included and so are those
> who provide markup which can however be translated in an
> automatic/agreeable way (so no scraping or "sponging") into RDF. (that
> is anything that any23.org can turn into triples)
> If you are indeed interested in running or developing your
> algorithms on our running dataset, no problem; the code can be made
> open source so it would run on other similarly structured datasets.
> This said yes i think too that in this phase a CKAN like repository
> can be an interesting aggregation point, why not.
> But i do think the diagram, which made great sense as an example when
> Richard started it, is now at risk of doing a disservice,
> which is in line with what Martin is pointing out.
> The diagram as it is now kinda implicitly conveys the sense that if
> something is so large then all that matters must be there and that's
> absolutely not the case.
> a) there are plenty of extremely useful datasets in RDF/RDFa etc which
> are not there
> b) the usefulness of being linked is anything but a proven fact, so on the
> one hand people might want to "be there", on the other you'd have to
> push serious commercial entities (for example) to "link to
> dbpedia" for reasons that aren't clear and that hurts your credibility.
> So danny ayers has fun linking to dbpedia so he is in there with his
> joke dataset, but you cant credibly bring that argument to large
> retailers so they're left out?
> this would be ok if the diagram was just "hey its my own thing i set
> my rules" - fine but the fanfare around it gives it a different
> meaning and thus the controversy above.
> .. just tried to put in words what might be a general unspoken feeling..
> Short message recap
> a) ckan - nice why not might be useful but..
> b) generated diagram : we have the data or can collect it so whoever
> is interested in analytics pls let us know and we can work it out
> (matter of fact it turns out most of us in here are paid by the EU for
> doing this in collaborative projects :-) )
> cheers
> Giovanni
Received on Thursday, 21 October 2010 12:46:21 UTC
