Re: KIT releases 14 billion triples to the Linked Open Data cloud

From: Dan Brickley <danbri@danbri.org>
Date: Thu, 1 Apr 2010 12:41:28 +0200
Message-ID: <n2peb19f3361004010341k21da4a8cw942f33c32d6dfcd5@mail.gmail.com>
To: Matthias Samwald <samwald@gmx.at>
Cc: Denny Vrandecic <denny.vrandecic@kit.edu>, public-lod@w3.org
But I love it :) Do the numbers include dates?

Dan

On Thu, Apr 1, 2010 at 12:30 PM, Matthias Samwald <samwald@gmx.at> wrote:
> Hi Denny,
>
> I am sorry, but I have to voice some criticism of this project. Over the
> past two years, I have become increasingly wary of the excitement over large
> numbers of triples in the LOD community. Large numbers of triples don't
> necessarily mean that a dataset enables us to do anything novel or
> significantly useful. I think there should be a shift from focusing on
> quantity to focusing on quality and usefulness.
>
> Now the project you describe seems to be well-made, but it also exemplifies
> this problem to a degree that I have not seen before. You basically
> published a huge dataset of numbers, for the sake of producing a large
> number of triples. Your announcement mainly emphasizes how huge the dataset
> is, and the corresponding paper does the same. The paper gives a few
> application scenarios, I quote
>
> "The added value of the paradigm shift initiated by our work cannot be
> underestimated. By endowing numbers with an own identity, the linked open
> data cloud will become treasure trove for a variety of disciplines. By using
> elaborate data mining techniques, groundbreaking insights about deep
> mathematical correspondences can be obtained. As an example, using our
> sample dataset, we were able to discover that there are significantly more
> odd primes than even ones, and even more excitingly a number contains 2 as a
> prime factor exactly if its successor does not."
>
> I am sorry, but this sounds a bit overenthusiastic. I see no paradigm
> shift, and I also don't see why your findings about prime numbers required
> you to publish the dataset as linked data. I also have trouble seeing the
> practical value of looking at the resource pages for each number with a
> linked data browser, but I am also not a mathematician.
>
> I am sorry for being a bit antagonistic, but we as a community should really
> try not to be seduced too easily by publishing ever-larger numbers of
> triples.
>
> Cheers,
> Matthias Samwald
>
>
>
>
> --------------------------------------------------
> From: "Denny Vrandecic" <denny.vrandecic@kit.edu>
> Sent: Thursday, April 01, 2010 12:01 PM
> To: <public-lod@w3.org>
> Subject: KIT releases 14 billion triples to the Linked Open Data cloud
>
>> We are happy to announce that the Institute AIFB at the KIT is releasing
>> the biggest dataset until now to the Linked Open Data cloud. The Linked Open
>> Numbers project offers billions of facts about natural numbers, all readily
>> available as Linked Data.
>>
>> Our accompanying peer-reviewed paper [1] gives further details on the
>> background and implementation. We have integrated with external data sources
>> (linking DBpedia to all their 335 number entities) and also directly link to
>> the best-known linked open data browsers from the page.
>>
>> You can visit the Linked Open Numbers project at:
>> <http://km.aifb.kit.edu/projects/numbers/>
>>
>> Or point your linked open data browser directly at:
>> <http://km.aifb.kit.edu/projects/numbers/n1>
>>
>> We are happy to have increased the number of triples on the Web by more
>> than 14 billion triples, roughly 87.5% of the size of the linked data web before
>> this release (see paper for details). We hope that the data set will find
>> its serendipitous use.
>>
>> The data set and the publication mechanism were checked pedantically, and
>> we expect no errors in the triples. If you do find some, please let us know.
>> We intend to be compatible with all major linked open data publication
>> standards.
>>
>> About the AIFB
>>
>> The Institute AIFB (Applied Informatics and Formal Description Methods) at
>> KIT is one of the world-leading institutions in Semantic Web technology.
>> Approximately 20 researchers of the knowledge management research group are
>> establishing theoretical results and scalable implementations for the field,
>> closely collaborating with the sister institute KSRI (Karlsruhe Service
>> Research Institute), the start-up company ontoprise GmbH, and the Knowledge
>> Management group at the FZI Research Center for Information Technologies.
>> Particular emphasis is given to areas such as logical foundations, Semantic
>> Web mining, ontology creation engineering and management, RDF data
>> management, semantic web search, and the implementation of interfaces and
>> tools. The institute is involved in many industry-university co-operations,
>> both on a European and a national level, including a number of intelligent
>> Web systems case studies.
>>
>> Website: <http://www.aifb.kit.edu>
>>
>> About KIT
>>
>> The Karlsruhe Institute of Technology (KIT) is the merger of the former
>> Universität Karlsruhe (TH) and the former Forschungszentrum Karlsruhe. With
>> about 8000 employees and an annual budget of 700 million Euros, KIT is the
>> largest technical research institution within Germany. KIT is both a state
>> university with research and teaching and, at the same time, a large-scale
>> research institution of the Helmholtz Association. KIT has a strong
>> reputation as one of Germany’s universities of excellence, aiming to set the
>> highest standards for education, research and innovation.
>>
>> Website: <http://www.kit.edu>
>>
>> [1] Denny Vrandecic, Markus Krötzsch, Sebastian Rudolph, Uta Lösch:
>> Leveraging Non-Lexical Knowledge for the Linked Open Data Web, published in
>> Rodolphe Héliot and Antoine Zimmermann (eds.), The Fifth RAFT'2010), the
>> yearly bilingual publication on nonchalant research, available at
>> <http://km.aifb.kit.edu/projects/numbers/linked_open_numbers.pdf>
>
>
>
Received on Thursday, 1 April 2010 10:42:02 UTC
