
Re: KIT releases 14 billion triples to the Linked Open Data cloud

From: John Erickson <olyerickson@gmail.com>
Date: Thu, 1 Apr 2010 09:01:08 -0400
Message-ID: <u2nb813a3fb1004010601paef47270x6a4bfe5d124e7757@mail.gmail.com>
To: public-lod@w3.org
RE Figure 1: *Finally* we have an update to the "July 2009" Web of
Data diagram!!!

Great work!!

On Thu, Apr 1, 2010 at 8:43 AM, Denny Vrandecic <denny.vrandecic@kit.edu> wrote:
> No, that is left for future work (as said in the paper).
> Cheers,
> denny
> On Apr 1, 2010, at 12:41, Dan Brickley wrote:
>> But I love it :) Do the numbers include dates?
>> Dan
>> On Thu, Apr 1, 2010 at 12:30 PM, Matthias Samwald <samwald@gmx.at> wrote:
>>> Hi Denny,
>>> I am sorry, but I have to voice some criticism of this project. Over the
>>> past two years, I have become increasingly wary of the excitement over large
>>> numbers of triples in the LOD community. Large numbers of triples don't
>>> necessarily mean that a dataset enables us to do anything novel or
>>> significantly useful. I think there should be a shift from focusing on
>>> quantity to focusing on quality and usefulness.
>>> Now the project you describe seems to be well-made, but it also exemplifies
>>> this problem to a degree that I have not seen before. You basically
>>> published a huge dataset of numbers, for the sake of producing a large
>>> number of triples. Your announcement mainly emphasizes how huge the dataset
>>> is, and the corresponding paper does the same. The paper gives a few
>>> application scenarios, I quote
>>> "The added value of the paradigm shift initiated by our work cannot be
>>> underestimated. By endowing numbers with an own identity, the linked open
>>> data cloud will become treasure trove for a variety of disciplines. By
>>> using elaborate data mining techniques, groundbreaking insights about deep
>>> mathematical correspondences can be obtained. As an example, using our
>>> sample dataset, we were able to discover that there are significantly more
>>> odd primes than even ones, and even more excitingly a number contains 2 as
>>> a prime factor exactly if its successor does not."
>>> I am sorry, but this sounds a bit overenthusiastic. I see no paradigm
>>> shift, and I also don't see why your findings about prime numbers required
>>> you to publish the dataset as linked data. I also have trouble seeing the
>>> practical value of looking at the resource pages for each number with a
>>> linked data browser, but I am also not a mathematician.
>>> I am sorry for being a bit antagonistic, but we as a community should really
>>> try not to be seduced too easily by publishing ever-larger numbers of
>>> triples.
>>> Cheers,
>>> Matthias Samwald
>>> --------------------------------------------------
>>> From: "Denny Vrandecic" <denny.vrandecic@kit.edu>
>>> Sent: Thursday, April 01, 2010 12:01 PM
>>> To: <public-lod@w3.org>
>>> Subject: KIT releases 14 billion triples to the Linked Open Data cloud
>>>> We are happy to announce that the Institute AIFB at the KIT is releasing
>>>> the biggest dataset until now to the Linked Open Data cloud. The Linked Open
>>>> Numbers project offers billions of facts about natural numbers, all readily
>>>> available as Linked Data.
>>>> Our accompanying peer-reviewed paper [1] gives further details on the
>>>> background and implementation. We have integrated with external data sources
>>>> (linking DBpedia to all their 335 number entities) and also directly link to
>>>> the best-known linked open data browsers from the page.
>>>> You can visit the Linked Open Numbers project at:
>>>> <http://km.aifb.kit.edu/projects/numbers/>
>>>> Or point your linked open data browser directly at:
>>>> <http://km.aifb.kit.edu/projects/numbers/n1>
>>>> We are happy to have increased the number of triples on the Web by more
>>>> than 14 billion, roughly 87.5% of the size of the linked data web before
>>>> this release (see paper for details). We hope that the data set will find
>>>> its serendipitous use.
>>>> The data set and the publication mechanism were checked pedantically, and
>>>> we expect no errors in the triples. If you do find some, please let us know.
>>>> We intend to be compatible with all major linked open data publication
>>>> standards.
>>>> About the AIFB
>>>> The Institute AIFB (Applied Informatics and Formal Description Methods) at
>>>> KIT is one of the world-leading institutions in Semantic Web technology.
>>>> Approximately 20 researchers of the knowledge management research group are
>>>> establishing theoretical results and scalable implementations for the field,
>>>> closely collaborating with the sister institute KSRI (Karlsruhe Service
>>>> Research Institute), the start-up company ontoprise GmbH, and the Knowledge
>>>> Management group at the FZI Research Center for Information Technologies.
>>>> Particular emphasis is given to areas such as logical foundations, Semantic
>>>> Web mining, ontology creation engineering and management, RDF data
>>>> management, semantic web search, and the implementation of interfaces and
>>>> tools. The institute is involved in many industry-university co-operations,
>>>> both on a European and a national level, including a number of intelligent
>>>> Web systems case studies.
>>>> Website: <http://www.aifb.kit.edu>
>>>> About KIT
>>>> The Karlsruhe Institute of Technology (KIT) is the merger of the former
>>>> Universität Karlsruhe (TH) and the former Forschungszentrum Karlsruhe. With
>>>> about 8000 employees and an annual budget of 700 million Euros, KIT is the
>>>> largest technical research institution within Germany. KIT is both a state
>>>> university with research and teaching and, at the same time, a large-scale
>>>> research institution of the Helmholtz Association. KIT has a strong
>>>> reputation as one of Germany’s universities of excellence, aiming to set the
>>>> highest standards for education, research and innovation.
>>>> Website: <http://www.kit.edu>
>>>> [1] Denny Vrandecic, Markus Krötzsch, Sebastian Rudolph, Uta Lösch:
>>>> Leveraging Non-Lexical Knowledge for the Linked Open Data Web, published in
>>>> Rodolphe Héliot and Antoine Zimmermann (eds.), The Fifth RAFT'2010), the
>>>> yearly bilingual publication on nonchalant research, available at
>>>> <http://km.aifb.kit.edu/projects/numbers/linked_open_numbers.pdf>

John S. Erickson, Ph.D.
Twitter: @olyerickson
Received on Thursday, 1 April 2010 13:01:42 UTC
