Re: KIT releases monumental dataset of more than 15 *trillion* triples

A direct download is one of the most requested additions, so we are
considering how to provide it. As the dataset is roughly 2.5 petabytes in
size, we are wondering whether it makes sense to develop a custom
compression algorithm.
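
For a rough sense of what any compressor would have to work with, here is a
back-of-the-envelope sketch in Python. It takes the rounded figures from this
thread at face value and assumes decimal petabytes (1 PB = 10**15 bytes); the
function name is only illustrative, and the result is an estimate, not a
measurement.

```python
# Back-of-the-envelope estimate; both inputs are rounded figures from this thread.
TRIPLES = 15 * 10**12          # "more than 15 trillion triples"
DATASET_BYTES = 2.5 * 10**15   # "roughly 2.5 petabytes", assuming 1 PB = 10**15 bytes


def bytes_per_triple(total_bytes, triples):
    """Average uncompressed size of one triple, in bytes."""
    return total_bytes / triples


avg = bytes_per_triple(DATASET_BYTES, TRIPLES)
print(f"about {avg:.0f} bytes per triple")
```

Under these assumptions the average comes out at roughly 167 bytes per
triple, which is the budget a compression scheme would be competing against.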


On Mon, Apr 2, 2018 at 3:37 AM Javier D. Fernández <jfergar83@gmail.com>
wrote:

> Congrats on the achievement!
>
> Do you have a direct link to download the full dataset? (sorry if I missed
> it)
>
> Cheers,
> Javier
>
> On Mon, Apr 2, 2018 at 10:03 AM, Jean-Marc Vanel <jeanmarc.vanel@gmail.com>
> wrote:
>
>> Thanks for putting my brainchild
>> http://semantic-forms.cc:9112/
>> in the top position among the
>> Linked Data browsers!
>>
>> If I may ask, it would be better to use the semantic_forms sandbox
>> instance instead:
>> http://semantic-forms.cc:9111/
>>
>> because the one on port 9112 is the social network instance, where you
>> could enter your FOAF profile and much more.
>>
>>
>>
>>
>> 2018-04-01 19:31 GMT+02:00 Denny Vrandečić <vrandecic@gmail.com>:
>>
>>> KIT is proud today to release an extension to an existing dataset, which
>>> will increase the size of the dataset by a factor of more than 1000
>>> <http://km.aifb.kit.edu/projects/numbers/web/n1000>. The widely cited Linked
>>> Open Numbers <http://km.aifb.kit.edu/projects/numbers/> dataset (more
>>> than 30 <http://km.aifb.kit.edu/projects/numbers/web/n30> citations)
>>> has been updated. Every single triple was regenerated, and even though the
>>> size has been dramatically expanded, we remain confident in the quality of
>>> every single triple.
>>>
>>> http://km.aifb.kit.edu/projects/numbers/
>>>
>>> It has been - to the day - eight
>>> <http://km.aifb.kit.edu/projects/numbers/web/n8> years since the
>>> original publication of the Linked Open Numbers dataset. Today, we are
>>> proud to announce an increase in the size, and thus the utility, of the
>>> dataset by three <http://km.aifb.kit.edu/projects/numbers/web/n3> orders of
>>> magnitude.
>>>
>>> The page has received a thorough remake, not only refreshing it
>>> visually and updating it to display better on mobile devices, but also
>>> introducing a number of new features:
>>>
>>> * the previous limit to the first billion
>>> <http://km.aifb.kit.edu/projects/numbers/web/n1000000000> natural
>>> numbers has been lifted, since the page has in the meantime moved to a
>>> 64 <http://km.aifb.kit.edu/projects/numbers/web/n64> bit architecture.
>>> We expanded the supported numbers to the first trillion natural numbers,
>>> therefore creating 999 billion
>>> <http://km.aifb.kit.edu/projects/numbers/web/n999000000000> new
>>> entities.
>>>
>>> * all links to Wikipedia and DBpedia have been refreshed. In the eight
>>> years since the original release, Wikipedia and DBpedia have, in an effort
>>> to catch up with Linked Open Numbers, created new entities for numerous
>>> numbers. We have updated the links to all of those.
>>>
>>> * links to Wikidata entities representing these numbers have also been
>>> created and added, extending the linkage between Linked Open Numbers and
>>> the LOD cloud by thousands and thousands of new entities.
>>>
>>> * the whole dataset is now published under the terms of the CC-0
>>> license, countering long years of discussion that resulted in fear,
>>> uncertainty, and doubt. Now the Linked Open Numbers dataset is standing on
>>> a solid grounding, joining other major datasets in choosing the perfect
>>> license for data.
>>>
>>> * we expanded the ontology and the dataset to also provide the digit sum
>>> of the numbers, allowing new applications on top of that.
>>>
>>> * we refreshed the links to Linked Data browsers. None of the original six
>>> <http://km.aifb.kit.edu/projects/numbers/web/n6> browsers is still
>>> available for browsing the Linked Open Numbers dataset.
>>> Therefore these links were all removed and replaced with two
>>> <http://km.aifb.kit.edu/projects/numbers/web/n2> current browsers.
>>>
>>> * we now also support the URI4
>>> <http://km.aifb.kit.edu/projects/numbers/web/n4>URI project,
>>> providing data about the Linked Open Numbers URIs in the URI4URI
>>> <http://uri4uri.net/> scheme.
>>>
>>> * the page has been updated to support Unicode's UTF8
>>> <http://km.aifb.kit.edu/projects/numbers/web/n8>, thus showing the
>>> number names in their new full glory.
>>>
>>> Eight <http://km.aifb.kit.edu/projects/numbers/web/n8> years - 2922
>>> <http://km.aifb.kit.edu/projects/numbers/web/n2922> days - after the
>>> original publication, Linked Open Numbers still gets tens of thousands
>>> <http://km.aifb.kit.edu/projects/numbers/web/n40000> of hits per month. We
>>> are happy to have updated the resource and extended its lifetime
>>> considerably.
>>>
>>> The community is invited and challenged to provide a SPARQL endpoint to
>>> the dataset. We think that the size of the dataset would provide for an
>>> interesting challenge.
>>>
>>> An open source release of the code base is being planned.
>>>
>>> The update was created collaboratively by Denny Vrandecic, Steffen
>>> Thoma, Andreas Thalhammer, Andreas Harth, and York Sure-Vetter.
>>>
>>>
>>
>>
>> --
>> Jean-Marc Vanel
>>
>> http://www.semantic-forms.cc:9111/display?displayuri=http://jmvanel.free.fr/jmv.rdf%23me#subject
>> <http://www.semantic-forms.cc:9111/display?displayuri=http://jmvanel.free.fr/jmv.rdf%23me>
>> Déductions SARL - Consulting, services, training,
>> Rule-based programming, Semantic Web
>> +33 (0)6 89 16 29 52
>> Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui
>>
>
>
>
> --
> Javier D. Fernández García
> jfergar83(at)gmail.com
>

Received on Monday, 2 April 2018 16:06:33 UTC