Re: KIT releases monumental dataset of more than 15 *trillion* triples

Also, we are considering providing a live data stream, counting upwards in millisecond steps, to which you can subscribe.

If demand is high we could also provide a second data stream counting downwards.
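
A rough sketch of what such a stream could look like, assuming a plain Python generator stands in for the subscription mechanism. The name `count_stream` and its parameters are hypothetical, not part of any announced service:

```python
import itertools
import time
from typing import Iterator


def count_stream(start: int = 0, step: int = 1,
                 interval_s: float = 0.001) -> Iterator[int]:
    """Yield start, start + step, ... pausing interval_s seconds between values."""
    for n in itertools.count(start, step):
        yield n
        # Sleep roughly one millisecond; real timing depends on the OS scheduler.
        time.sleep(interval_s)


# First five values of an upward stream and of a downward stream.
upward = list(itertools.islice(count_stream(0, 1), 5))
downward = list(itertools.islice(count_stream(10, -1), 5))
print(upward, downward)
```

A negative `step` gives the downward stream, so one generator would cover both cases.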

Best regards,
York

> On 02.04.2018 at 17:21, Denny Vrandečić <vrandecic@gmail.com> wrote:
> 
> It is one of the most requested additions, so we are considering how to provide it. As the dataset is roughly 2.5 petabytes in size, we are wondering whether it makes sense to develop a custom compression algorithm.
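
For a sense of scale, here is a back-of-the-envelope calculation from the figures quoted in this thread. It assumes exactly 15 trillion triples (the subject line says "more than", so this is a lower bound), one trillion numbers, and decimal petabytes (10^15 bytes); the variable names are illustrative only:

```python
TRIPLES = 15 * 10**12        # lower bound taken from the subject line
NUMBERS = 10**12             # first trillion natural numbers
DATASET_BYTES = 25 * 10**14  # 2.5 PB, assuming decimal petabytes

# Average number of triples describing each number.
triples_per_number = TRIPLES / NUMBERS
# Average uncompressed size of a single triple in bytes.
bytes_per_triple = DATASET_BYTES / TRIPLES

print(f"{triples_per_number:.0f} triples per number")
print(f"{bytes_per_triple:.1f} bytes per triple")
```

Under these assumptions each number carries about 15 triples at roughly 167 bytes per triple.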
> 
> 
>> On Mon, Apr 2, 2018 at 3:37 AM Javier D. Fernández <jfergar83@gmail.com> wrote:
>> Congrats on the achievement!
>> 
>> Do you have a direct link to download the full dataset? (sorry if I missed it)
>> 
>> Cheers,
>> Javier
>> 
>>> On Mon, Apr 2, 2018 at 10:03 AM, Jean-Marc Vanel <jeanmarc.vanel@gmail.com> wrote:
>>> Thanks for putting my brainchild
>>> http://semantic-forms.cc:9112/
>>> in the top position in
>>> Linked Data Browsers!
>>> 
>>> If I may ask, it would be better to use the semantic_forms sandbox instance instead:
>>> http://semantic-forms.cc:9111/
>>> 
>>> because the one on port 9112 is the social network instance, where you could enter your FOAF profile and much more.
>>> 
>>> 
>>> 
>>> 
>>> 2018-04-01 19:31 GMT+02:00 Denny Vrandečić <vrandecic@gmail.com>:
>>>> KIT is proud today to release an extension to an existing dataset, which will increase the size of the dataset by a factor of more than 1000. The widely cited Linked Open Numbers dataset (more than 30 citations) has been updated. Every single triple was regenerated, and even though the size has been dramatically expanded, we remain confident in the quality of every single triple.
>>>> 
>>>> http://km.aifb.kit.edu/projects/numbers/ 
>>>> 
>>>> It has been - to the day - eight years since the original publication of the Linked Open Numbers dataset. Today, we are proud to announce an increase in the size, and thus the utility, of the dataset by three orders of magnitude.
>>>> 
>>>> The page has received a thorough remake, not only refreshing it visually and updating it to display better on mobile devices, but also introducing a number of new features:
>>>> 
>>>> * the previous limit to the first billion natural numbers has been lifted, since the page has in the meantime moved to a 64-bit architecture. We expanded the supported numbers to the first trillion natural numbers, thereby creating 999 billion new entities.
>>>> 
>>>> * all links to Wikipedia and DBpedia have been refreshed. In the eight years since the original release, Wikipedia and DBpedia have, in an effort to catch up with Linked Open Numbers, created new entities for numerous numbers. We have updated the links to all of those.
>>>> 
>>>> * links to Wikidata entities representing these numbers have also been created and added, extending the linkage between Linked Open Numbers and the LOD cloud by thousands and thousands of new entities.
>>>> 
>>>> * the whole dataset is now published under the terms of the CC-0 license, countering long years of discussion that resulted in fear, uncertainty, and doubt. The Linked Open Numbers dataset now stands on solid ground, joining other major datasets in choosing the perfect license for data.
>>>> 
>>>> * we expanded the ontology and the dataset to also provide the digit sum of each number, enabling new applications built on it.
>>>> 
>>>> * we refreshed the links to Linked Data browsers. None of the original six browsers is still available for browsing the Linked Open Numbers dataset. Therefore these links were all removed and replaced with two current browsers.
>>>> 
>>>> * we also support the URI4URI project and provide data about the Linked Open Numbers URIs in the URI4URI scheme.
>>>> 
>>>> * the page has been updated to support Unicode's UTF-8, thus showing the number names in their new full glory.
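
The digit sum mentioned in the feature list can be illustrated with a short sketch. It assumes the usual base-10 definition for non-negative integers; the function name `digit_sum` is illustrative and not taken from the dataset's ontology or code base:

```python
def digit_sum(n: int) -> int:
    """Return the sum of the decimal digits of a non-negative integer."""
    if n < 0:
        raise ValueError("digit sum is defined here for non-negative integers only")
    total = 0
    while n:
        # Split off the last decimal digit and add it to the running total.
        n, digit = divmod(n, 10)
        total += digit
    return total


print(digit_sum(2018))             # 2 + 0 + 1 + 8
print(digit_sum(999_999_999_999))  # twelve nines
```

For the largest twelve-digit number the result is 12 * 9 = 108, so digit sums across the first trillion numbers stay small.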
>>>> 
>>>> Eight years - 2922 days - after the original publication, Linked Open Numbers still gets tens of thousands of hits per month. We are happy to have updated the resource and extended its lifetime considerably.
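
The day count can be checked with a few lines of Python. This assumes the original publication date was 1 April 2010, i.e. exactly eight years before the date of the announcement:

```python
from datetime import date

original_release = date(2010, 4, 1)  # assumed: eight years to the day earlier
update_release = date(2018, 4, 1)    # date of the announcement

# 8 * 365 days plus the leap days of 2012 and 2016.
days_between = (update_release - original_release).days
print(days_between)
```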
>>>> 
>>>> The community is invited and challenged to provide a SPARQL endpoint to the dataset. We think that the size of the dataset would make for an interesting challenge.
>>>> 
>>>> An open source release of the code base is being planned.
>>>> 
>>>> The update was created in collaboration by Denny Vrandecic, Steffen Thoma, Andreas Thalhammer, Andreas Harth, and York Sure-Vetter.
>>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Jean-Marc Vanel
>>> http://www.semantic-forms.cc:9111/display?displayuri=http://jmvanel.free.fr/jmv.rdf%23me#subject
>>> Déductions SARL - Consulting, services, training,
>>> Rule-based programming, Semantic Web
>>> +33 (0)6 89 16 29 52
>>> Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui
>> 
>> 
>> 
>> -- 
>> Javier D. Fernández García
>> jfergar83(at)gmail.com

Received on Monday, 2 April 2018 18:42:44 UTC