- From: Yves Raimond <yves.raimond@gmail.com>
- Date: Fri, 21 Nov 2008 17:01:11 +0000
- To: "Jim Hendler" <hendler@cs.rpi.edu>
- Cc: "Michael Hausenblas" <michael.hausenblas@deri.org>, public-lod@w3.org
Hello!

> I guess I asked the question wrong - the linked open data project currently
> identifies a specific set of data resources that are linked together - so
> this "entity" is definable - I didn't mean to ask how big the whole
> Semantic Web is - I meant how many triples are in this particular group -
> the set that are described on
> http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData

Here are some stats, updated from a paper we wrote with Tom, Michael and
Wolfgang [1]. They don't include all of the datasets added in the last
revision of the diagram though (LinkedMDB is missing, for example).

http://moustaki.org/resources/lod-stats.png

(Sorry for the png, I'll upload that in a handier format soonish.)

\mu is just the size of the dataset in triples.
\nu is |L| * 100 / \mu, where L is the set of triples linking to an
external dataset.

Overall, that's about 17 billion triples.

Cheers!
y

[1] http://sw-app.org/pub/isemantics08-sotsw.pdf

> I've been able to download pictures of this graph every few months or so,
> and you can see the number of datasets growing, but the last published
> number of triples for the thing (as stated on that page) is from over a year
> ago, and a whole bunch of stuff has been added and some of these have grown
> a lot - so we have a publicly shared, large-scale, RDF data resource that
> can be used for benchmarking, trying different interfaces and new
> technologies, etc
> So it would be really nice to get a number every now and then so we could
> plot growth, explain to people what is in it better, etc.
> I know, I know, I know all the technical reasons this is relatively
> meaningless, but I gotta tell you, when I hear someone say "20 billion
> triples," I can tell you it causes people to pay attention -- problem is
> I would like to use a number that has some validity before I start quoting
> it....
>
> On Nov 20, 2008, at 5:12 AM, Michael Hausenblas wrote:
>
>> My 2c in order to capture this for others as well:
>>
>> http://community.linkeddata.org/MediaWiki/index.php?HowBigIsTheDangedThing
>>
>> Cheers,
>> Michael
>>
>> ----------------------------------------------------------
>> Dr. Michael Hausenblas
>> DERI - Digital Enterprise Research Institute
>> National University of Ireland, Lower Dangan,
>> Galway, Ireland
>> ----------------------------------------------------------
>>
>> Jim Hendler wrote:
>>>
>>> So I've been to a number of talks lately where the size of the current
>>> (Sept 08 diagram) Linked Open Data cloud, in triples, has been stated - with
>>> numbers that vary quite widely. The esw wiki says 2B triples as of 2007,
>>> which isn't very useful given the growth we've seen in the past year -- I've
>>> also seen the various blog posts and mail threads saying why we shouldn't
>>> cite meaningless numbers and such - but frankly, I've recently been on a
>>> bunch of panels with DB guys, and I'd love to have a reasonable number to
>>> quote -- anyone have a good estimate of the size of the danged thing (number
>>> of triples in the whole as an RDF graph would be nice) -- would also be nice
>>> for general audiences where big numbers tend to impress and for research
>>> purposes (for example, we know how far we can compress the triples for an in
>>> memory approach we are playing with, but we want to figure out how much
>>> memory we need for the whole cloud - we want to know if we need to shell out
>>> for the 16G iphone)
>>> anyway, if anyone has a decent estimate, or even a smart educated guess,
>>> I'd love to hear it
>>> JH
>>> "If we knew what we were doing, it wouldn't be called research, would
>>> it?" - Albert Einstein
>>> Prof James Hendler http://www.cs.rpi.edu/~hendler
>>> Tetherless World Constellation Chair
>>> Computer Science Dept
>>> Rensselaer Polytechnic Institute, Troy NY 12180
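
A minimal sketch of the two per-dataset statistics defined in the reply
(\mu, the dataset size in triples, and \nu = |L| * 100 / \mu), in Python.
Everything named here is an assumption for illustration: the function
lod_stats, the representation of triples as (subject, predicate, object)
string tuples, and in particular the rule that a triple "links to an
external dataset" when its object is an HTTP(S) URI outside the dataset's
own namespace. It is not the procedure used to produce the figures in [1].

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


def lod_stats(triples: Iterable[Triple], namespace: str) -> Tuple[int, float]:
    """Return (mu, nu) for one dataset.

    mu: number of distinct triples in the dataset.
    nu: percentage of those triples that link to an external dataset.

    Assumption: a triple is "external" when its object is an http(s) URI
    that does not start with the dataset's own namespace. Literals (any
    object that is not an http(s) URI here) never count as links.
    """
    graph = set(triples)  # an RDF graph is a set, so drop duplicates
    mu = len(graph)
    if mu == 0:
        return 0, 0.0

    external = {
        (s, p, o)
        for (s, p, o) in graph
        if o.startswith(("http://", "https://")) and not o.startswith(namespace)
    }
    nu = len(external) * 100 / mu
    return mu, nu


# Hypothetical toy dataset: four triples, one of which points outside
# the example.org namespace.
NS = "http://example.org/"
sample = [
    (NS + "a", NS + "knows", NS + "b"),
    (NS + "a", NS + "name", "Alice"),
    (NS + "b", NS + "name", "Bob"),
    (NS + "a", "http://www.w3.org/2002/07/owl#sameAs", "http://other.example.net/a"),
]

mu, nu = lod_stats(sample, NS)
print(f"mu = {mu} triples, nu = {nu:.1f}% outgoing links")
```

Summing \mu over all datasets gives the overall triple count; a different
definition of "external" (for example, one based on predicates such as
owl:sameAs only) would change \nu but not \mu.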
Received on Friday, 21 November 2008 17:01:47 UTC